Paper Reading AI Learner

Thermodynamic Limits of Physical Intelligence

2026-02-05 09:12:43
Koichi Takahashi, Yusuke Hayashi

Abstract

Modern AI systems achieve remarkable capabilities at the cost of substantial energy consumption. To connect intelligence to physical efficiency, we propose two complementary bits-per-joule metrics under explicit accounting conventions: (1) Thermodynamic Epiplexity per Joule -- bits of structural information about a theoretical environment-instance variable newly encoded in an agent's internal state per unit measured energy within a stated boundary -- and (2) Empowerment per Joule -- the embodied sensorimotor channel capacity (control information) per expected energetic cost over a fixed horizon. These provide two axes of physical intelligence: recognition (model-building) vs. control (action influence). Drawing on stochastic thermodynamics, we show how a Landauer-scale closed-cycle benchmark for epiplexity acquisition follows as a corollary of a standard thermodynamic-learning inequality under explicit subsystem assumptions, and we clarify how Landauer-scaled costs act as closed-cycle benchmarks under explicit reset/reuse and boundary-closure assumptions; conversely, we give a simple decoupling construction showing that without such assumptions -- and without charging for externally prepared low-entropy resources (e.g., memory) crossing the boundary -- information gain and in-boundary dissipation need not be tightly linked. For empirical settings where the latent structure variable is unavailable, we align the operational notion of epiplexity with compute-bounded MDL epiplexity and recommend reporting MDL-epiplexity / compression-gain surrogates as companions. Finally, we propose a unified efficiency framework that reports both metrics together with a minimal checklist of boundary/energy accounting, coarse-graining/noise, horizon/reset, and cost conventions to reduce ambiguity and support consistent bits-per-joule comparisons, and we sketch connections to energy-adjusted scaling analyses.
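
To give a sense of the bits-per-joule scale the abstract refers to, the following is a minimal Python sketch, not the authors' method. It computes the standard Landauer figure of k_B T ln 2 joules per bit and expresses a measured bits-per-joule value as a fraction of that benchmark. The function names, the 300 K temperature, and the example bit and energy figures are all hypothetical, chosen only for illustration. As the abstract stresses, such a comparison is meaningful only under stated boundary and reset/reuse assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def landauer_joules_per_bit(temperature_k: float) -> float:
    """Landauer energy scale k_B * T * ln(2), in joules per bit."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return K_B * temperature_k * math.log(2)


def bits_per_joule(bits: float, joules: float) -> float:
    """Plain ratio of information (bits) to measured energy (joules)."""
    if joules <= 0:
        raise ValueError("energy must be positive (joules)")
    return bits / joules


def fraction_of_landauer(bits: float, joules: float, temperature_k: float) -> float:
    """Measured bits-per-joule divided by the Landauer-scale bits-per-joule."""
    benchmark = 1.0 / landauer_joules_per_bit(temperature_k)
    return bits_per_joule(bits, joules) / benchmark


if __name__ == "__main__":
    # Hypothetical figures: 1e9 bits acquired for 1 J, bath at 300 K.
    t = 300.0
    print(f"Landauer scale at {t} K: {landauer_joules_per_bit(t):.3e} J/bit")
    print(f"Fraction of benchmark: {fraction_of_landauer(1e9, 1.0, t):.3e}")
```

The ratio is kept separate from the benchmark so that either bits-per-joule metric can be passed in, provided the energy figure was measured within the same stated boundary.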

URL

https://arxiv.org/abs/2602.05463

PDF

https://arxiv.org/pdf/2602.05463.pdf
