An English article published in August 2021: Using the Semantic Information G Measure to Explain and Extend Rate-Distortion Functions and Maximum Entropy Distributions. The title, abstract, and keywords are given below.
Entropy is a professional journal published by MDPI that brings together many authors and readers around the world who are interested in entropy and information. This is my second paper in Entropy; the first discussed confirmation and the Raven Paradox.
The paper was examined by five reviewers in succession (see the Acknowledgments at the end). It was quite controversial, but the supportive opinions prevailed in the end. This paper should make researchers in classical information theory take my semantic information theory seriously.
Using the Semantic Information G Measure to Explain and Extend Rate-Distortion Functions and Maximum Entropy Distributions (PDF; published in the Entropy Special Issue Information Measures)

Abstract: In the rate-distortion function and the Maximum Entropy (ME) method, Minimum Mutual Information (MMI) distributions and ME distributions are expressed by Bayes-like formulas, including Negative Exponential Functions (NEFs) and partition functions. Why do these non-probability functions exist in Bayes-like formulas? On the other hand, the rate-distortion function has three disadvantages: (1) the distortion function is subjectively defined; (2) the definition of the distortion function between instances and labels is often difficult; (3) it cannot be used for data compression according to the labels' semantic meanings. The author has previously proposed using the semantic information G measure with both statistical probability and logical probability. We can now explain NEFs as truth functions, partition functions as logical probabilities, Bayes-like formulas as semantic Bayes' formulas, MMI as Semantic Mutual Information (SMI), and ME as extreme ME minus SMI. To overcome the above disadvantages, this paper sets up the relationship between truth functions and distortion functions, obtains truth functions from samples by machine learning, and constructs constraint conditions with truth functions to extend rate-distortion functions. Two examples are used to help readers understand the MMI iteration and to support the theoretical results. Using truth functions and the semantic information G measure, we can combine machine learning and data compression, including semantic compression. Further studies are needed to explore general data compression and recovery according to semantic meaning.

Keywords: rate-distortion function; Boltzmann distribution; semantic information measure; machine learning; maximum entropy; minimum mutual information; Bayes' formula
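To make the Bayes-like formulas in the abstract concrete, here is a brief sketch in standard notation. The first equation is the classical Minimum Mutual Information distribution from rate-distortion theory; the second is the semantic Bayes' formula from my earlier work on the G measure. The notation (d for the distortion function, s for the slope parameter, T(θ_y|x) for the truth function) is illustrative and not copied from the paper.

\[ P(y \mid x) = \frac{P(y)\, e^{s\, d(x,y)}}{Z(x)}, \qquad Z(x) = \sum_{y} P(y)\, e^{s\, d(x,y)}, \qquad s \le 0 \]

\[ P(x \mid \theta_y) = \frac{P(x)\, T(\theta_y \mid x)}{T(\theta_y)}, \qquad T(\theta_y) = \sum_{x} P(x)\, T(\theta_y \mid x) \]

The two formulas have the same shape: the NEF e^{s d(x,y)} plays the role of a truth function, and the partition function Z plays the role of the logical probability T(θ_y). This is the identification the abstract describes.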
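The MMI iteration mentioned in the abstract is, in its classical form, the Blahut-Arimoto alternating update for the rate-distortion function R(D). Below is a minimal sketch of that classical iteration, not the extended algorithm from the paper; the function name, the toy distortion matrix, and the parameter values are my own illustrative choices.

import numpy as np

def mmi_iteration(p_x, d, s, n_iter=500, tol=1e-10):
    """Blahut-Arimoto-style MMI iteration for the rate-distortion function.

    p_x : (n,) source distribution P(x)
    d   : (n, m) distortion matrix d(x, y)
    s   : slope parameter, s <= 0; more negative s gives lower distortion
    Returns the rate R in bits, the mean distortion D, and P(y|x).
    """
    n, m = d.shape
    p_y = np.full(m, 1.0 / m)            # initial output distribution P(y)
    nef = np.exp(s * d)                  # NEFs e^{s d(x,y)}, all strictly positive
    for _ in range(n_iter):
        # Bayes-like formula: P(y|x) = P(y) e^{s d(x,y)} / Z(x)
        z = nef @ p_y                    # partition function Z(x)
        p_y_given_x = nef * p_y / z[:, None]
        # update P(y) as the marginal of P(x) P(y|x)
        p_y_new = p_x @ p_y_given_x
        if np.abs(p_y_new - p_y).max() < tol:
            p_y = p_y_new
            break
        p_y = p_y_new
    # mutual information I(X;Y) in bits (the rate R) and mean distortion D
    rate = float(np.sum(p_x[:, None] * p_y_given_x * np.log2(p_y_given_x / p_y)))
    distortion = float(np.sum(p_x[:, None] * p_y_given_x * d))
    return rate, distortion, p_y_given_x

# Toy example: three instances, two labels, a hand-made distortion matrix
p_x = np.array([0.5, 0.3, 0.2])
d = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [0.5, 0.5]])
R, D, channel = mmi_iteration(p_x, d, s=-2.0)
print(f"R = {R:.4f} bits at D = {D:.4f}")

Sweeping s from 0 toward more negative values traces out the R(D) curve; the paper replaces the distortion-based constraint with constraints built from truth functions learned from samples.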