An English article published in August 2021: Using the Semantic Information G Measure to Explain and Extend Rate-Distortion Functions and Maximum Entropy Distributions. The title, abstract, and keywords are given below.
Entropy is a professional journal published by MDPI that brings together many authors and readers around the world who are interested in entropy and information. This is my second paper in Entropy; the first discussed confirmation and the Raven Paradox.
The paper was examined by five reviewers in succession (see the Acknowledgments at the end). It was quite controversial, but the supportive opinions prevailed in the end. This paper should make researchers in classical information theory take my semantic information theory seriously.
Using the Semantic Information G Measure to Explain and Extend Rate-Distortion Functions and Maximum Entropy Distributions (PDF; published in the Entropy Special Issue Information Measures)

Abstract: In the rate-distortion function and the Maximum Entropy (ME) method, Minimum Mutual Information (MMI) distributions and ME distributions are expressed by Bayes-like formulas, including Negative Exponential Functions (NEFs) and partition functions. Why do these non-probability functions exist in Bayes-like formulas? On the other hand, the rate-distortion function has three disadvantages: (1) the distortion function is subjectively defined; (2) the definition of the distortion function between instances and labels is often difficult; (3) it cannot be used for data compression according to the labels' semantic meanings. The author has previously proposed using the semantic information G measure with both statistical probability and logical probability. We can now explain NEFs as truth functions, partition functions as logical probabilities, Bayes-like formulas as semantic Bayes' formulas, MMI as Semantic Mutual Information (SMI), and ME as extreme ME minus SMI. To overcome the above disadvantages, this paper sets up the relationship between truth functions and distortion functions, obtains truth functions from samples by machine learning, and constructs constraint conditions with truth functions to extend rate-distortion functions. Two examples are used to help readers understand the MMI iteration and to support the theoretical results. Using truth functions and the semantic information G measure, we can combine machine learning and data compression, including semantic compression. Further studies are needed to explore general data compression and recovery according to semantic meaning.

Keywords: rate-distortion function; Boltzmann distribution; semantic information measure; machine learning; maximum entropy; minimum mutual information; Bayes' formula
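To make the Bayes-like formulas in the abstract concrete, here is a brief sketch in standard notation. The first equation is the classical Minimum Mutual Information distribution from rate-distortion theory; the second is the semantic Bayes' formula from my earlier work on the G measure. The notation (d for the distortion function, s for the slope parameter, T(θ_y|x) for the truth function) is illustrative and not copied from the paper.

\[ P(y \mid x) = \frac{P(y)\, e^{s\, d(x,y)}}{Z(x)}, \qquad Z(x) = \sum_{y} P(y)\, e^{s\, d(x,y)}, \qquad s \le 0 \]

\[ P(x \mid \theta_y) = \frac{P(x)\, T(\theta_y \mid x)}{T(\theta_y)}, \qquad T(\theta_y) = \sum_{x} P(x)\, T(\theta_y \mid x) \]

The two formulas have the same shape: the NEF e^{s d(x,y)} plays the role of a truth function, and the partition function Z plays the role of the logical probability T(θ_y). This is the identification the abstract describes.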
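The MMI iteration mentioned in the abstract is, in its classical form, the Blahut-Arimoto alternating update for the rate-distortion function R(D). Below is a minimal sketch of that classical iteration, not the extended algorithm from the paper; the function name, the toy distortion matrix, and the parameter values are my own illustrative choices.

import numpy as np

def mmi_iteration(p_x, d, s, n_iter=500, tol=1e-10):
    """Blahut-Arimoto-style MMI iteration for the rate-distortion function.

    p_x : (n,) source distribution P(x)
    d   : (n, m) distortion matrix d(x, y)
    s   : slope parameter, s <= 0; more negative s gives lower distortion
    Returns the rate R in bits, the mean distortion D, and P(y|x).
    """
    n, m = d.shape
    p_y = np.full(m, 1.0 / m)            # initial output distribution P(y)
    nef = np.exp(s * d)                  # NEFs e^{s d(x,y)}, all strictly positive
    for _ in range(n_iter):
        # Bayes-like formula: P(y|x) = P(y) e^{s d(x,y)} / Z(x)
        z = nef @ p_y                    # partition function Z(x)
        p_y_given_x = nef * p_y / z[:, None]
        # update P(y) as the marginal of P(x) P(y|x)
        p_y_new = p_x @ p_y_given_x
        if np.abs(p_y_new - p_y).max() < tol:
            p_y = p_y_new
            break
        p_y = p_y_new
    # mutual information I(X;Y) in bits (the rate R) and mean distortion D
    rate = float(np.sum(p_x[:, None] * p_y_given_x * np.log2(p_y_given_x / p_y)))
    distortion = float(np.sum(p_x[:, None] * p_y_given_x * d))
    return rate, distortion, p_y_given_x

# Toy example: three instances, two labels, a hand-made distortion matrix
p_x = np.array([0.5, 0.3, 0.2])
d = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [0.5, 0.5]])
R, D, channel = mmi_iteration(p_x, d, s=-2.0)
print(f"R = {R:.4f} bits at D = {D:.4f}")

Sweeping s from 0 toward more negative values traces out the R(D) curve; the paper replaces the distortion-based constraint with constraints built from truth functions learned from samples.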