**IS4IS 2019 Berkeley**

**Semantic Information G Theory with Formulas for Falsification and
Confirmation**

Abstract: The semantic information G theory is a natural generalization of Shannon's information theory. Replacing *y*_{j} in the log(.) of Shannon's mutual information (MI) formula with *θ*_{j}, a fuzzy set or a predictive model, we obtain the predictive MI formula. Using truth functions to produce likelihood functions, we obtain the semantic MI formula.
We can also obtain this formula by improving Carnap and Bar-Hillel's semantic information formula *I*_{j} = log[1/*T*(*y*_{j})], where *T*(*y*_{j}) is the logical probability of hypothesis *y*_{j}. The improved formula is *I*_{ij} = log[*T*(*y*_{j}|*x*_{i})/*T*(*y*_{j})] = log[*P*(*x*_{i}|*θ*_{j})/*P*(*x*_{i})], where *x*_{i} is an instance, *T*(*y*_{j}|*x*_{i}) = *T*(*θ*_{j}|*x*_{i}) is the fuzzy truth value of the proposition *y*_{j}(*x*_{i}), and *T*(*y*_{j}) is the average of *T*(*y*_{j}|*x*).
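As an illustration (not from the original abstract; it assumes a hypothetical discrete one-dimensional instance space and an unnormalized Gaussian truth function with made-up center and width parameters), the improved formula can be sketched as:

```python
import math

def truth(x, c, d):
    """Unnormalized Gaussian truth function T(theta_j|x) with center c and width d."""
    return math.exp(-((x - c) ** 2) / (2 * d ** 2))

def semantic_info(x_i, c, d, xs, px):
    """I_ij = log[T(y_j|x_i) / T(y_j)], where the logical probability T(y_j)
    is the average of T(y_j|x) over the source distribution P(x)."""
    T_logical = sum(p * truth(x, c, d) for x, p in zip(xs, px))
    return math.log(truth(x_i, c, d) / T_logical)

# Illustrative instance space with uniform P(x)
xs = [0, 1, 2, 3, 4]
px = [0.2] * 5

# Small deviation from the hypothesis's center -> positive information
print(semantic_info(2, 2.0, 1.0, xs, px) > 0)   # True
# Large deviation -> negative information (a wrong hypothesis)
print(semantic_info(0, 4.0, 0.5, xs, px) < 0)   # True
```

Because the Gaussian has no normalizing coefficient, the truth value at the center is 1, and the logical probability is the P(x)-weighted average of the truth values rather than a density integral.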
Using a Gaussian function without the normalizing coefficient as the truth function, we find that log *T*(*y*_{j}|*x*_{i}) reflects deviation and testability. According to this formula, the larger the deviation, the less the information; the smaller the logical probability, the larger the absolute value of the information; a wrong hypothesis conveys negative information, and a tautology or a contradiction conveys zero information. Hence, this formula accords with Popper's thought on hypothesis testing and falsification. Averaging *I*_{ij} yields the Generalized Kullback-Leibler (GKL) formula and the semantic MI formula. We can use the GKL formula and sampling distributions to optimize likelihood functions and truth functions for machine learning and induction. A hypothesis *y*_{j} held with degree of belief *b* can be treated as the mixture of *y*_{j} and a tautology, with truth function *bT*(*y*_{j}|*x*) + 1 - *b*.
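A minimal sketch of this mixture (the function and variable names are illustrative, not from the source; `truth` stands for any truth function T(y_j|x) taking values in [0, 1]):

```python
def mixed_truth(x, b, truth):
    """Truth function of y_j held with degree of belief b:
    b*T(y_j|x) + (1 - b), i.e. a mixture of y_j and a tautology."""
    return b * truth(x) + (1 - b)

# With b = 1 the original truth function is recovered;
# with b = 0 the hypothesis degenerates into a tautology (always fully true).
t = lambda x: 0.25
print(mixed_truth(0, 1.0, t))  # 0.25
print(mixed_truth(0, 0.0, t))  # 1.0
```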
Using a sampling distribution to optimize *b*, we obtain the confirmation measure *b** = [*P*(*H*|*E*) - *P*(*H*|*E*′)]/max[*P*(*H*|*E*), *P*(*H*|*E*′)] = (*CL* - *CL*′)/max(*CL*, *CL*′), where *H* = *y*_{j}, *E* and *E*′ are positive and negative instances respectively, *CL* is the confidence level, and *CL*′ = 1 - *CL*. The measure *b** has the HS symmetry suggested by Eells and Fitelson. It ensures that decreasing the number of negative examples is more important than increasing the number of positive examples, and hence it is compatible with Popper's falsification thought.
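Under the stated definitions, the measure *b** can be sketched directly from the two conditional probabilities (an illustrative helper, assuming the probabilities are given; the function name is not from the source):

```python
def b_star(p_pos, p_neg):
    """b* = [P(H|E) - P(H|E')] / max[P(H|E), P(H|E')],
    where E and E' are positive and negative instances of H = y_j."""
    return (p_pos - p_neg) / max(p_pos, p_neg)

# HS symmetry: swapping the roles of E and E' flips the sign of b*.
print(b_star(0.9, 0.3))   # ≈ 0.667
print(b_star(0.3, 0.9))   # ≈ -0.667
```

Note that *b** is bounded in [-1, 1], and dividing by the larger of the two probabilities makes the measure most sensitive to the smaller one, matching the emphasis on reducing negative examples.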

References:
https://arxiv.org/abs/1809.01577
and
https://arxiv.org/abs/1609.07827