Change of entropy at an infinitesimal change of probability distribution
V. Majerník∗ and E. Majerníková∗∗
∗ Institute of Mathematics, Slovak Academy of Sciences, Štefánikova 47, Bratislava, Slovak Republic
∗∗ Institute of Physics, Slovak Academy of Sciences, Dúbravská cesta 9, SK-84 228 Bratislava, Slovak Republic
Abstract. When the probability distribution of a random variable changes, the uncertainty of the probabilistic system changes as well. We determine the maximal change of the entropy under given constraints. This change of the Shannon entropy is linked with the corresponding change of the Clausius entropy of a thermodynamic system, which once more points to the linkage between the Clausius and Shannon entropies.
Keywords Probability distribution, Negentropy, Thermodynamic entropy.
1 Introduction
A remarkable event in the history of physics was the interpretation of phenomenological thermodynamics in terms of motion and randomness. In this interpretation, the temperature is related to motion while the randomness is linked with the Clausius entropy. The homeomorphous mapping of phenomenological thermodynamics onto the formalism of mathematical statistics gave rise to two entropy concepts: the Clausius thermodynamic entropy as a state variable of a thermodynamic system and the Boltzmann statistical entropy as the logarithm of the probability of a state of a physical ensemble. The fact that the thermodynamic entropy is a state variable means that it is completely defined when the state (pressure, volume, temperature, etc.) of a thermodynamic system is defined. Mathematically, this means that only the initial and final states of a thermodynamic system determine the change of its entropy.
The larger the value of the entropy of a particular state of a thermodynamic system, the less available is the energy of this system to do work. The statistical concept of entropy was introduced into physics when seeking a statistical quantity homeomorphous with the thermodynamic entropy. As is well known, the Clausius entropy $S_t$ of a thermodynamic system is linked with its probability $W$ by the celebrated Boltzmann law, $S_t = K_B \log W$, where $W$ is the so-called "thermodynamic" probability determined by the configurational properties of a statistical system and $K_B$ is Boltzmann's constant.¹ Boltzmann's law represents the solution to the functional equation between $S_t$ and $W$. Consider a set of isolated thermodynamic systems $\Sigma_1, \Sigma_2, \ldots, \Sigma_n$. According to Clausius, the total entropy of this system is an additive function of the entropies of its parts, i.e.,
$$S_t(\Sigma_1 + \Sigma_2 + \cdots + \Sigma_n) = S_t(\Sigma_1) + S_t(\Sigma_2) + \cdots + S_t(\Sigma_n). \tag{1.2}$$
On the other hand, the joint "thermodynamic" probability of this system is
$$W(\Sigma_1 + \Sigma_2 + \cdots + \Sigma_n) = \prod_{i=1}^{n} W(\Sigma_i). \tag{1.3}$$
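A logarithmic relation turns the multiplicative rule (1.3) into the additive rule (1.2), as the next paragraph states; a trivial numerical illustration of this is the following Python sketch (the values of $W$ are hypothetical and $K_B$ is set to 1 for convenience).

```python
import numpy as np

KB = 1.0                           # Boltzmann constant set to 1 for the illustration
W  = np.array([10.0, 20.0, 5.0])   # hypothetical "thermodynamic" probabilities W(Sigma_i)

S_of_product = KB * np.log(np.prod(W))   # entropy of the joint system, cf. Eq. (1.3)
sum_of_S     = np.sum(KB * np.log(W))    # sum of the subsystem entropies, cf. Eq. (1.2)

print(S_of_product, sum_of_S)            # identical: the logarithm turns the product into a sum
```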
To obtain the homomorphism between Eqs. (1.2) and (1.3), it is sufficient that $S_t = K_B \log W$, which is just the Boltzmann law (Kubo, 1974). An important quantity in the theory of probability is the random variable. A random variable $\tilde{x}$ is a mathematical quantity assuming a set of values with corresponding probabilities. All data necessary for the characterization of a random trial, and of the random variable assigned to it, are usually given by a so-called probabilistic scheme. If $\tilde{x}$ is a discrete random variable, its probability scheme takes the form
$$\begin{array}{c|cccc}
S & S_1 & S_2 & \cdots & S_n \\
P & P(x_1) & P(x_2) & \cdots & P(x_n) \\
X & x_1 & x_2 & \cdots & x_n
\end{array}$$
Here $S_1, S_2, \ldots, S_n$ are the outcomes of a random trial (in quantum physics, the quantum states), $P(x_1), P(x_2), \ldots, P(x_n)$ are their probabilities, and $x_1, x_2, \ldots, x_n$ are the values defined on $S_1, S_2, \ldots, S_n$ (in quantum physics, the eigenvalues). A probability distribution, $\mathbf{P} \equiv \{P_1, P_2, \ldots, P_n\}$, is the complete set of probabilities of all individual outcomes of a random trial.
¹ The probability as well as the Shannon entropy are dimensionless quantities. On the other hand, the thermodynamic entropy has the physical dimension $[\mathrm{J\,K^{-1}}]$. Therefore, the Shannon entropy must be multiplied by the Boltzmann constant, $K_B = 1.38 \times 10^{-23}\,\mathrm{J\,K^{-1}}$, in order to obtain the correct physical dimension of the thermodynamic entropy.
It is well known that there are several measures of uncertainty in the theory of probability, which can be divided into two classes: (i) The moment measures, which express the uncertainty of a random trial by means of the scatter of its values. The moment measures of uncertainty contain, as a rule, the values of a random trial as well as the elements of its probability distribution, and are often taken as its central statistical moments. (ii) The probabilistic or entropic measures of uncertainty, whose expressions contain only the components of the probability distribution of a random trial. They determine the sharpness and spreading out of the probability distribution, independently of the actual values of $\tilde{x}$. $H(\mathbf{P})$ is written as a sum of functions of the individual components of $\mathbf{P}$. The functions $f_p(P_m)$ must be chosen so that their sum satisfies the demands asked of an entropic measure of uncertainty: each must vanish if $P_m = 1$ or $0$, and it must be graphically represented by a concave curve. There are several functions of the probability components which satisfy these demands. The most important are the following: $f_p^{(1)}(P_m) = -P_m \log P_m$ and $f_p^{(2)}(P_m) = P_m(1 - P_m)$. If we take for the uncertainty measure the sum of the functions $f_p^{(1)}$, we have
$$H(\mathbf{P}) = S(\mathbf{P}) = \sum_{i=1}^{n} f_p^{(1)}(P_i) = -\sum_{i=1}^{n} P_i \log P_i. \tag{1}$$
This is the well-known entropic measure of uncertainty called the Shannon entropy.
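For illustration, both entropic measures introduced above can be evaluated numerically. The following Python sketch (the three-component distributions are hypothetical examples, not taken from the text) computes the sums of $f_p^{(1)}$ and $f_p^{(2)}$ and shows that both vanish for a deterministic distribution and are largest for the uniform one.

```python
import numpy as np

def shannon_entropy(p):
    """Entropic measure built from f_p^(1)(P) = -P log P (natural logarithm)."""
    p = np.asarray(p, dtype=float)
    nz = p > 0                      # the term 0 log 0 is taken as 0
    return -np.sum(p[nz] * np.log(p[nz]))

def quadratic_entropy(p):
    """Entropic measure built from f_p^(2)(P) = P (1 - P)."""
    p = np.asarray(p, dtype=float)
    return np.sum(p * (1.0 - p))

for p in ([1.0, 0.0, 0.0],          # deterministic distribution: both measures vanish
          [0.7, 0.2, 0.1],          # intermediate (hypothetical) distribution
          [1/3, 1/3, 1/3]):         # uniform distribution: both measures are maximal
    print(p, shannon_entropy(p), quadratic_entropy(p))
```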
2 Change of Shannon entropy
Let the probability distribution of a discrete random variable $\tilde{x}$ change infinitesimally from $\mathbf{P}$ to $\mathbf{P} + \delta\mathbf{P}$, as shown in the following probabilistic scheme:
$$\begin{array}{c|cccc}
S & S_1 & S_2 & \cdots & S_n \\
x & x_1 & x_2 & \cdots & x_n \\
\mathbf{P} & P(x_1) & P(x_2) & \cdots & P(x_n) \\
\mathbf{P} + \delta\mathbf{P} & P(x_1) + \delta P(x_1) & P(x_2) + \delta P(x_2) & \cdots & P(x_n) + \delta P(x_n)
\end{array}$$
Here, the equations
$$\sum_{i=1}^{n} P(x_i) = 1 \quad \text{and} \quad \sum_{i=1}^{n} \delta P(x_i) = 0$$
are to be satisfied.
The change of entropy $\Delta S$ is given by the relation
$$\Delta S = S(\mathbf{P} + \delta\mathbf{P}) - S(\mathbf{P}), \tag{5.1}$$
where $S(\mathbf{P})$ is the entropy of $\tilde{x}$. Substituting the components of $\mathbf{P}$ and $\mathbf{P} + \delta\mathbf{P}$ into the formula for Shannon's entropy we have
$$\Delta S = -\sum_{i=1}^{n} \left[\,(P(x_i) + \delta P(x_i)) \log(P(x_i) + \delta P(x_i)) - P(x_i) \log P(x_i)\,\right].$$
Assuming $\delta P(x_i) \ll P(x_i)$, the expression $\log[P(x_i)(1 + \delta P(x_i)/P(x_i))]$ can be expanded into a series,
$$\log[P(x_i)(1 + \delta P(x_i)/P(x_i))] = \log P(x_i) + \frac{\delta P(x_i)}{P(x_i)} - \frac{\delta P(x_i)^2}{2 P(x_i)^2} + \ldots \tag{5.2}$$
Neglecting in (5.2) terms of higher order in $\delta P(x_i)$, we obtain
$$\Delta S = -\sum_{i=1}^{n} \delta P(x_i)\left(1 + \frac{\delta P(x_i)}{2 P(x_i)} + \log P(x_i)\right).$$
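A minimal numerical check of this second-order expression, assuming natural logarithms and a hypothetical four-state distribution with a small perturbation summing to zero:

```python
import numpy as np

P  = np.array([0.4, 0.3, 0.2, 0.1])             # hypothetical distribution
dP = np.array([0.004, -0.001, -0.002, -0.001])  # small perturbation, sums to 0

def S(p):
    """Shannon entropy, Eq. (1), with natural logarithm."""
    return -np.sum(p * np.log(p))

exact  = S(P + dP) - S(P)                                   # Delta S from Eq. (5.1)
approx = -np.sum(dP * (1.0 + dP / (2.0 * P) + np.log(P)))   # second-order formula

print(exact, approx)   # agree up to terms of higher order in dP
```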
3 Maximization of entropy change
Now we seek the maximum entropy change when the change of the mean value of $\tilde{x}$ is given, i.e., subject to the boundary conditions
$$\sum_{i=1}^{n} \delta P(x_i) = 0 \quad \text{and} \quad \Delta x = \sum_{i=1}^{n} \delta P(x_i)\, x_i. \tag{5.3}$$
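This constrained maximization can also be checked numerically by direct optimization, before turning to the analytic solution via Lagrange multipliers given below. The following sketch maximizes the second-order expression for $\Delta S$ derived above under the two constraints of (5.3); it assumes SciPy's SLSQP solver and a hypothetical three-state distribution.

```python
import numpy as np
from scipy.optimize import minimize

P  = np.array([0.5, 0.3, 0.2])    # hypothetical distribution
x  = np.array([1.0, 2.0, 3.0])    # values of the random variable
dx = 0.01                         # prescribed change of the mean value

def neg_dS(dP):
    # second-order expression for Delta S, sign flipped so that minimizing maximizes Delta S
    return np.sum(dP * (1.0 + dP / (2.0 * P) + np.log(P)))

constraints = [
    {"type": "eq", "fun": lambda dP: np.sum(dP)},           # sum_i dP_i = 0
    {"type": "eq", "fun": lambda dP: np.dot(dP, x) - dx},   # sum_i dP_i x_i = dx
]

res = minimize(neg_dS, np.zeros_like(P), method="SLSQP", constraints=constraints)
print(res.x, -res.fun)            # maximizing perturbation and the maximal Delta S
```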
Using the method of Lagrange multipliers we find
$$\delta P(x_i)^{(m)} = P(x_i)\left(\mu_1 x_i + \mu_2 - \log P(x_i)\right),$$
where $\mu_1$ and $\mu_2$ are to be determined from (5.3). For a continuous probability distribution, given by the density function $p(x)$, we express the entropy change $\Delta S$ by means of a small additional function $f(x)$ attached to the density function $p(x)$. Substituting $p(x)$ and $p(x) + f(x)$ into the expression for the differential entropy we find
$$\Delta S = -\int (p(x) + f(x)) \log(p(x) + f(x))\,dx + \int p(x) \log p(x)\,dx.$$
Assuming $f(x) \ll p(x)$, the expression $\log(p(x) + f(x))$ can be expanded into a series,
$$\log(p(x) + f(x)) = \log[p(x)(1 + f(x)/p(x))] = \log p(x) + \frac{f(x)}{p(x)} - \frac{f(x)^2}{2 p^2(x)} + \ldots$$
Neglecting terms of higher order in $f(x)$, we have
$$\Delta S = -\int f(x)\left[\frac{f(x)}{2 p(x)} + \log p(x)\right] dx.$$
As in the discrete case, we can find the function $f(x)$ which extremizes $\Delta S$ subject to the boundary conditions
$$\Delta x = \int x f(x)\,dx \quad \text{and} \quad \int f(x)\,dx = 0. \tag{5.4}$$
Using again the method of Lagrange multipliers we obtain
$$f(x) = p(x)\left[\mu_1 x + \mu_2 - \log p(x)\right], \tag{5.5}$$
where the Lagrange multipliers are determined through the equations
$$\Delta x = \int x\, p(x)\left[\mu_1 x + \mu_2 - \log p(x)\right] dx$$
and
$$0 = \int p(x)\left[\mu_1 x + \mu_2 - \log p(x)\right] dx.$$
Solving these equations for $\mu_1$ and $\mu_2$ we find
$$\mu_1 = \frac{\Delta x + S X + R}{M - X^2}$$
and
$$\mu_2 = \frac{\Delta x\, X + S M + R X}{X^2 - M},$$
where
$$X = \int x\, p(x)\,dx, \quad M = \int x^2 p(x)\,dx, \quad S = -\int p(x) \log p(x)\,dx, \quad R = \int x\, p(x) \log p(x)\,dx.$$
Substituting $\mu_1$ and $\mu_2$ into (5.5) we obtain
$$f(x) = p(x)\left[\frac{\Delta x + S X + R}{M - X^2}\, x + \frac{\Delta x\, X + S M + R X}{X^2 - M} - \log p(x)\right].$$
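As a numerical check of these expressions, the following sketch assumes a hypothetical density $p(x) = 2x$ on $[0, 1]$, natural logarithms, and crude Riemann-sum integration; it computes $X$, $M$, $S$, $R$ and the multipliers, and then verifies that the resulting $f(x)$ satisfies the boundary conditions (5.4).

```python
import numpy as np

xs = np.linspace(1e-6, 1.0, 100001)   # grid on (0, 1]; x = 0 avoided where log p diverges
h  = xs[1] - xs[0]                    # grid step for simple Riemann sums
p  = 2.0 * xs                         # hypothetical density p(x) = 2x on [0, 1]
dx = 0.01                             # prescribed change of the mean value

integrate = lambda g: np.sum(g) * h   # crude numerical integration

X = integrate(xs * p)                 # mean value
M = integrate(xs**2 * p)              # second moment
S = -integrate(p * np.log(p))         # differential entropy
R = integrate(xs * p * np.log(p))

mu1 = (dx + S * X + R) / (M - X**2)
mu2 = (dx * X + S * M + R * X) / (X**2 - M)

f = p * (mu1 * xs + mu2 - np.log(p))  # the extremizing perturbation, Eq. (5.5)

print(integrate(f))        # should be close to 0
print(integrate(xs * f))   # should be close to dx
```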
4 Application to thermodynamics
Let us apply these results to thermodynamics. Consider a statistical thermodynamic system of an ideal gas. According to classical statistical physics, the probability density of the energy $E$ of the individual molecules is (Kubo, 1974)
$$P(E, T) = \frac{1}{K_B T} \exp\left[-\frac{E}{K_B T}\right],$$
where $T$ is the temperature and $K_B$ is Boltzmann's constant. When the temperature changes, the Shannon entropy changes, too. This change is given by the equation
$$\Delta S_{inf} = -\int_0^{\infty} \frac{1}{K_B (T + \Delta T)} \exp\left(-\frac{E}{K_B (T + \Delta T)}\right) \log\left[\frac{1}{K_B (T + \Delta T)} \exp\left(-\frac{E}{K_B (T + \Delta T)}\right)\right] dE$$
$$+ \int_0^{\infty} \frac{1}{K_B T} \exp\left(-\frac{E}{K_B T}\right) \log\left[\frac{1}{K_B T} \exp\left(-\frac{E}{K_B T}\right)\right] dE. \tag{5.6}$$
Neglecting in (5.6) terms of higher order in $\Delta T$, we have
$$\Delta S_{inf} = \frac{\Delta T}{T}.$$
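This first-order result is easy to verify: the differential entropy of the exponential density with mean $K_B T$ is $1 + \log(K_B T)$, so the exact change in (5.6) is $\log(1 + \Delta T/T) \approx \Delta T/T$. A small numerical sketch (with hypothetical values $T = 300$, $\Delta T = 1$ and $K_B$ set to 1, integrating (5.6) on a truncated energy grid) confirms this:

```python
import numpy as np

T, dT, kB = 300.0, 1.0, 1.0                # hypothetical values; kB set to 1

E = np.linspace(0.0, 50 * kB * (T + dT), 1_000_001)   # truncated energy grid
h = E[1] - E[0]

def density(temperature):
    """Boltzmann energy density P(E, T) of the ideal gas."""
    return np.exp(-E / (kB * temperature)) / (kB * temperature)

def diff_entropy(temperature):
    """-∫ p log p dE, evaluated as a Riemann sum on the truncated grid."""
    p = density(temperature)
    return -np.sum(p * np.log(p)) * h

dS_exact  = diff_entropy(T + dT) - diff_entropy(T)     # Eq. (5.6)
dS_approx = dT / T                                     # first-order result

print(dS_exact, dS_approx)                 # both are close to 1/300
```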
In the simplest case, when the thermodynamic system does not perform work, the change of temperature is proportional to the amount of supplied heat, $\Delta T = K \Delta Q$ ($K$ being a constant). Since the increase of the thermodynamic entropy is given as $\Delta S_{term} = \Delta Q/T$, we have
$$\Delta S_{term} = \Delta S_{inf}/K. \tag{5.7}$$
Eq. (5.7) illustrates once more the intimate relationship between the Shannon and thermodynamic entropies.
References
Jaynes, E. T. (1957). Phys. Rev. 106, 620; Phys. Rev. 108, 171.
Kubo, R. (1974). Statistical Mechanics. North-Holland Publishing Company, Amsterdam.