HUFFMAN CODES AND MAXIMIZING PROPERTIES OF FIBONACCI NUMBERS
A. B. Vinokur
UDC 519.1
A contrast Huffman code (of maximum length) and the corresponding contrast sequence of positive numbers are considered. A maximizing contrast sequence, the maximum cost of a contrast Huffman code, and their relationship with Fibonacci numbers are derived.
We consider the extremal properties of Huffman codes and establish their relationship with Fibonacci numbers. Unlike [1], which focused on minimizing properties of Fibonacci numbers in relation to Huffman codes (trees), this paper examines their maximizing properties.
MAIN CONCEPTS. STATEMENT OF THE PROBLEM

Let $P = (p_1, p_2, \ldots, p_n)$ be a sequence of positive numbers such that

$$p_1 = p_2, \tag{1}$$

$$p_i \le p_{i+1} \quad (i = 2, \ldots, n-1), \tag{2}$$

$$\sum_{i=1}^{n} p_i = 1. \tag{3}$$
A prefix binary code for the sequence $P$ is the set $X = \{x_1, x_2, \ldots, x_n\}$ where: 1) $x_i$ is the code word corresponding to the number $p_i$ $(i = 1, \ldots, n)$; 2) $x_i$ is a word in the binary alphabet $\{0, 1\}$; 3) no code word $x_i$ is the head of any other code word $x_j$ $(i \ne j)$. The number of symbols in the code word $x_i$ is called the length of the code word and is denoted by $l_i$. The sum $L = L(X) = \sum_{i=1}^{n} l_i$ is called the length of the code $X$, and $S = S(P, X) = \sum_{i=1}^{n} p_i l_i$ is the cost (or the average code word length) of the code $X$ for the sequence $P$. The number of elements of the sequence $P$ (the code $X$) is called the size of the sequence (the code). The set of all prefix codes of size $n$ will be denoted by $K_n$. The code $X_{\min} = X_{\min}(P)$ is called minimal for the sequence $P$ if $S(P, X_{\min}) = \min_{X \in K_n} S(P, X)$. The method of construction of a minimal prefix code for an arbitrary sequence was proposed by Huffman [2] (see also [3, p. 495]). The code $H = H(P)$ constructed by this method is called a Huffman code for the sequence $P$ (note that not every minimal code $X_{\min}$ is a Huffman code [3, p. 497]). Thus,

$$S(P, H) = \min_{X \in K_n} S(P, X).$$
The set of Huffman codes of size $n$ is denoted $M_n$ $(M_n \subset K_n)$. For different sequences of a fixed size $n$, Huffman codes in general have different lengths $L$.

Definition 1. The Huffman code $C = C(n)$ of size $n$ is called a contrast code if $L(C) = \max_{H \in M_n} L(H)$.
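The Huffman construction referred to above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure as printed in [2] or [3]: the function name `huffman_lengths` and the tie-breaking rule (an already merged subtree is preferred over a leaf of equal weight) are assumptions made here. Only the code word lengths $l_i$ are computed, which is all that $L$ and $S$ depend on; the weights may be unnormalized, since scaling does not change the tree.

```python
import heapq
from itertools import count


def huffman_lengths(weights):
    """Return the code word lengths l_i of one Huffman code for `weights`.

    Assumption of this sketch: among equal weights, a merged subtree is
    taken before a leaf. Other tie-breaking rules give other Huffman codes
    with the same cost but possibly a different total length L.
    """
    lengths = [0] * len(weights)
    tick = count()  # unique counter so the index lists are never compared
    # Heap entries: (weight, is_leaf, counter, indices of leaves in subtree).
    heap = [(w, 1, next(tick), [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, _, a = heapq.heappop(heap)
        w2, _, _, b = heapq.heappop(heap)
        for i in a + b:
            lengths[i] += 1  # every leaf below the new node gets one symbol longer
        heapq.heappush(heap, (w1 + w2, 0, next(tick), a + b))
    return lengths


skewed = huffman_lengths([1, 1, 1, 2, 3, 5])  # a maximally unbalanced tree
balanced = huffman_lengths([1, 1, 1, 1])      # a balanced tree
```

For the weights `[1, 1, 1, 2, 3, 5]` the total length `sum(skewed)` is 20, while four equal weights give total length 8, which illustrates that $L$ depends on the sequence.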
Translated from Kibernetika i Sistemnyi Analiz, No. 3, pp. 10-15, May-June, 1992. Original article submitted December 5, 1989. 1060-0396/92/2803-0329512.50
©1993 Plenum Publishing Corporation
The code $C$ is called a contrast code because the minimum cost of the code (for a corresponding sequence) is achieved for the maximum length of the code.

Definition 2. The sequence $Q = Q(n)$ of size $n$ for which a contrast Huffman code exists is called a contrast sequence.

Applying the construction procedure for the Huffman code [3, p. 495], we can easily obtain the parameters of the contrast code $C = \{c_1, \ldots, c_n\}$ and the conditions on the contrast sequence $Q = (q_1, \ldots, q_n)$:

$$c_1 = \underbrace{0 \cdots 0}_{n-1}, \quad l_1 = l(c_1) = n - 1; \qquad c_i = \underbrace{0 \cdots 0}_{n-i}\,1, \quad l_i = l(c_i) = n - i + 1 \quad (i = 2, \ldots, n); \tag{4}$$

$$L(C) = (n - 1) + \sum_{i=2}^{n} (n - i + 1) = \frac{(n-1)(n+2)}{2};$$

$$S(Q, C) = \sum_{i=1}^{n} q_i l_i = (n - 1) q_1 + \sum_{i=2}^{n} (n - i + 1) q_i;$$

$$\sum_{i=1}^{k-2} q_i \le q_k \quad (k = 3, \ldots, n). \tag{5}$$
The condition (5) is necessary and sufficient for the sequence to be a contrast sequence (of course, given conditions (1)-(3)). The set of all contrast sequences of size $n$ will be denoted $E_n$.

Definition 3. The contrast sequence $Q_{\max} = Q_{\max}(n)$ of size $n$ is called maximizing if $S(Q_{\max}, C) = \max_{Q \in E_n} S(Q, C)$. The cost $T_{\max}(n) = S(Q_{\max}, C)$ $(Q_{\max} \in E_n)$ is called the maximum cost of the contrast Huffman code of size $n$.

In this paper, we construct the maximizing contrast sequence $Q_{\max}$ and determine the maximum cost $T_{\max}(n)$ for a contrast Huffman code.
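As an illustration of the contrast code parameters and of condition (5), the following Python sketch builds the code words of $C(n)$, checks the prefix property and the total length $(n-1)(n+2)/2$, and tests condition (5) for a given sequence. The function names are hypothetical, and `satisfies_condition_5` checks only inequality (5), not conditions (1)-(3).

```python
def contrast_code(n):
    """Code words c_1, ..., c_n of the contrast code C(n): c_1 is n-1 zeros,
    c_i (i >= 2) is n-i zeros followed by a single 1."""
    return ["0" * (n - 1)] + ["0" * (n - i) + "1" for i in range(2, n + 1)]


def is_prefix_free(words):
    """True if no word is the head (prefix) of a different word."""
    return not any(a != b and b.startswith(a) for a in words for b in words)


def satisfies_condition_5(q):
    """Check sum_{i=1}^{k-2} q_i <= q_k for k = 3, ..., n (1-based indices)."""
    return all(sum(q[:k - 2]) <= q[k - 1] for k in range(3, len(q) + 1))


code = contrast_code(6)
total_length = sum(len(c) for c in code)
```

Here `total_length` equals `(6 - 1) * (6 + 2) // 2`, and a sequence such as `[1, 1, 1, 2, 3, 5]` passes the test of condition (5), whereas four equal weights do not.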
AUXILIARY RELATIONSHIPS AND RESULTS

Fibonacci numbers are defined as follows [4, p. 9]: $u_1 = 1$, $u_2 = 1$, $u_i = u_{i-1} + u_{i-2}$ $(i \ge 3)$. By definition, $u_0 = 0$. Fibonacci numbers satisfy relationships that are given in [4, pp. 11, 15] or can be easily derived:

$$\sum_{i=1}^{k} u_i = u_{k+2} - 1, \tag{6}$$

$$u_{i+j} \ge u_i + u_j \quad (i + j \ge 3), \tag{7}$$

$$\sum_{i=1}^{k} (k - i + 1) u_i = u_{k+4} - (k + 3). \tag{8}$$

Now let

$$f(n, i) = u_{n-i+1} u_{n+3} - u_{n-i+3} u_{n+1}. \tag{9}$$

LEMMA 1. For $i \le n$,

$$f(n, i) = \begin{cases} u_i, & \text{if } n \equiv i \pmod 2, \\ -u_i, & \text{if } n \not\equiv i \pmod 2. \end{cases}$$
Proof. By definition of Fibonacci numbers, we have

$$f(n, i) = u_{n-i+1}(u_{n+2} + u_{n+1}) - u_{n+1}(u_{n-i+2} + u_{n-i+1}) = u_{n-i+1} u_{n+2} - u_{n-i+2} u_{n+1} = u_{n-i+1}(u_{n+1} + u_n) - u_{n+1}(u_{n-i+1} + u_{n-i}) = u_{n-i+1} u_n - u_{n-i} u_{n+1}.$$

Thus

$$f(n, i) = u_{n-i+1} u_n - u_{n-i} u_{n+1}. \tag{10}$$

Now,

$$f(n, i) = u_n(u_{n-i} + u_{n-i-1}) - u_{n-i}(u_n + u_{n-1}) = u_{n-i-1} u_n - u_{n-i} u_{n-1} = u_{n-i-1}(u_{n-1} + u_{n-2}) - u_{n-1}(u_{n-i-1} + u_{n-i-2}) = u_{n-i-1} u_{n-2} - u_{n-i-2} u_{n-1} = f(n - 2, i).$$

Therefore $f(n, i)$ satisfies the recurrence

$$f(n, i) = f(n - 2, i). \tag{11}$$

Case 1. $n \equiv i \pmod 2$, i.e., $n$ and $i$ have the same parity. Applying (11) $(n - i)/2$ times to (10), we obtain

$$f(n, i) = f(i, i) = u_{i-i+1} u_i - u_{i-i} u_{i+1} = u_1 u_i - u_0 u_{i+1} = u_i.$$

Case 2. $n \not\equiv i \pmod 2$. Applying (11) $(n - i - 1)/2$ times to (10), we obtain

$$f(n, i) = f(i + 1, i) = u_{i+1-i+1} u_{i+1} - u_{i+1-i} u_{i+2} = u_2 u_{i+1} - u_1 u_{i+2} = u_{i+1} - u_{i+2} = -u_i.$$

Q.E.D.

For a contrast sequence denote

$$\Delta_k = q_k - \sum_{i=1}^{k-2} q_i \quad (k = 3, \ldots, n). \tag{12}$$

This and (5) give

$$\Delta_k \ge 0 \quad (k = 3, \ldots, n). \tag{13}$$
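Relations (6)-(8) and Lemma 1 are easy to confirm numerically for small indices. The Python sketch below does so by brute force; the helper names `fib`, `f`, `check_identities`, and `check_lemma_1` are hypothetical, and a finite check of this kind illustrates the statements without replacing the proof.

```python
def fib(m):
    """Fibonacci numbers with u_0 = 0, u_1 = u_2 = 1."""
    a, b = 0, 1
    for _ in range(m):
        a, b = b, a + b
    return a


def f(n, i):
    """f(n, i) = u_{n-i+1} u_{n+3} - u_{n-i+3} u_{n+1}, as in (9)."""
    return fib(n - i + 1) * fib(n + 3) - fib(n - i + 3) * fib(n + 1)


def check_identities(kmax=25):
    """Brute-force check of relations (6), (7), (8) for indices up to kmax."""
    for k in range(1, kmax + 1):
        # (6): sum of the first k Fibonacci numbers
        assert sum(fib(i) for i in range(1, k + 1)) == fib(k + 2) - 1
        # (8): weighted sum with weights k, k-1, ..., 1
        assert (sum((k - i + 1) * fib(i) for i in range(1, k + 1))
                == fib(k + 4) - (k + 3))
    for i in range(1, kmax):
        for j in range(1, kmax):
            if i + j >= 3:
                assert fib(i + j) >= fib(i) + fib(j)  # (7)
    return True


def check_lemma_1(nmax=25):
    """Check f(n, i) = u_i for equal parity and -u_i otherwise, 1 <= i <= n."""
    for n in range(1, nmax + 1):
        for i in range(1, n + 1):
            expected = fib(i) if (n - i) % 2 == 0 else -fib(i)
            assert f(n, i) == expected
    return True
```

Calling `check_identities()` and `check_lemma_1()` raises an `AssertionError` if any of the relations fails in the tested range and returns `True` otherwise.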
Let $Q = (q_1, \ldots, q_n)$ be an arbitrary contrast sequence of size $n$ (using (1), we denote $q_1 = q_2 = \lambda$).

THEOREM 1.

$$q_k = \lambda u_{k-1} + \Delta_k + \sum_{i=3}^{k-2} \Delta_i u_{k-i-1} \quad (k = 3, \ldots, n).$$

Proof. Let

$$z_k = \lambda u_{k-1} + \Delta_k + \sum_{i=3}^{k-2} \Delta_i u_{k-i-1}.$$

Using relationship (12) for $\Delta_i$, we have

$$z_k = \lambda u_{k-1} + q_k + \sum_{i=3}^{k-2} q_i u_{k-i-1} - \sum_{i=1}^{k-2} q_i - \sum_{i=3}^{k-2} u_{k-i-1} \sum_{j=1}^{i-2} q_j.$$

Consider separately the last (fifth) term in the expression for $z_k$. Changing the order of summation and using (6), we obtain

$$\sum_{i=3}^{k-2} u_{k-i-1} \sum_{j=1}^{i-2} q_j = \sum_{i=1}^{k-4} q_i u_{k-i-1} - \sum_{i=1}^{k-4} q_i.$$

Substituting this expression for the last term in $z_k$, we obtain after some manipulations

$$z_k = \lambda u_{k-1} + q_k + (q_{k-2} u_1 + q_{k-3} u_2 - q_2 u_{k-3} - q_1 u_{k-2}) - (q_{k-2} + q_{k-3}).$$

Seeing that $u_1 = u_2 = 1$, $q_1 = q_2 = \lambda$, we obtain

$$z_k = \lambda u_{k-1} + q_k - \lambda (u_{k-3} + u_{k-2}) = \lambda u_{k-1} + q_k - \lambda u_{k-1} = q_k.$$
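Q.E.D.

Let us now apply Theorem 1 to compute $\lambda$. We have

$$\sum_{k=1}^{n} q_k = 2\lambda + \lambda \sum_{k=3}^{n} u_{k-1} + \sum_{k=3}^{n} \Delta_k + \sum_{k=5}^{n} \sum_{i=3}^{k-2} \Delta_i u_{k-i-1}.$$

Applying (6) to the second term and changing the order of summation in the last (fourth) term, we obtain

$$\sum_{k=1}^{n} q_k = \lambda u_{n+1} + \sum_{i=3}^{n} \Delta_i u_{n-i+1}.$$

Using (3), we have

$$\lambda u_{n+1} + \sum_{i=3}^{n} \Delta_i u_{n-i+1} = 1,$$

whence

$$\lambda = \frac{1 - \sum_{i=3}^{n} \Delta_i u_{n-i+1}}{u_{n+1}}. \tag{14}$$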
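Theorem 1 and formula (14) can be checked on a concrete contrast sequence with exact rational arithmetic. In the Python sketch below the sequence $(1, 1, 2, 3, 4, 8)/19$ is a hypothetical example chosen here (it satisfies (1)-(3) and (5)); it does not appear in the paper, and the function names are likewise assumptions.

```python
from fractions import Fraction


def fib(m):
    """Fibonacci numbers with u_0 = 0, u_1 = u_2 = 1."""
    a, b = 0, 1
    for _ in range(m):
        a, b = b, a + b
    return a


def deltas(q):
    """Delta_k = q_k - sum_{i=1}^{k-2} q_i for k = 3..n, keyed by 1-based k."""
    return {k: q[k - 1] - sum(q[:k - 2]) for k in range(3, len(q) + 1)}


def theorem1_rhs(q, k):
    """Right-hand side of Theorem 1 for q_k, with lambda = q_1 = q_2."""
    lam, d = q[0], deltas(q)
    tail = sum(d[i] * fib(k - i - 1) for i in range(3, k - 1))  # i = 3..k-2
    return lam * fib(k - 1) + d[k] + tail


def lambda_from_14(q):
    """lambda computed from the Delta_i by formula (14)."""
    n, d = len(q), deltas(q)
    weighted = sum(d[i] * fib(n - i + 1) for i in range(3, n + 1))
    return (1 - weighted) / Fraction(fib(n + 1))


raw = [1, 1, 2, 3, 4, 8]  # hypothetical example, not from the paper
q = [Fraction(w, sum(raw)) for w in raw]
```

For this sequence, `theorem1_rhs(q, k)` reproduces `q[k - 1]` for every `k` from 3 to 6, and `lambda_from_14(q)` returns `Fraction(1, 19)`, which is $q_1$.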
Let us now determine the cost of the contrast code $C$ for an arbitrary contrast sequence $Q$. Using (4) and the result of Theorem 1, we obtain for $Q \in E_n$

$$S(Q, C) = \lambda (n - 1) + \lambda \sum_{k=1}^{n-1} (n - k) u_k + \sum_{k=3}^{n} (n - k + 1) \Delta_k + \sum_{i=3}^{n-2} \Delta_i \sum_{j=1}^{n-i-1} u_j (n - j - i).$$

Using relation (8) for the second and fourth terms, we obtain

$$S(Q, C) = \lambda (u_{n+3} - 3) + \sum_{i=3}^{n} \Delta_i (u_{n-i+3} - 1).$$
Substituting the expression for $\lambda$ from (14) in $S(Q, C)$ we obtain

$$S(Q, C) = \frac{u_{n+3} - 3}{u_{n+1}} + \sum_{i=3}^{n} \Delta_i \, \frac{u_{n+1}(u_{n-i+3} - 1) - u_{n-i+1}(u_{n+3} - 3)}{u_{n+1}}.$$

Thus, for $Q \in E_n$ we have

$$S(Q, C) = \frac{1}{u_{n+1}} \Bigl( u_{n+3} - 3 - \sum_{i=3}^{n} a_i \Delta_i \Bigr), \tag{15}$$

where $a_i = u_{n-i+1}(u_{n+3} - 3) - u_{n+1}(u_{n-i+3} - 1)$ $(i = 3, \ldots, n)$.

THEOREM 2. $a_i \ge 0$ $(i = 3, \ldots, n;\ n \ge 3)$.

Proof. Using (9), we have

$$a_i = u_{n-i+1} u_{n+3} - u_{n-i+3} u_{n+1} + u_{n+1} - 3 u_{n-i+1} = f(n, i) + u_{n+1} - 3 u_{n-i+1}.$$

Case 1. $n \equiv i \pmod 2$. By Lemma 1 we have $f(n, i) = u_i$ and thus

$$a_i = u_i + u_{n+1} - 3 u_{n-i+1} = 3 u_{n-2} + 2 u_{n-3} + u_i - 3 u_{n-i+1} = 3 (u_{n-2} - u_{n-i+1}) + 2 u_{n-3} + u_i \ge 2 u_{n-3} + u_i > 0.$$

Case 2. $n \not\equiv i \pmod 2$. In this case $i \le n - 1$, i.e., $n \ge i + 1$. Then by Lemma 1 we have $f(n, i) = -u_i$ and $a_i = -u_i + u_{n+1} - 3 u_{n-i+1}$.

Case 2.1. $i = 3$ (so that $n \ge 4$). Then

$$a_i = a_3 = 2 (u_{n-3} - 1) \ge 2 (u_{4-3} - 1) = 2 (u_1 - 1) = 0.$$

Case 2.2. $i > 3$ (so that $n \ge 5$). After the necessary manipulations, we obtain

$$a_i = 3 (u_{n-2} - u_{i-3} - u_{n-i+1}) + 2 (u_{n-3} - u_{i-4}) \ge 3 (u_{n-2} - u_{i-3} - u_{n-i+1}) + 2 (u_{n-3} - u_{(n-1)-4}) = 3 (u_{n-2} - u_{i-3} - u_{n-i+1}) + 2 u_{n-4} \ge 3 (u_{n-2} - u_{i-3} - u_{n-i+1}) + 2 u_1 > 3 (u_{n-2} - u_{i-3} - u_{n-i+1}) \ge 0$$

by (7). Q.E.D.
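The coefficients $a_i$ and the cost formula (15) can also be examined numerically. The Python sketch below computes $a_i$ from its definition, confirms non-negativity over a finite range of $n$, and compares (15) with the cost computed directly from the code word lengths (4). The test sequence $(1, 1, 2, 3, 4, 8)/19$ is a hypothetical example introduced here, and the function names are assumptions.

```python
from fractions import Fraction


def fib(m):
    """Fibonacci numbers with u_0 = 0, u_1 = u_2 = 1."""
    a, b = 0, 1
    for _ in range(m):
        a, b = b, a + b
    return a


def a_coeff(n, i):
    """a_i = u_{n-i+1}(u_{n+3} - 3) - u_{n+1}(u_{n-i+3} - 1)."""
    return fib(n - i + 1) * (fib(n + 3) - 3) - fib(n + 1) * (fib(n - i + 3) - 1)


def cost_direct(q):
    """S(Q, C) = sum q_i l_i with the contrast code lengths from (4)."""
    n = len(q)
    lengths = [n - 1] + [n - i + 1 for i in range(2, n + 1)]
    return sum(qi * li for qi, li in zip(q, lengths))


def cost_by_15(q):
    """S(Q, C) by formula (15); q must be normalized (condition (3))."""
    n = len(q)
    d = {k: q[k - 1] - sum(q[:k - 2]) for k in range(3, n + 1)}
    penalty = sum(a_coeff(n, i) * d[i] for i in range(3, n + 1))
    return (fib(n + 3) - 3 - penalty) / Fraction(fib(n + 1))


raw = [1, 1, 2, 3, 4, 8]  # hypothetical example, not from the paper
q = [Fraction(w, sum(raw)) for w in raw]
nonnegative = all(a_coeff(n, i) >= 0
                  for n in range(3, 31) for i in range(3, n + 1))
```

For $n = 6$ the coefficients $a_3, \ldots, a_6$ come out as 2, 10, 5, 18, and both cost functions return $43/19$ for the example sequence. The value `a_coeff(4, 3)` is 0, which is the equality case in Case 2.1.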
MAIN RESULTS

THEOREM 3. The maximizing contrast sequence of size $n$ is the sequence $Q_{\max} = Q_{\max}(n) = (q_1, \ldots, q_n)$, where $q_1 = 1/u_{n+1}$, $q_i = u_{i-1}/u_{n+1}$ $(i = 2, \ldots, n)$.

Proof. From (13), (15) and Theorem 2 it follows that the maximum value for $S(Q, C)$ $(Q \in E_n)$ is achieved for $\Delta_i = 0$ $(i = 3, \ldots, n)$. This and (14) give $\lambda = 1/u_{n+1}$, i.e., $q_1 = \lambda = 1/u_{n+1}$, $q_2 = \lambda = 1/u_{n+1} = u_1/u_{n+1}$. Now by Theorem 1, for $\Delta_i = 0$ $(i = 3, \ldots, n)$ and $\lambda = 1/u_{n+1}$, we have $q_i = \lambda u_{i-1} = u_{i-1}/u_{n+1}$ $(i = 3, \ldots, n)$.

THEOREM 4. The maximum cost of a contrast Huffman code of size $n$ is $T_{\max}(n) = 2 + (u_n - 3)/u_{n+1}$.

Proof. Since the maximum of $S(Q, C)$ is achieved for $\Delta_i = 0$ $(i = 3, \ldots, n)$, we obtain from (15)

$$T_{\max}(n) = S(Q_{\max}, C) = \frac{u_{n+3} - 3}{u_{n+1}} = \frac{u_{n+1} + u_{n+2} - 3}{u_{n+1}} = \frac{2 u_{n+1} + u_n - 3}{u_{n+1}} = 2 + \frac{u_n - 3}{u_{n+1}}.$$

COROLLARY. $\lim_{n \to \infty} T_{\max}(n) = (3 + \sqrt{5})/2 = 2.6180$.

Proof follows from the fact that $u_n/u_{n+1} \to 2/(1 + \sqrt{5}) = (\sqrt{5} - 1)/2$ as $n \to \infty$ [4, p. 86].

Example 1. Contrast code and contrast maximizing sequence of size 6:

$$C = C(6) = \{00000,\ 00001,\ 0001,\ 001,\ 01,\ 1\},$$

$$Q_{\max} = Q_{\max}(6) = \Bigl( \frac{1}{13}, \frac{1}{13}, \frac{1}{13}, \frac{2}{13}, \frac{3}{13}, \frac{5}{13} \Bigr),$$

$$T_{\max}(6) = 2 + \frac{8 - 3}{13} = 2\tfrac{5}{13}.$$
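Example 2. Contrast code and contrast maximizing sequence of size 8:

$$C = C(8) = \{0000000,\ 0000001,\ 000001,\ 00001,\ 0001,\ 001,\ 01,\ 1\},$$

$$Q_{\max} = Q_{\max}(8) = \Bigl( \frac{1}{34}, \frac{1}{34}, \frac{1}{34}, \frac{2}{34}, \frac{3}{34}, \frac{5}{34}, \frac{8}{34}, \frac{13}{34} \Bigr),$$

$$T_{\max}(8) = 2 + \frac{21 - 3}{34} = 2 + \frac{18}{34} = 2\tfrac{9}{17}.$$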
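Theorems 3 and 4, the Corollary, and the two examples can be reproduced with a short Python sketch in exact rational arithmetic. The function names `q_max`, `contrast_cost`, and `t_max` are hypothetical; the cost is computed directly from the contrast code lengths (4) and compared with the closed form of Theorem 4.

```python
import math
from fractions import Fraction


def fib(m):
    """Fibonacci numbers with u_0 = 0, u_1 = u_2 = 1."""
    a, b = 0, 1
    for _ in range(m):
        a, b = b, a + b
    return a


def q_max(n):
    """Maximizing contrast sequence of Theorem 3:
    q_1 = 1/u_{n+1}, q_i = u_{i-1}/u_{n+1} for i = 2..n."""
    d = fib(n + 1)
    return [Fraction(1, d)] + [Fraction(fib(i - 1), d) for i in range(2, n + 1)]


def contrast_cost(q):
    """S(Q, C) = sum q_i l_i with l_1 = n-1 and l_i = n-i+1 for i >= 2."""
    n = len(q)
    lengths = [n - 1] + [n - i + 1 for i in range(2, n + 1)]
    return sum(qi * li for qi, li in zip(q, lengths))


def t_max(n):
    """Closed form of Theorem 4: 2 + (u_n - 3)/u_{n+1}."""
    return 2 + Fraction(fib(n) - 3, fib(n + 1))


limit = (3 + math.sqrt(5)) / 2  # value stated in the Corollary
gap = abs(float(t_max(60)) - limit)
```

For $n = 6$ and $n = 8$ the direct cost and the closed form agree, giving $31/13 = 2\tfrac{5}{13}$ and $43/17 = 2\tfrac{9}{17}$ respectively, and `gap` is already negligible at $n = 60$.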
Let us compare the results of this paper with [1]. Condition (3) may be called the normalization condition, and sequences satisfying conditions (1)-(3) may be called normalized. In [1] we have considered sequences that satisfy conditions (1), (2) and the condition $p_1 = 1$ (instead of condition (3)). Such sequences may be called integer sequences. By Theorem 3 of this paper, a maximizing normalized contrast sequence is a normalized Fibonacci sequence right-"shifted" by one element, while the minimizing sequence in Theorem 3 of [1] is an integer Fibonacci sequence right-"shifted" by one element:

$$1,\ u_1,\ u_2,\ \ldots,\ u_{n-1}. \tag{16}$$

Thus, when the sequence (16) is normalized, i.e., each of its elements is divided by the sum of all elements (equal to $u_{n+1}$), its extremal properties are reversed. Specifically, if the sequence (16) is minimizing for a contrast Huffman code, then the sequence

$$\frac{1}{u_{n+1}},\ \frac{u_1}{u_{n+1}},\ \frac{u_2}{u_{n+1}},\ \ldots,\ \frac{u_{n-1}}{u_{n+1}}$$

is maximizing (each in its class of sequences).
REFERENCES

1. A. B. Vinokur, "Huffman trees and Fibonacci numbers," Kibernetika, No. 6, 9-12 (1986).
2. D. A. Huffman, "A method for the construction of minimum redundancy codes," Proc. IRE, 40, 1098-1101 (Sept. 1952).
3. D. Knuth, The Art of Computer Programming, Vol. 1, Basic Algorithms [Russian translation], Mir, Moscow (1976).
4. N. N. Vorob'ev, Fibonacci Numbers [in Russian], Nauka, Moscow (1984).