we are able to give an algebraic characterisation of the computable data types and data structures. Our technical motivation are the simple notions of strong and ...
A CHARACTERISATION OF COMPUTABLE DATA TYPES BY MEANS OF A FINITE EQUATIONAL SPECIFICATION METHOD
J.A. BERGSTRA
Department of Computer Science, University of Leiden, Wassenaarseweg 80, 2300 RA LEIDEN, The Netherlands J.V. TUCKER
Department of Computer Science, Mathematical Centre, 2e Boerhaavestraat 49, 1091 AL AMSTERDAM, The Netherlands
INTRODUCTION By redefining the construction of finite equational hidden function specifications of data types, as these are made with the initial algebra methodology of the ADJ Group, we are able to give an algebraic characterisation of the computable data types and data structures. Our technical motivation are the simple notions of strong and weakly normalised Church-Rosser replacement systems studied in the l-Calculus, and in plain mathematical terms the theorem we prove is this. THEOREM. Let A be a many-sorted
signature.
algebra finitely generated by elements named in its
Then the following statements are equivalent:
i. A is computable. 2. A possesses a finite, equational which is Church-Rosser 3. A possesses a finite, which is Church-Rosser
hidden enrichment replacement
system specification
and strongly normalising° equational hidden enrichment replacement
system specification
and weakly normalising.
The unexplained concepts are carefully defined in Section 2, on replacement system specifications, and in Section 3, on computable algebras. In Sections 4 and 5 we prove the theorem. Section i explains in detail the theoretical issues to do with data types which the theorem attempts to resolve. This paper continues our studies on the adequacy and power of definition of algebraic specification methods for data t~rpes which we began in [i~, see also [53.
(It
is an edited version of [2]; subsequent papers are [3,43.) Here the reader is assumed well versed in the initial algebra specification methods of the ADJ Group [63, see also KAMIN [73; knowledge of our [i~ is desirable but not, strictly speaking, essential. Prior exposure to the l-calculus is no~ required, of course, but hopefully the reader is acquainted with the Church-Rosser property from ROSEN [ii]. i. INITIAL ALGEBRA SEMANTICS AND DATA TYPES
A data structure
is defined to be a many-sorted algebra A finitely generated
77
from initial v a l u e s a l,...,a n ~ A n a m e d in its signature Z. A data type is d e f i n e d to be any class K of such d a t a structures of common signature. A t the h e a r t of the ADJ Group's theory of d a t a types is the idea that the semantics w h i c h e h a r a c t e r i s e a data type K should be i n v e s t e d in the c o n s t r u c t i o n of an initial a l g e b r a I K for K w i t h the result that every data structure A 6 K is u n i q u e l y definable, up to isomorphism,
as an
e p i m o r p h i c image of IK. In its turn, this initial a l g e b r a IK is u n i q u e l y definable, a g a i n up to isomorphism, as a factor a l g e b r a of the s y n t a c t i c a l g e b r a T(Z) of all terms over Z b e c a u s e T(Z)
is initial for the class ALG(Z) of all E-algebras. L e t IK ~ T ( Z ) / ~ K
w h e r e zK is a c o n g r u e n c e for w h i c h t £K t' m e a n s t h a t the terms t a n d t ' a r e
identical
syntactic e x p r e s s i o n s as far as the semantics of K is concerned. O b s e r v e that in these c i r c u m s t a n c e s we may p l a u s i b l y call a d a t a type
(semantics for) K computable w h e n
K is decidable on T(Z). And the p r o b l e m of s y n t a c t i c a l l y s p e c i f y i n g the data type K can
be i n v e s t i g a t e d through the p r o b l e m of s p e c i f y i n g the congruence =K" The p r e f e r r e d m e t h o d of p r e s c r i b i n g ~K is to use a finite set of equations or c o n d i t i o n a l e q u a t i o n s E over T(Z) T(Z)
to e s t a b l i s h a b a s i c set of i d e n t i f i c a t i o n s D E c
x T(Z) and to take z K as the s m a l l e s t c o n g r u e n c e =E on T(Z)
containing DE
With
reference to our [13, it is k n o w n that this m e t h o d w i l l not define all C o m p u t a b l e data types, b u t that it is able to define m a n y n o n - c o m p u t a b l e ones. E n r i c h i n g the m e t h o d to a l l o w the use of a finite n u m b e r of h i d d e n functions does enable it to s p e c i f y any c o m p u t a b l e data type, b u t can be shown to expand the n u m b e r of n o n - c o m p u t a b l e data types it defines. Our p r o p o s a l h e r e is to d e t e r m i n e the c o n g r u e n c e s for initial algebras b y m e a n s of replacement systems. A r e p l a c e m e n t system is i n t e n d e d to formalise a system of deductions,
g o v e r n e d b y simple algebraic s u b s t i t u t i o n rules, w i t h i n w h i c h a d e d u c t i o n
t ÷ t' says that the rSle of t can be p l a y e d b y t' as far as the semantics of K is concerned, m e a n i n g t + t' implies t =K - t' ' b u t n o t conversely. W h a t w e do is to m a k e an
analysis o f the c o n g r u e n c e =K t h r o u g h the s t r u c t u r a l b e h a v i o u r of a l g e b r a i c a l l y styled "proof systems" for it a n d f r o m this, a n d an a p p r o p r i a t e s p e c i f i c a t i o n machinery, w e are able to g u e s s the c l a s s i f i c a t i o n t h e o r e m for c o m p u t a b l e data types. T h a t this is
precisely the t h e o r e m stated in the I n t m o d ~ c t i o n comes from the r e f l e c t i o n that the semantics of a type K is supposed to be u n i q u e l y d e t e r m i n e d up to i s o m o r p h i s m w i t h an initial a l g e b r a IK, and n o t b y a p a r t i c u l a r s y n t a c t i c a l construction.
Since the comput-
ability of =K means the c o m p u t a b i l i t y of the algebra T(Z)/Z K u n d e r our d e f i n i t i o n and, in particular,
since this notion of an algebra's c o m p u t a b i l i t y is an i s o m o r p h i s m in-
variant, we can erase all m e n t i o n of syntax in the semantical c o n c e p t of a c o m p u t a b l e data type a n d i d e n t i f y these w i t h the c o m p u t a b l e algebras. So w i t h r e g a r d to the c o n t e n t of our theorem, the r e a d e r m a y care to c o n s i d e r the ease w i t h w h i c h s t a t e m e n t
(3) implies
(I) is p r o v e d as e v i d e n c e for the n a t u r a l signi-
ficance 6f strong and w e a k l y n o r m a l i s e d C h r u c h - R b s s e r r e p l a c e m e n t s y s t e m s p e c i f i c a t i o n s w h i l e the i m p l i c a t i o n
(1) implies
(2) m a y b e c o n s i d e r e d as the h a r d w o n answer to the
q u e s t i o n a b o u t adequacy: Do these specifications define all the data types one wants? B e c a u s e of the n o v e l t y of the s p e c i f i c a t i o n technique and the i n v o l v e d p r o o f of
78
the t h e o r e m we shall w o r k Q~t the m a t e r i a l which
it b e c o m e s
m u c h easier
of the t h e o r e m in the m a n y - s o r t e d In w h a t follows
2. R E P L A C E M E N T
The technical relation.
~ denotes
SYSTEMS
in the case of a single-sorted
for us to explain,
and the reader
algebra after
to understand, the p r o o f
case.
the set of n a t u r a l
numbers.
AND T H E I R S P E C I F I C A T I O N
p o i n t of d e p a r t u r e
is the idea of a t r a v e r s a l
L e t A b e a set a n d ~ an e q u i v a l e n c e
relation
for an e q u i v a l e n c e
on A. A traversal for ~ is a set
J c A wherein (i)
for e a c h a 6 A there is some t £ J such t h a t t H a; and
(ii)
if t,t'
Consider
6 J and t ~ t' then t ~ t'.
an initial
signature
algebra
specification
for a d a t a type K w h e r e
of K and E is some f o r m u l a or other for a x i o m a t i s ~ n g
the d e f i n i n g
congruence
H K is =E" The choice of a t r a v e r s a l
tional v i e w of the type as it is specified: imagines
(Z,E)
having
completing
to use E to c a l c u l a t e
these d e d u c t i o n s
theory about algebraic to a b s t r a c t
its p r o p e r t i e s
so that
J for H E fixes an opera-
g i v e n t,t'
E T(Z),
to decide
their p r e s c r i b e d
"normal
forms"
- t' one t =E
n,n' e J a n d on
t +E n, t' + E n' one checks n = n'. T h e f o l l o w i n g b i t of
replacement
the b a r e e s s e n t i a l s
systems
is m a d e up w i t h this in m i n d and is m e a n t
of such an o p e r a t i o n a l
L e t A b e a set a n d let R b e a reflexive, R as ÷ R so to d i s p l a y m e m b e r s h i p to b or t h a t b is a reduct
Z gives the
(a,b)
v i e w o f d a t a type specification.
transitive
binary relation
e R b y a + R b and say a reduces
(under R or +R)
on A. We write
(under R or ÷ R )
of a; and we shall call R and + R a reduction
system or a replacement system on the set A. F o l l o w i n g the t e r m i n o l o g y of t h e l-Calculus we m a k e these distinctions: An e l e m e n t
a c A is a normal form for + R if there is no b ~ A so t h a t a ~ b and
a + R b; the set of all normal The reduction
forms
for + R is d e n o t e d NF(R).
s y s t e m + R is Church-Rosser
if for a n y a ~ A if there are b l , b 2 6 A
so t h a t a + R bl a n d a + R b2 then there is c e A so t h a t b I + R c a n d b 2 + R csystem ÷ is weakly normalising if for e a c h a £ A there is some R f o r m b e A so t h a t a ÷ R b.
The reduction normal
The r e d u c t i o n
s y s t e m ÷ R is strongly normalising if there does n o t exist an infi-
nite chain
a0 + ~ a l ÷~ . . . . wherein
R an
+~ ...
for i e ~, a i # ai+ I.
A reduction system is Church-Rosser and weakly normalising if, and only if, every element reduces to a unique normal form. C l e a r l y s t r o n g n o r m a l i s a t i o n
ent&ilsweak
nor-
malisation. L e t HR d e n o t e exercise
the s m a l l e s t
to show that for a,a'
a HR a' ~=~ there
equivalence
is a s e q u e n c e
there exists
relation
on A c o n t a i n i n g
+R"
It is an e a s y
~ A a = bl,...,bk
a c o m m o n r e d u c t ci,
= a' such t h a t for each p a i r b i , b i + 1 i ~ i ~ k-l.
79
Schematically: a : bI
b2
cI
b3
c2
bk_ 1
b k = a'
Ck-i
U s i n g this c h a r a c t e r i s a t i o n of ~R it is s t r a i g h t f o r w a r d to p r o v e these facts. 2.1. LEPTA. a,a'
The r e p l a c e m e n t s y s t e m ÷ R on A is C h u r c h - R o s s e r if, and only if, for any
{ A i f a ~R a' then there is c { A so that a ÷ R c and a' "+R c.
2.2. LEPTA. Let ÷ the set of normal
b e a C h u r c h - R o s s e r weakly n o r m a l i s i n g r e p l a c e m e n t system on A. Then
R
forms NF(R)
is a traversal for =R"
Suppose n o w t h a t A is an algebra. T h e n b y an a l g e b r a i c r e p l a c e m e n t s y s t e m + R on the algebra A we m e a n a r e p l a c e m e n t system ÷ R on the domain of A w h i c h is closed u n d e r its o p e r a t i o n s in the sense that for each k - a r y o p e r a t i o n c of A,
al +R bl . . . . . ak+R bk
implies g(al . . . . . ak) +R g 0 and assume as i n d u c t i o n h y p o t h e s i s
that if r l , s l , u I are s t r o n g l y n o r m a l i s i n g and X(rl,s I) < n then _l(rl,sl,u I) is strongly normalising. equation
C o n s i d e r a r e d u c t i o n s e q u e n c e from t in w h i c h the f i r s t a p p l i c a t i o n of
(0) p r o d u c e s l(S(r'),s',_g(S(r'),s'))
from _l(r',s',S(T)) . B y our a s s u m p t i o n s
and the m a i n i n d u c t i o n h y p o t h e s i s all subterms of the n e w r e d u c t are s t r o n g l y n o r m a l ising. Moreover,
x(S(r'),s')
< x(r',s')
= X(r,s) = n and t h e r e f o r e b y the l a t e s t in-
d u c t i o n h y p o t h e s i s _i(S (r') ,s' ,g (S (r') ,s' ) ) is s t r o n g l y n o r m a l i s i n g a n d the r e d u c t i o n s e q u e n c e m u s t terminate. CASE 6. 1(x) = f(x) . This is, by now, obvious. H a v i n g c o n c l u d e d the p r o o f of L e m m a 4.3 we have also c o n t l u d e d the a r g u m e n t for L e m m a 4.2.
Q.E.D.
5. T H E M A N Y SORTED CASE We assume the r e a d e r t h o r o u g h l y a c q u a i n t e d w i t h the t e c h n i c a l foundations of the a l g e b r a of m a n y - s o r t e d structures for w h i c h no r e f e r e n c e can b e t t e r s u b s t i t u t e for the A D J ' s b a s i c p a p e r [6~. In n o t a t i o n c o n s i s t e n t w i t h our [IX, we assume A to be a m a n y - s o ~ t e d a l g e b r a w i t h domains A I , . . . , A n + m and o p e r a t i o n s of the form ~,~
= ~ l ' ' ' ' ' X k ;~ : A ~ I × . . . × A I k ,~,~ A
w h e r e I . ~ £ {l,...,n+m}, I ~ i ~ k. 1 The c o n c e p t s and m a c h i n e r y of Section 2 m u s t be reformulated, difficult:
b u t this is not
An algebraic replacement system R on A consists of a c o l l e c t i o n of set-
theoretic r e p l a c e m e n t systems R I , . . . , R n on its domains w h i c h s a t i s f y the p r o p e r t y that I,~ 0Z A, w i t h arguments a l l , . . . , a l k and b11,...,blk, w h e r e a ,b e A , If a ÷ b , ,a ÷ b then ~ l,~ (a , .. ,a ) ÷ ~i li li " 11 R1 11 "'" ~k Rk ~k 11 • Ik R~ I,~ (bll,...,blk). The c l a s s i f i c a t i o n of r e p l a c e m e n t systems a n d the d e f i n i t i o n s of for each o p e r a t i o n ~
the a s s o c i a t e d congruence, o n e - s t e p r e d u c t i o n s and so on as families of single sorted
88
relations p r o c e e d along the lines established single-sorted mdchanisms
to m a n y - s o r t e d
for specifying
To lift Section
algebras;
replacement
for generalising
algebraic
this is true of their properties
ideas from and of the
systems.
3 to computable m a n y - s o r t e d
algebras
is also quite straight-
forward and, in fact, has been virtually written out already in our [13. Those lemmas pertaining
to replacement
tion of sort indices
system specifications
require only the appropriate
introduc-
into their proofs.
Up to and including the proofs that full theorem in its m a n y - s o r t e d
(2) implies
(3), and
(3) implies
(i), for the
case, it m a y be truly said that no new ideas or tech-
niques are required. Consider
the proof that
ject of this section)
(i) implies
(2). With the help of a trick
(the real sub-
we are able to construct this proof with the toolkit of Section
4. Dispensing with an easy case where all the domains of A are finite, we assume A to be a m a n y - s o r t e d
computable
algebra with at least one domain infinite.
W i t h o u t loss of generality we can take these domains to be A I , . . . , A n, BI,...,B m where the A i are infinite and the B.l are finite of cardinality b.l + I. The generalised Lemma 3.1 provides
us with a recursive m a n y - s o r t e d
algebra of numbers R with domains
~l,...,~n and FI,...,F m where ~l = w for i ~ i ~ n, F i = {0,1,...,b i} for i -< i -< m, and R is isomorphic
to A. When not interested
in the cardinality
of a domain of R we
refer to it as R., i ~ i ~ n+m. The aim is to give R a finite equational I tion replacement system specification.
hidden func-
The first task is to build a recursive number algebra R 0 by adding to R new constants and functions. first infinite ing functions
The main idea is to code the many-sorted
sort ~I by means of functions
R i ÷ ~1 and ~i ÷ Ri affd recursive
on ~I associated to the m u l t i s o r t e d
we shall dissolve
algebra R into its
operations
track-
of R. At the same time
the finite sorts b y adding them as sets of constants.
Here is the
formal construction. For each infinite and as
the successor
new
sort i we add as a n e w constant of sort i the number 0 E ~i
function x+1. For each finite sort i we add
all
the elements of F. l
constants.
Each domain R. is coded into ~1 by adding the function foldl(x) = x, and is re1 covered b y adding the function unfoldl: ~ + R , defined for infinite sorts i b y 1 unfoldl(x) = x, and for finite sorts i b y
unfoldl(x)
= ix
if x ~ b i
Ib i
otherwise.
Next we add for each operation k
~i w h i c h commutes
f = fl,~ of R a recursive tracking function
the following diagram: f
X1
fold x...×fold
Xk
Rxlx'''XRxk
~ i~
1 ~ix...X~l
~ .....
~ el
unfoldp
f:
89
And,
just as in the s i n g l e - s o r t e d case, we factorise f into functions t , h , g and add
these a l o n g w i t h all the p r i m i t i v e recursive functions a r i s i n g from the p r i m i t i v e recursive d e f i n i t i o n s of h and g. T h a t is all. Observe R0] ~ = ~ = R, so it remains to give a finite e q u a t i o n a l r e p l a c e m e n t s y s t e m s p e c i f i c a t i o n for R 0 w h i c h is ChurchRosser and strongly normalising. Let Z0 be the signature of R 0 in w h i c h 12, iS, FOLD l, i UNFOLD name the zero, successor function, and c o d i n g maps a s s o c i a t e d to sort i; for c o n v e n i e n c e we d r o p the sort s u p e r s c r i p t in case i = i. Here are the requisite set of e q u a t i o n s E0, b e g i n n i n g w i t h the o p e r a t i o n s of R. Let f = fi,Z be an o p e r a t i o n of R n a m e d b y function symbol f { ~ c Z0 and let be its a s s o c i a t e d t r a c k i n g m a p on ~i n a m e d b y _ f e Z0" First,
f o l l o w i n g the p r o c e d u r e
of Section 4, w r i t e out all the equations a s s i g n e d to f and its factorisation. ly, add this e q u a t i o n to "eliminate"
f 1
f(Xll ..... Xlk)
~
Second-
UNFOLD~(~(FOLD
l(Xll) ....
,FOLD
1 k(Xlk))
w h e r e Xli is a v a r i a b l e of sort I i. D o this for e v e r y o p e r a t i o n of R. T u r n i n g to the coding machinery,
c o n s i d e r first the f o l d i n g functions.
For each
infinite sort i add the equations,
FOLDI(IO)
~ 0
FOLml(iS(Xi )) ~ S(FOLDI(Xi )) where X. is a v a r i a b l e of sort i. 1 For e a c h finite sort i, if i n E ~0
is a n e w c o n s t a n t of sort i d e n o t i n g num-
b e t c ~ r. then a d d 1 • FOLD i (iC)
~ SO(O).
S e c o n d l y c o n s i d e r the u n f o l d i n g functions. For e a c h i n f i n i t e sort i add the equations,
UNFOLD 1 (0) UNFOLD
i
~ i0
(S(X)) ~
iS(UNFOLDi(X) )
w h e r e X is a v a r i a b l e of sort i. For e a c h finite sort i, if
UNFOLD i (Sc (0))
i
=c is as b e f o r e then a d d the equations
> ie -
if c < b.
-> b. =l
if c ~ b. l
-
UNFOLD±(sC(x))
l
w h e r e b I• is the last e l e m e n t of F i and is n a m e d in ~ 0 - E b y b.=l; and X is a v a r i a b l e of sort I. A n d f i n a l l y we c o n s i d e r the equations for the constants. For each infinite sort i, if
ic_
~ E denotes the n u m b e r c { ~i then add ic >- isC(u). ~ For each finite sort i,
if ic 6 ~ denotes the n u m b e r e £ F. and ic { Z0 i remove the d u p l i c a t i o n b y adding ic > c. _ -
Z is its n e w c o n s t a n t symbol then we
T h i s completes the c o n s t r u c t i o n of E 0. W h a t remains of the p r o o f follows c l o s e l y the arguments of Section 4. Here the
90
sets of normal forms are, of course,
{isC (i0) : c ~ }
when i is an infinite sort, and
{ic:cEFi} when i is a finite sort. And the arguments which lift Lemma 4.1 and 4.2 are in all essential
respects the same. Given, then, that
(Z0,E0) is Church-Rosser
and
strongly normalising, the normal forms being a traversal for =E0, we can prove ~ T(E^, E ^) by using the mappings ~ i defined ~ i (c) = [ isC (0) i ] f or i an infinite R^• = U
U
,U
.
sort and @l(c) = lc for i a finite sort. REFERENCES [13
BERGSTRA, J.A. & J.V. TUCKER, Algebraic specifications of computable and semicomputable data structures, Mathematical Centre, Department of Computer Science Research Report IW 115, Amsterdam, 1979.
[23
~
, A characterisation of computable data types by means of a finite, equational specification method, Mathematical Centre, Department of Computer Science Research Report IW 124, Amsterdam, 1979.
[33
~
, Equational specifications for computable data types: six hidden functions suffice and other sufficiency bounds, Mathematical Centre, Department of Computer Science Research Report IW 128, Amsterdam, 1980.
[4]
- -
, On bounds for the specification of finite data types by means of equations and conditional equations, Mathematical C e n ~ e , Department of Computer Science Research Report IW 131, Amsterdam, 1980.
[5]
- -
, On the adequacy of finite equational methods for data type specification, ACM-SIGPLAN Notices 14 (11) (1979) 13-18.
[6]
GOGUEN, J.A., J.W. THATCHER & E.G. WAGNER, An initial algebra approach to the specification, correctness and implementation of abstract data types, in R.T. YEH (ed.) Current trends in programming methodology IV, Data structuring, Prentice-Hall, Engelwood Cliffs, New Jersey, 1978, 80-149.
[7]
KAMIN, S., Some definitions for algebraic data type specifications, Notices 14 (3) (1979) 28-37.
[8]
MACHTEY, M. & P. YOUNG, An introduction North-Holland, New York, 1978.
[9]
MAL'CEV,
A.I., Constructive
algebras,
SIGPLAN
to the general theory of algorithms,
I., Russian Mathematical
Surveys,
16 (1961)
77-129. [i03 RABIN, M.O., Computable algebra, general theory and the theory of computable fields, Transactions American Mathematical Society, 95 (1960) 341-360. [i13 ROSEN, B.K., Tree manipulating systems and Church-Rosser tion Computing Machinery, 20 (1973) 160-187.
theorems,
J. Associa-