plicable crashes. Many crashes can be traced to buffer overruns [14]. .... does not have a mechanism for dealing with possible changes to a buffer - only ..... Let P½ be an equilateral triangle, P¾ a hexagon, P¿ a dodecagon and so forth ..... does not bar overruns and any overrun has the potential of side-effecting a variable ...
Analyzing String Buers in C Axel Simon and Andy King Computing Laboratory, University of Kent, Canterbury, CT2 7NF, UK A buer overrun o
urs in a C program when input is read into a buer whose length ex eeds that of the buer. Overruns often lead to rashes and are a widespread form of se urity vulnerability. This paper des ribes an analysis for dete ting overruns before deployment whi h is
onservative in the sense that it lo ates every possible buer overrun. The paper details the subtle relationship between overrun analysis and pointer analysis and explains how buers an be modeled with a linear number of variables. As far as we know, the paper gives the rst formal a
ount of how this software and se urity problem an be ta kled with abstra t interpretation, setting it on a rm, mathemati al basis.
Abstra t.
1 Introdu tion Although C programs are ubiquitous, o
urring in many se urity- riti al, safety riti al and enterprise- riti al appli ations, they are parti ularly prone to inexpli able rashes. Many rashes an be tra ed to buer overruns [14℄. A buer overrun o
urs when input is read into a buer and the length of the input ex eeds the length of the buer. Buer overruns typi ally arise from unbounded string opies using sprintf(), str py() and str at(), as well as loops that manipulate strings without an expli it length he k in the loop invariant [15℄. Un he ked input whi h over ows a buer an overwrite portions of the sta k frame. Ha kers have exploited this ee t to redire t the instru tion pointer in the sta k frame to mali ious ode within the string and thereby gain partial or total ontrol of a host [17℄. Buer overruns are a parti ularly widespread lass of se urity vulnerability [5℄ and the National Se urity Agen y predi ts that overrun atta ks will ontinue to be a problem for at least the next de ade [20℄. Finding se urity faults, su h as string handling that may overrun, has been likened to the problem of nding a needle in a hay-sta k [11℄. This paper des ribes an abstra t interpretation s heme that does not attempt to nd the needle, but rather aims to prune the hay-sta k down so that it an be sear hed manually.
1.1 Design spa e for overrun analysis One (surprising) tradeo that is applied in program analysis is that of sa ri ing soundness for tra tability [24℄. In the ontext of dete ting overruns, one potential tradeo in the design-spa e is that of a hieving simpli ity by, say ignoring aliasing, at the ost of possibly missing real errors and possibly reporting spurious errors. Whether this is a
eptable depends on whether the analysis aims
to dete t most errors [12, 13, 23℄ (debugging) or dete t every error [8℄ (veri ation). One point in this design-spa e is represented by LCLint. LCLint is an annotation-assisted stati he king tool founded on the philosophy that it is ne to \a
ept a solution that is both unsound and in omplete" [12℄. This enables, for example, loops to be analyzed straightforwardly using heuristi s. Its annotation me hanism a hieves both ompositionality and s alability. However, as [12, 13℄ point out, it seems optimisti to believe that a programmer will add annotations to their programs to dete t overrun vulnerabilities. Another point in the design spa e is the onstraint based analysis of [23℄ whi h an dete t potential overruns in unannotated (lega y) C ode. This analysis ignores pointer arithmeti and is therefore unsound, but it is fast enough to reason about medium-sized appli ations without assistan e. For speed and simpli ity, the analysis is ow insensitive whi h means that it annot reason about (non-idempotent) library fun tions su h as str at() whi h are a frequent sour e of overruns [15℄. The work of Dor, Rodeh and Sagiv [8℄ is unique in that it attempts to formulate an analysis that is onservative in the sense that it never misses a potential overrun. This is a parti ularly laudable goal in the ontext of se urity where there is a history of elite ha kers levering in onspi uous and inno uous features into major se urity holes. Our work builds on the foundation of Dor et al [8℄ and is a systemati examination of the abstra tions and analyses ne essary to reason about overruns in a truly onservatively way.
1.2 Conservative overrun analysis The buer abstra tion of Dor et al [8℄ is essentially that of Wagner [23℄ but spe i ed as a program transformation. Ea h buer is abstra ted by its size, allo , and the position of the null hara ter, len . This two-variable model, however, is not ri h enough to a
urately tra k the null position. Suppose, for example, if two pointers p and q point to dierent positions within the same buer, then altering the len attribute of p (by writing a null hara ter to the buer) might alter the len attribute of q . Dor et al [8℄ therefore introdu e spe ial p overlaps q variables to quantify the pointer oset and thereby model string length intera tion. This ta ti potentially in reases the number of variables quadrati ally whi h is unfortunate sin e relational numeri abstra tions su h as polyhedra [4℄ be ome less tra table as the number of variables in rease. This is an eÆ ien y issue. More subtle is the orre tness issue that relates to pointers that de nitely or possibly point to the same buer. This is illustrated in the following ode: 1 2 3 4 5 6 7
har *p, *q, s[32℄, t[32℄; str py(s,"Boat "); /* str py(t,"Aero"); /* p = t+4; /* str at(p, "plane "); /* if (rand()) q=s; else q=t;/* str at(q,"to R eunion"); /*
s[5℄ */ s[5℄ t[4℄ */ p[4℄ s[5℄ t[4℄ */ p[10℄ s[5℄ t[10℄ */ p[10℄ q[5,10℄ s[5℄ t[10℄ */ p[10,20℄ q[15,20℄ s[5,15℄ t[10,20℄*/
The omments indi ate the possible null positions at the various lines. The str at in line 7 either updates the s buer or the t buer (but not both) depending on the random value. The transformation des ribed by Dor et al [8℄ does not have a me hanism for dealing with possible hanges to a buer { only de nite hanges are tra ked. In parti ular the transformation does not orre tly re e t that the null is either at position 5 or 15 in the s buer. Our remedy to this is to infer that q possibly shares with s and t so that both 5 and 15 are possible null positions for s and symmetri ally both 10 and 20 are potential null positions for t. Thus possible sharing information is used as an aid to orre tness. Conversely, de nite sharing information is used to in rease pre ision. It enables a write to a buer through one pointer to destru tively update the null positions for those pointers that de nitely share the same buer. For example, the str at at line 5 hanges the null position of the t buer from 4 to 10. The information that t and p de nitely share is needed to infer that the null position in t is no longer 4. Moreover, possible sharing is also required to dedu e that no other buers are ae ted. To summarize, our work makes the following ontributions:
{ It gives the rst formal a
ount of buer overrun analysis. A buer abstra -
tion map is proposed that spe i es how to model a buer. This abstra tion is then used to formalize the orre tness of the analysis. This means that a programmer an be on dent that every possible overrun is dete ted. { It shows that any string buer overrun analysis that is both a
urate and
orre t needs to reason about pointer aliasing. Inter-pro edural points-to analysis [1, 21℄ is required to as ertain whi h pointers possibly point to the same buer. The paper also shows how to improve pre ision by employing a de nite sharing analysis [9℄. { It explains how buers an be modeled with three variables per pointer and shows how the need for p overlaps q is nessed by sharing analysis, thereby avoiding quadrati blowup. The paper thus des ribes a pra ti al foundation for analyzing string buers.
Se tion 2 introdu es the language String C to formalize operations on string buers. Se tion 3 explains how polyhedra and sharing domains an be used to des ribe buer properties, and then Se tion 4 explains how these properties an be tra ked by analysis. Se tions 5 and 6 and present the related and future work and Se tion 7 on ludes. Proofs are omitted be ause of la k of spa e.
2
The String C language and its semanti s
The analysis is formulated in terms of a C-like mini language alled String C that expresses the essential operations on buers.
2.1 Abstra t syntax Let (n 2) N denote the set of non-negative integers and (l 2) Lab denote a set of labels whi h mark program points. Let (x; y 2) X and (f 2) F denote the
[skip℄ ` hskip; i! [num℄ ` hx = n; i!:[(x) 7! n℄ [str℄
` hx = "n1 : : : n "; 1 i! 2 :[(x) 7! ha; a; a + m + 1i℄ m
[var℄ ` hx = y; i!:[(x) 7! :(y)℄ [add℄ ` hx = y1 + y2 ; i!:[(x) 7! v℄ [sub℄ ` hx = y1 y2 ; i!:[(x) 7! v℄
if n 2 N if n1 ; : : : ; nm 2 N ^ [a; a + m + 1℄ \ dom(m1 ) =1 ; ^ 2 = 1 :[a + i 7! ni+1 ℄i=0 : [a + m 7! 0℄
if :(yi ) = vi ^ v1 v2 = v if :(yi ) = vi ^ v1 v2 = v if 1 :(yi ) = vi 2 if ab a < ae ^ v1 v2 = hab; a; ae i [arr1 ℄ ` hx = y1 [y2 ℄; 1 i! err otherwise ^ 2 = 1 :[(x) 7! 1 (a)℄ if 1 :(yi ) = vi ^ 1 :(x) 2 N if ab a < ae [arr2 ℄ ` hy1 [y2 ℄ = x; 1 i! 2 ^ v1 v2 = hab ; a; ae i err otherwise ^ 2 = 1 :[a 7! 1 :(x)℄ if :(y) = v ^ [b; b + v℄ \ dom() = ; [mallo ℄ ` hx = mallo (y); 1 i!1 :2 :3 ^ 2 = [b + i 7! rand()℄vi=0 ^ 3 = [:(x) 7! hb; b; b + vi℄ Con rete semanti s for simple statements. The fun tion rand() returns random values and models mallo 's behavior to allo ate uninitialized memory Fig. 1.
( nite) sets of variables and fun tion names o
urring in a program. F in ludes the symbol main to indi ate the program entry point and ea h fun tion f 2 F has an asso iated variable fr 2 X to return values from fun tions in Pas al-style.
C ::=f (x1 ; : : : ; xn )L; C j " L ::=[skip℄l j [x = n℄l j [x = "n1 : : : nm "℄l j [x = y℄l j [x = y1 + y2 ℄l j[x = y1 y2 ℄l j [x = y1 [y2 ℄℄l j [y1 [y2 ℄ = x℄l j [x = mallo (y)℄l j[if x then L1 else L2 ℄l j [while x L℄l j [L1 ; L2 ℄l j [return℄l j [x = f (y1 ; : : : yn )℄l The language L(C ) de nes the set of valid String C programs. Given a xed program, the map : F ! ([i2N X i ) L(L) retrieves the formal arguments hx1 ; : : : ; xn i and body L for a fun tion f . String C does not distinguish between
integers and unsigned hars be ause reasoning about buers of dierent sized obje ts in reases the number of ases in the abstra t semanti s and obs ures the underlying ideas. String C an be enri hed following the ideas detailed in [1, 18℄.
2.2 Instrumented operational semanti s The semanti s of String C instruments ea h pointer with details of its underlying buer. Spe i ally, the set of pointer values is de ned as Pnt = fhab ; a; ae i 2 N 3 j ab < ae g. The interpretation of a triple hab ; a; ae i is that the ab and ae re ord the rst lo ation within the buer and the rst lo ation past the end of the buer. The address a re ords the urrent position of the pointer within the
[if1 ℄ ` hif x then L1 else L2 ; i!hL1 ; i if :(x) 6= 0 [if2 ℄ ` hif x then L1 else L2 ; i!hL2 ; i if :(x) = 0 [while1 ℄ ` hwhile x L; i!hL; while x L; i if :(x) 6= 0 [while2 ℄ ` hwhile x L; i! if :(x) = 0 ` hL1 ; 1 i!hL2 ; 2 i [seq1 ℄ ` hL1 ; L3 ; 1 i!hL2 ; L3 ; 2 i ` hL1 ; 1 i! 2 [seq2 ℄ ` hL1 ; L2 ; 1 i!hL2 ; 2 i 2 ` hL1 ; 1 i! hL2 ; 2 i [env1 ℄ 1 ` henv 2 in L1 ; 1 i!henv 2 in L2 ; 1 i 2 ` hL; 1 i! 2 [env2 ℄ 1 ` henv 2 in L; 1 i!2 [ret℄ ` hreturn; L; i! if (f ) = hz1 ; : : : ; zn ; Li 1 ` hx = f (y1 ; : : : ; yn ); 1 i! ^ 2 = [zi 7! ai ℄ni=1 [ all℄ henv 1 :2 :[fr 7! (x)℄ in L; 2 :1 i ^ dom(1) \ rng(2 ) = n; ^ 2 = [ai 7! 1 :1 (yi )℄i=1 Fig. 2.
Con rete semanti s for ontrol statements
buer. For a valid buer a
ess ab a < ae must hold. The set of values is then de ned as Val = N [ Pnt whereas the set of environment and store maps are de ned Env = X ! N and Str = N ! Val respe tively. The following (partial) maps formalize address arithmeti .
De nition 1. The fun tions ; : Val 2 ! Val are de ned as follows:
i j = i+j i hbb ; b; be i = hbb ; b + i; be i hab ; a; aei j = hab ; a + j; ae i hab ; a; aei hbb ; b; be i = ?
i j = i j i hbb ; b; be i = ? hab ; a; ae i j = hab ; a j; ae i hab ; a; ae i hbb ; b; be i = a b
Figures 1 and 2 detail the operational semanti s for simple statements and the
ontrol statements. In Figure 2, dom(f ) and rng(f ) denote the domain and range of a mapping f and : denotes fun tion omposition de ned su h that g:f (x) = g(f (x)). The env statement introdu ed in Figure 2 models s oping.
3 Abstra t domains 3.1 Linear inequality (polyhedral) domain Let Y denote the variables fyP1 ; : : : ; yn g, let LinY denote the set of (possibly rearranged) linear P equalities ni=1 mi yiP= m and (possibly rearranged) nonstri t inequalities ni=1 mi yi m and ni=1 mi yi m where m; mi 2 Z. Let EqnY denote all nite subsets of LinY . Note that although ea h set E 2 EqnY P is nite, EqnY is not nite. De ne [ e℄ = fhx1 ; : : : ; xn i 2 R n j ni=1 mi xi } mg
where } 2 f; =; g. Then de ne [ E ℄ to be the polyhedron [ E ℄ = \f[ e℄ j e 2 Eg if E 2 EqnY . EqnY is ordered by entailment, that is, E1 j= E2 i [ E1 ℄ [ E2 ℄ . Equivalen e on EqnY is de ned E1 E2 i E1 j= E2 and E2 j= E1 . Let Poly Y = EqnY = . Poly Y inherits entailment j= from EqnY . In fa t hPoly Y ; j=; u; ti is a latti e (rather that a omplete latti e) with [E1 ℄ u [E2℄ = [E1 [ E2 ℄ and [E1 ℄ t [E2 ℄ = [E ℄ where [ E ℄ = l( onv ([[E1 ℄ [ [ E2 ℄ )) and
l(S ) and onv(S ) denote the losure and onvex hull of a set S 2 R n respe tively [19℄. Note that in general onv ([[E1 ℄ [ [ E2 ℄ ) is not losed and therefore annot be des ribed by a system of non-stri t linear inequalities as is illustrated below.
Example 1. Let Y = fx; yg, E1 = fx = 0; y = 1g and E2 = f0 x; x y = 0g so that [ E1 ℄ = fh0; 1ig and [ E2 ℄ = fhx; yi j 0 x ^ x = yg. These polyhedra are illustrated in the left-hand graph. Then onv ([[E1 ℄ [ [ E2 ℄ ) in ludes the point h0; 1i but not the ray fhx; yi j 0 x ^ x + 1 = yg and hen e is not losed. This
onvex spa e is depi ted in the right-hand graph.
y
3
E
6
[[ 2 ℄℄
2
3
6
2
[[E1 ℄℄1 r 0
y
0
1
2
3
-x
1 r 0
0
1
2
3
-x
It is useful to augment u and t with three operations: proje tion, minimization and maximization. The minimization and maximization operations are respe P tively de ned min(hm1 ; : : : ; mn i; [E ℄ ) = minf ni=1 mi xi j hx1 ; : : : ; xn i 2 [ E ℄ g and max(hm The ve tor hm1 ; : : : ; mn i is 1 ; : : : ; mn i; [E ℄ ) is de ned analogously. P Pn written as niP =1 mi yi for brevity. Note that min( i =1 mi yi ; [E ℄ ) 2 R [ f 1g whereas max( ni=1 mi yi ; [E ℄ ) 2 R [ f+1g. Proje tion is de ned 9xi ([E ℄ ) = [E 0 ℄ where [ E 0 ℄ = fhx1 ; : : : ; xi 1 ; x; xi+1 ; : : : ; xn i j x 2 R ^hx1 ; : : : ; xn i 2 [ E ℄ g. For brevity, let 9yi1 ;:::;yim ([E ℄ ) = 9yim (: : : 9yi1 ([E ℄ ) : : :). The variable set for [E ℄ is de ned var([E ℄ ) = \fvar(E 0 ) j E E 0 g where var(E 0 ) is the set of variables (synta ti ally) o
urring in E 0 . Let false = [f0 = 1g℄ so that [ false℄ = ;. Proje tion an be omputed using the Fourier algorithm [3℄, min and max using Simplex [3℄, u is straightforward to ompute whereas t an either be
al ulated using the point, ray andPline representation [4℄ or by using onstraint relaxation te hniques [7℄. Let s = ni=1 mi yi . Then entailment an be tested by: P j= fs < mg i max(s; P ) < m (even though fs < mg is stri t); P j= fs mg i max(s; P ) m; P j= fs = mg i P j= fs mg and P j= fs mg. Moreover, P j= fe1 ; : : : ; ek g i P j= fe1 g . . . P j= fek g. Hen eforth, whenever unambiguous, [E ℄ will be written as E for brevity. Finally note that Poly Y does not satisfy the as ending hain ondition, that is, hains P1 j= P2 j= : : : exist for whi h ti>0 Pi does not exist. Example 2. Let P1 be an equilateral triangle, P2 a hexagon, P3 a dode agon and so forth su h that the verti es of Pi are ontained within those of Pi+1 . Then P1 j= P2 j= P3 : : : is an as ending hain whi h onverges onto a ir le whi h
itself is not a polyhedron. Thus widening is required to enfor e termination in polyhedral x-point al ulations [4℄.
3.2 Polyhedral buer domain
Xn , Xo and Xs denote sets of variables su h that x 2 X i xn 2 Xn , xo 2 Xo and xs 2 Xs and suppose that X , Xn , Xo and Xs are pair-wise disjoint. Let PB X = Poly X [X [X [X . The latti e hPB X ; j=; u; ti is used to des ribe
Let
n
o
s
numeri buer properties salient to overrun analysis. In parti ular, suppose an environment maps the variable x 2 X to the address (x) = b and a store maps b to a pointer triple (b) = hab ; a; ae i. Then the variable xs 2 Xs des ribes the number of lo ations between ab and ae (the size of the underlying buer); xo 2 Xo represents the oset of a relative to ab (the position of x within the buer); and xn 2 Xn aptures the rst lo ation between ab and ae whose
ontents is zero (the position of the null when the buer is interpreted as a string). If P 2 PB X des ribes the pair h; i, then P aptures the numeri relationships between xs , xo and xn . This idea is formalized below.
De nition 2. The polyhedral on retization map XPB : PB X ! }(Env Str) PB (P ) = fh; i j s (h; i) u bf (h; i) j= Pg where is de ned X X X
Xs (h; i) = fx = :(x) jx 2 X ^ :(x) 2 N g 9 8 ab ; x 2 X ^ :(x) = hab ; a; ae i ^ = < xo = a Xbf (h; i) = : xs = ae ab ; b = min(faeg [ fn 2 [ab ; ae 1℄ j (n) = 0g) ; xn = b a b If zero does not o
ur within the buer then f n 2 N j ab n < ae ^ (n) = 0g = ; so the fae g element is used to ensure the map is well-de ned. It also re ords the de nite absen e of a zero (and thereby enables ertain de nite errors PB : }(Env Str) ! PB X annot be to be found). An abstra tion map X PB s synthesized from X sin e PB X is not omplete. In parti ular, tfX (h; i) u bf X (h; i)jh; i 2 Mg is not well de ned for arbitrary M 2 }(Env Str). To put it another way, M does not ne essarily have a best polyhedral abstra tion. Tra king the rst zero (rather than, say, every zero) keeps the number of variables in the model low whi h aids simpli ity and omputational eÆ ien y. However, there are unusual ombinations of string operations where just tra ing the rst null an lead to a loss of pre ision and hen e false warnings. Example 3. Consider the following (syntheti ) C program that zeros the buer lo ated by s, then and writes the buer hara ter by hara ter.
har *s = mallo (10), t[4℄; memset(s,0,10); s[0℄='O'; s[1℄='k'; str py(t, s);
The all to memset will set sn = 0, i.e. the rst null position is 0. After the rst write the analysis an only dedu e sn 1 sin e the rst null (if it exists) must o
ur to the right of the 'O'. Likewise after the se ond write the analysis infers sn 2. The write to the 4 hara ter buer t is safe if sn 3. However sn 2 does not imply sn 3 and therefore the all to str py generates a spurious warning. Inserting s[2℄='a'; s[3℄='y'; in front of the all to str py will yield a de nite error sin e sn 4 implies that sn 3 annot be satis ed. On the other hand, writing the same four hara ters to the same positions in reverse order only leads to a warning: sn = 0 holds valid until s[0℄ is written whi h then updates the null position to sn 1.
3.3 Possible and de nite buer sharing domains
Let PS X = }(X 2 ) and DS X = }(X 2 ). The omplete latti es hPS X ; ; \; [i and hDS X ; ; [; \i are used to express (pair-wise) possible buer sharing and de nite buer sharing respe tively. De nition 3. The abstra tion maps XPS : }(Env Str) ! PS X and XDS : }(Env Str) ! DS X and on retization maps XPS : PS X ! }(Env Str) DS : DS X ! }(Env Str) are de ned as follows: and X XPS (M ) = [fXsh(h; i) j h; i 2 Mg XDS (M ) = \fXsh(h; i) j h; i 2 Mg
PS (S ) = fh; i j sh(h; i) Sg DS (D) = fh; i j D sh (h; i)g X
where
X
X
X
h; i)= fhx; yi 2 X j ((x)) = hab ; ax ; ae i^((y)) = hab ; ay ; ae ig. The intuition is that ea h D 2 DS X aptures the ertain presen e of sharing in sh (h; i) for all h; i 2 DS (D ). Conversely, that if hx; yi 2 D then hx; yi 2 X X ea h S 2 PS X des ribes the ertain la k of sharing, that is, if hx; yi 62 S then hx; yi 62 Xsh(h; i) for all h; i 2 XPS (S ). This is the negative interpretation of S . The positive interpretation of S is that S des ribes the possible present PS (S ) su h that of sharing, that is, if hx; yi 2 S then there exists h; i 2 X sh hx; yi 2 X (h; i). Proposition 1. XP S and XP S are monotoni ; XP S ( XPS (S )) S for all S 2 PS X ; and M XP S (XP S (M )) for all M 2 }(Env Str). Sin e }(EnvStr) and PS X are omplete latti es, it follows that the quadruple h}(Env Str); XP S ; PS X ; XP S i is a Galois onne tion between }(Env Str) DS ; DS ; DS i is also a Galois onne tion. and PS X . Likewise h}(Env Str); X X X sh X(
2
The overrun analysis presented in this paper pre-supposes possible and definite buer sharing information. The inter-pro edural ow-insensitive points-to analysis of [21℄ an be adapted to derive the former whereas intra-pro edural analysis is likely to be suÆ ient for the latter. The sharing abstra tions S l and Dl are introdu ed to abstra t away from a parti ular sharing analysis. De nition 4. S l and Dl are de ned so that h; 2 i 2 XPS (S l ) \ XDS (Dl ) if ` hx = main(); 1 i ! hLl ; 2 i where = [x 7! 0℄, 1 = [0 7! n℄ and n 2 N .
3.4 Domain intera tion A polyhedral abstra tion an be used to re ne a possible sharing abstra tion whereas a de nite sharing abstra tion an improve a polyhedral abstra tion.
De nition 5. The operators %D : PB X ! PB X and %P : PS X de ned %D (P ) = P u fxn = yn ; xs = ys j hx; yi 2 Dg and
%P (S ) = S n hx; yi 2 S PP jj== xxsn > err else if P j= fyo + z ys g > xn ;xo ;xs (P ) fx = 0g else if P j= fyo + z = yn g > if P j= fyo + z < yn g > :99xx;xn ;xo;x;xs;x(P(P) ) fx 1g else otherwise 8 n o s err if P j= fyo + z < 0g > :P 00 fvi = yn j vi 2 V g otherwise [mallo 0 ℄ hx = mallo (y); P i!0 9x (P ) fxn 0; xo = 0; xs = yg Abstra t semanti s for simple statements where W = fwn j hw; yi 2 Dl g, l 00 = % l (update (9W nfy g (P 0 )). n j hv; y i 2 %P 0 (S )gn W and P D D x;y;z n
Fig. 3.
P 0 = % l (P ), V = fv
Proposition 3. Let : f1; : : : ; kg!f1; : : : ; kg be a permutation, ei = fxi }i si g, si = mi + jn=1 mi;j yi;j and xi 62 var(ej ) for all i 6= j . Then P he1 ; : : : ; ek i = P he(1) ; : : : ; e(k)i and P he1 ; : : : ; ek i = P he(1) ; : : : ; e(k) i. For brevity, de ne P fe1 ; : : : ; ek g = P he1 ; : : : ; ek i if xi 62 var(ej ) for all i 6= j . Proposition 3 ensures that P fe1 ; : : : ; ek g is well-de ned whereas Proposition 4 explains how an be used to approximate when a polyhedra an be updated i
with dierent ombinations of equations (and possibly not at all).
Proposition 4. P E 0 j= P E for all E 0 E . 4.2 Non-array statements To onstru t abstra t versions of the non-array statements, the s and bf maps are introdu ed to dete t whether an obje t is a s alar or a pointer. The following
proposition states an invariant of any orre t analysis: a polyhedra annot simultaneously re ord s alar and buer attributes for a given variable. The lemma then asserts orre tness for the abstra t semanti s of the non-array statements. De nition 7. The maps s : PB X ! }(X ) and bf : PB X ! }(X ) are de ned s (P ) = X \ var(P ) and bf(P ) = fx 2 X j fxn ; xo ; xs g \ var(P ) 6= ;g. Proposition 5. If s (P ) \ bf(P ) 6= ; then XPB (P ) = ;. Lemma 1. Suppose L = [skip℄l j : : : j [x = y z ℄l . Then if ` hL; 1 i ! 2 , h; 1 i 2 XPB (P ) and hL; P1i!0 P2 it follows that h; 2 i 2 XPB (P2 ).
4.3 Array statements To reason about array statements that an error, the store is extended to EStr = Str [ ferrg where err denotes an error state. Likewise, the polyhedral domain is extended to EPB X = PB X [ f?; err; warng. hEPB X ; ; f; gi is a latti e where the ordering is de ned by ? P and P warn for all P 2 EPB X whereas for all Pi 2 PB X , P1 P2 i P1 j= P2 . Moreover, f and g are given by:
P1 t P2 if P1 ; P2 2 PB X P1 u P2 if P1 ; P2 2 PB X > > > > err else if P1 = P2 = err < err else if P1 = P2 = err P1 f P2 = > P1 else if P2 = warn P1 g P2 = > P1 else if P2 = ? > > P2 else if P1 = ? P2 else if P1 = warn > > > > : : warn otherwise ? otherwise 8 > > > >
yn g > A > > > P else if P j= fx > 0; yo + z < yn g > A > > > P else if P j= fx = 0; yo + z < yn g > B > > > P else if P j= fyo + z < yn g > C > < PD else if P j= fx > 0g ^ wu = nl = wl PB else if P j= fx = 0g ^ wu = nl = wl > > > > P > E else if P j= fx > 0g > > > P > A else if P j= fx = 0g ^ nl wl nu < wu > > > P else if P j= fx = 0g > F > : PG otherwise
PA = P PB = P (yn = yo + z) PC = P (yn = yo + z) PD = P (yn yo + z + 1) PE = P (yn yn ) PF = PC u fyn minfnu; wu gg PG = P (yn yo + z) wl = dmin(yo + z; P )e wu = bmax(yo + z; P ) nl = dmin(yn ; P )e nu = bmax(yn ; P )
P
j= y
o
P
+z
o
0 P
j= y pn
wl
+z
>
B
otherwise
Fig. 4.
=n
l
pn
nl
=w
u
wl
w
l
n
w
u