Algebra-based identification of tree patterns in ... - Semantic Scholar

2 downloads 0 Views 225KB Size Report
or. Cont,the pattern will contain some null (. ⊥) values to account for the missing children descendents. As previously mentioned,patterns play a dual role in our ...
Algebra-based identification of tree patterns in XQuery     1

1,2

      2  !" # %$'&  1  (   $)*+ $-,% ,.   3 0/1 2   3- 4 1,4

INRIA Futurs, Gemo group, France [email protected] 2 LRI, Univ. Paris 11, France [email protected] 3 CSE Dept., UCSD, USA [email protected] 4 IIT Bombay, India [email protected]

Abstract. Query processing performance in XML databases can be greatly enhanced by the usage of materialized views whose content has been stored in the database. This requires a method for identifying query subexpressions matching the views, a process known as view-based query rewriting. This process is quite complex for relational databases, and all the more daunting on XML databases. Current XML materialized view proposals are based on tree patterns, since query navigation is conceptually close to such patterns. However, the existing algorithms for extracting tree patterns from XQuery do not detect patterns across nested query blocks. Thus, complex, useful tree pattern views may be missed by the rewriting algorithm. We present a novel tree pattern extraction algorithm from XQuery queries, able to identify larger patterns than previous methods. Our algorithm has been implemented in an XML database prototype [5]. We study its performance, and the overall benefits of our tree pattern identification approch.

1

Introduction

75 6 98:;4?@>?BA CD.EF $G& - ,.# 4H>   >H  +I,' "@$, 6 9$J,%K;4?@>?LM  8N"OQPIR9+S;LM ?TB& %U;  6 & >V,'& 6  IW 8:;4H+ I& %$$' >V $X, 6 7$.@>?W LFTB,.   # U   B2 .Y9$P 516 Z ! $G,. +-%U;& ?TH+I,'[K$-,'  B, 6 Z,%\@$', 6 Z%$'# ,9 L]$ T!I  %$ ^ & ?THTH # 4&# #  

Y 6 H$' XI J4!- 2%$ ,.   ?,' L`4Y 6  & 6 +J,'$ @LX, 6 I J4aTB,.& view @LX, 6 b+-%Uc&_ ?TH+I,'2 .Y9$P 576 #=->?+J,$Z LQ, 6 I J4d & 6 Vdefinitions TH,.& 6 Y9 , 6 B2 %Y , 6 HTH .e&   ?,;4f+ &%$$ >YW # #\S $' &  >? b+J,N LG, 6  ;4K& ?TH+I,%,' h$%LM#2 .Y9$bLM VI J4i I %$V$  >i\j I,!& ? ,% T! , ^  P >P  $V# # , 6 H,. d2 .Y &  ,%  k d, 6 V%$'# ,b L7;4 h 2?#  & ^  P >P  $Z, 6 W3J ?  @L2 %Y1$  v  I 2  #  ,b,. H, 6 I J4 Pj8N"Oqkl@_I J4d ?,% T! ,a I 2?# &  -Y7 # #Qv1 -$J,. v I2hY 6  d2 .Y9$b h- %$ql@_ -+- $ ,. f$

& ? ,%  > ,'+# %$Q @LS # T! ,'$F$'@,' $-L`4 >[$+j &  m&$-,'&.,.'#? #=,. $ 6  +$WA Dn Dotree EcP patterns  2 +S +#=*8NdO  'p >[ LM'@>TH ?,%,' Z$J,.',' >? %$* # $' WTH@,'-=#  ],' X+,,. $qA Dr D D D s D%t EuP 516  .Lv - ,' W+,,. $9K ?,'- $J,. >HT! I #LM q8N"OaTB,.   #   H2 %Y1$bA s -w n D D DCEcP RXY7 ?-V $G+#=& h$'& - #=,' $  +$H  &.,.# 4f $VT e2& TH V.e&   , 6 6 6 6 6 , 6 !%2 >,. >ZLM ?T ,'  b, 6 q,%\@$'7,' W2  L`4b ,@P 1T!- $Q$-,'&.,. #  +- ?+S $.# $ 6 %2IZ\S TB9$' NLve 1$' e?2P >PFA C D @EcP P 576    $1TB 4 \I,9 V @,9  H,.We 6  assume 2Z$J,.&.,.persistent # + +S;,.IDs  %$P are available in the store Rb$' & ? $-$'T!+I,.   $b, 6 ,HaTB,'-=#  h2 .Y TB 4a$-,. - ^ A Dr DC ^ vP ?P , 6 B,''p,B  $B  &.,.# 4 H  # T!  , i _ node D%tE ^   b, 6 BIDs 2?# 0 @LZ ,-,. \iiI,._  A Dr@E values   ^  I

, 6 ,Q $ , 6 QLM# #$\I,. q  ,'N,X[8N"OH # T! , ^ 7b+S  ?_ ,'q,. N, 6 @,7$'iii \I,'_   A nEcP content 516  $9$-$'T!+I,. K+ 2 %$1LM qzpS \# 92 .Y >?'#= ,;4P _ 5 ,% NI2??,.@>N L],.-Z+,,'-IUc$ 6 +SdTB,. =#   K, 6 N2 .Y patterns p , . . . , p q1 qn +,-,. $ P 576 VmJ$-,Z$-,'+ ^ I J4Uu,. @U;+,-,. d,.$'#=,. ?  $[&-&=#vP   ,. ,' 2# 4 _ ,' N-%Y9 ,'7, 6 T , 6 $ , 6 W\ > >?Xpv1 , 6 ,1.I.. ,Jp4Bvm +@,-,. $ , 6 W\ > > X, 6 72 .Y ^ $ , 6 ,7&B\SW$' !

_ , 6 #  $-$1& ?TH+I,.@,' $9-TH  !,. \SZ++#   0 H,' +K LF, 6 92 .Y9$P 576 N& ?,.- \I,. ?0 @L*, 6  $W++SW $9!+- 2?\# 40&  &.,# >?  , 6 T   ?,. L 4 >B,' [+,,'-$  "-  $pS+- $-$' " "V#=->?[8W:Z J4$\$'.,@P 516 b I2   ,%@>? @L*, 6  $TH., 6  29pS $-,. > %$aA n

 D EZ $V, 6 ,!, 6 +,,'-$HY7  ,. L`4 -?, 6    +.2 $BY7 ?$  K+J,' & #=WTB 4H$'+ 2 1%$-,. B8: J4\# &.$ Y 6  & 6 Y1$1 @,1, 6 Z&$[ K+.2 $ ++- & 6 %$PIyk1>?- ?K G# >?  , 6 T B B# >?\ $' &  ^ @$Y71YW # #$ 6 Y , 6 1,.$'#=,. ?  $I ,'b& ?T!+# 'pdZ,. 8:;4d& ?TH+# pS ,J4 "$-,'' > 6 ,-LM ?;Y1 -0,''$'#=,' _ aTH., 6 $ZTB 4 # I $'W, 6 W$'\I,'# $'TB?,' &W #=,. $ 6  +$\j.,;Y7 Kb+,,'-KbI J4P 576 7++S* $X J>I  H@$ LM ?# # Y9$P  &.,. VCTH @,' 2?,.%$], 6 q NLM ?*+,,. -& >? ,. V  8W:Z J4NI   $P  &.,. ?Ns1$'.,'$F, 6 QLM 'TH#\&.>?- ?NLM F, 6 Q,.$'#=,. ?V# >?  , 6 T +%$' ?,'    &.,. ?P?D LM@,' >W, 6   $J,. [LM ?;UuY 6  .U;-%,'N\# I&.$P@b=T!+j ;,%?, 8: J4ZLv,.- ,' h  +   TH   $V, 6 ,HY 6  a-%,' & #=$'"& $-,'&.,'$B%Y # T! ,'$  LN 'p+%$$' ? LM K V, 6 W # T! ,q& ?$J,.-&%,' ?G.2?# ,.%$G,' ^ , 6 WTH+I,;4V%$'# , B # T! ,GT$-,q$J,. # #j\S & $-,'&.,.  # \j  ,*YW , 6  W&  ,. ?,Q& - $+j ∅ >,. 1, 6 ,Q+;,. & #=_ QpS+- $-$' xP ?  $J,%&    ] >PD  LLM ? $' ?T!\  >[ L, 6 2  -=\#   , 6 q'p+%$$   4  # $TH+I,;4 - $# ,  # TH ?,QYW # #$-,' # #S\j1& $-,'&.,. $x  Y9 , 6 $y N& ? ,. ?,7& ?-%$'+S$x  c>,.  ^ \I, res1 +S  6 +$1YW , 6 $ T!W&  ,. ?,1+ I&  0\ 4H, 6  $J,. BLv ;UuY 6  .U;.,.-SPID7 + &.,'$q # .2I V+j $-$' \# 1& - $+j  >bI J4b,' G+,-,. $PX& 6 +,-,. N $  I @,'[,F, 6  $-4T\j #  @,' >9, 6 X &T! ?,]  @,@P@)*,,'-Z >? $*TH 4\S#=\j #   Lv Z+ ?,Uc& 6  # a-#=,'> $ 6  +$ ?   HLM [&  $J,. JUc $&  ,-#=,' $ 6  +$P*)*,,'-d  $ /TH 4"\j!#=\S#  aYW , 6  IH?T!  $Z WYW , 6 ^ ?4h T! PSy 6 f!,' [, 6  ID contain the ID ^ %$'+S&.,. 2I # 4 +,-,. B  ?$' TH #=-# 4  LF[+,,'-B IW $7#  \S # 

, 6 9+,,. ! $ ^ - $+j &%,' 2# 4 , 6 92?#Cont $' H,.  @LF8NdO0 I%$9& V ?-al %$'_ +S  >!,' N, 6  contain the contents _

for $x in doc("in.xml")//a/*, $y in doc("in.xml")//b return { $x//c, for $z in $y//d return { $y//e, for $t in $z//f where $t[g/text() = 5] return { $t//h } }

V8

V1

a

V2

b ID

V3

V4

c Cont

d ID

V10

* ID

e Cont

f ID

V11

g Val=5

V7

h Cont

V9 a

}

* ID d ID

V6

V5

e Cont

f ID

* ID

g Val=5 h Cont

b ID c Cont d ID

e Cont

f ID g Val=5

h Cont

Fig. 1. Sample XQuery query and corresponding tree patterns/views.

+,-,.  I P  LQV+@,-,. 0 Ib $ ,.,. Y9 , 6  +- &,. Lv 1$ ?THZ& $-,.?, , 6  B # 4H8NdO"  $qY 6 $'92?# W$'@,' $-m $, 6 ,7+V al  &,.= ^cH, 6 9$-,.-&%,''#S& $-,''  ,'$ c H,  I YW # #x\S# >,. N, +@,-,. xP 6 6 ykq$-,' # #S_  V,' NpS+#= V, 6 WT! >N @LF@$ 6  W, 6 q++j  ^ + - , &%$-,'   7 @L optional W$ 6   >?WTB%4#=&.[8N"O %$'&  ?,$7TB,.& 6  >, 6 7# Y7  ^ & 6  #  %$'&   ,  I_  4.,Q, 6 ,q  TH 4$-,' # #]\j # >?WY1$9 ,.,. KY9 , 6 ID V al  Cont , 6 +,,'-BYW # # & ?,.  ?WLM ?2 %Y7U;\@$' 0;4K-%Y9 ,' > P _  .2#[pS $J,. > 2 .Y1U;\@$'  8NdO I J4 .YW- ,. > LM?T!%Y7 ?$fA n C s@EV& & ?,'',.h  8N)], 6 2 %Y1$ $J,.  >i,%"LM B +,,'-  I0 ?# 4 ^ $ & 8N)]@, 6 I   $ 6  2" 0.U ,'" I  d#=&.I > +I,. ?#*>?%$P   TH #=W pS%$N - $&- \Sa  A D r D tEcP     >P*D , 6 N # 4K8N_ )]@, 6 2 .Y9$Z U Y 6  & 6  +%$' ?,9, 6 b#=->?%$-,98N)], 6 +@,-,. $1, 6 , b&   2IGLv- ?T , 6 9;4!    V>1PD V7, 6 %4V$J,.  LM q# #S I%$qY 6  & 6 T$J,\SW.,.  ^ $'& 6 @$1, 6    I%$  LM ?1# #FCont I%$9Y 6 $'Z2?#  $9 ?1& ?,' ,ZZ @,  KLM  , 6 WI cJ4 e \I,GY h6  & 6 TV_ $J,G\j9,'ID ' 2 -$' B\ 4H, 6 WI J4 ^ $& 6 @$q, 6   I%$7%,'&?P  !, 6  $ &@$' , 6 Z # 4!Y1 4!,. V$-Y7 q, 6 WI J4K $q,' +j JLM 'Tgm2X3J  $WKabb&dJ,' $=?%$P 1 Y7.2 , 6 %$'V+,,'-$V @,V# # Y1 a,. K, 6 [& - $+j  >K2 %Y1$P 516  $$J,. # # - %$

TB 4H# B,. b-?\ &[$TH  ,. &Z @L*+@,-,. $7$'& 6 @$7, 6 $   ii]_  >PxD P LM 'TH # TH #x $9   ,.  $'-Z, 6 Z+,,'-$ 6  2Z'pS &.,.# 4 @$7I J4H$'\S'p+%$$ $ ? Y 6 B, 6  $7 $1 @,G, 6 &$ ,. b& ?TH+I,'W& ?TH+S $.,. >H&.,. $9  , 6 V2 .Y9$P  Z $-,. &  & $'     ] >PXD P 1- , 6  a, 6    $ -H +I,'  # %$'&   ,'$V @Lq, 6   I%$ a$'V 0,. B, 6 !eI J4P 1 Y7.2  ,' , 6 H;4i $J,. ib $J,.-&%,'   # TH ?,N$ 6 #  ++S h, 6 !%$'# ,  L1 ,'$ &  $J,.  b" $W @,   $ @, 6  2  $&  ,'$P 516  $ e +S &.4a $ @,pS+- $-$' d\ 4

d d → e V pS+- $-$' \# [\ 4P  # #  $&  ,'$N LQ, 6 N$'?T! T$-,Z\Sb I,.+I, ,' >., 6  9  ] >SPFD +@,-,. "TH # $9& $' -c" fA t D E @$1Y7# # @$ 7,'$x '$'#=,' aTH., 6   N,.?%$-,' >B ?_,. V& &  ,@P

V , 9 3-  $ 6

3 3.1

Data model, algebra, and query language Data model

xO ., \jW! Im ,'W# + 6 \S%,1 \S7,;Y7 [ $v3J  ?,G$'\$%,$G L P $+j &  # & $J,%?,   @A,.%$G, 6 9TH+I,;4V$-,' >P y q2 .Y L!I8NdO0 I& TH ?,7@$7 H'A  K#  \S # BA ?--V,.- P  4H I 6 @$G,.> & ?-%$'+S  >,. W, 6 9 # TH ?,q ,,' \I,'9 TH  HTB%4 6  2W2  # ?P G,,' \I,'WB # T! ,q TH%$q'>?9 2  Y 6  # 12?#  $G >?9 2I  P 516  @L] I \S # >?$b,' i $ \I,%  k\ 4h& ?&L,. ,. >0, 6 ,.p,& ?,' ,VA @L9# #& value 6  #  f L   n n  &T! , A   , 6 [- $# ,ZTB%4K\S  L], 6 [ I[ I%$ , 6  2Z,.p,9& 6  # -xP 516   content @L7B, 6 b,%H,.+#  $PSRW, 6 G$-,% +S@,' -$W-W, 6 [&rJ,' $=K+ I&., , 6   _ B, 6 $%,7 j  & ^ Y 6  & 6   @,9 # =T! ,.+#  &,.%$ P _ × ykW& $' ∪ 9+  &,.%$W @L], LM 'T \  6

Y 6 - c  $9b& $-,.?,@P θ >%$ A θ c A θ A i i j 2I , 6 & ?T!+ ,. -$

  ≺≺ # 4K ++# 4?  , 6 T I j- logical ,W# >?  structural , 6 T!$7& K\Sjoins %2 $' hA C @EcP Ox., \S0$ ?TH,. ?TH & ,,' \I,' $PX + @3J &%,'  \?4 .Lv# ,bA  1$[, A @2,[, ..#  .TH,  Ak,.+#  &,' $P  +# r &,'%U; # =T! ,. >0+- @3- &.,. π$A1,A2,...,Ak -b$' >?#  h(r) I,N\ 4 N$+j -$&- +I, $7  0 P 576 >?- ?+IUc\?4K +S,. 

 K%$-, uB Y 6 - B  $9 & # #  &%,' ,,.- \I,. π6  2W, 6 W$ #x$TH?,. & $ZA D.γEcP A ,A ,...,A ykG$'7, 6  T!%,.U; +j ',.  ,. Z%m9 # > \' &W +S,. -$XY 6  & 6 ++# 4  $J,.  ,'+# %$POx%, \jQmap GJ4W ?+S ',.  q& ?# # &.,. ?[,,' \I,' inside  r.A1 .A2 . . . . .Ak−1  $1b ;4K +S@,'   r.A1 .A2 . . . . .Ak K,' T! &op,,' \I,'?P 576 



1

2

map(op, r, A1 .A2 . . . . .Ak )

k



– –

L L

P

k = 1 map(op, r, A1 .A2 . . . . .Ak ) = op(r) LM 7.2;4H,'+#  k > 1 L FLM ?G.2I J4B& ?# # &.,. ? t ∈0 r  0

 $7#  TH ,. xP • R9, 6  JYW $' ",.+#  0 r $B∈-%t.A ,'1  map(op, \I,%  r ,Lv-A ?T2 . . . .\?.A 4 k )+= #=&∅ >kt .2I J4 & ?# # &.,. ? • t t YW , 6 P r0 ∈ r.A1 map(op, r0 , A2 . . . . .Ak )

 j $-,%&  # ., \SX $J,. [-# @,' xP 516   # 4B.,.-$X, 6 $' r(A ,.+#1 (A $ 11Lv ,QAY 126  &),6 A2 ) 2  # Z   $ ^ map(σ pS $-,. ?,'=5 =#j,$r,THA 1 .A ,. &%11 $ ) r t some t.A .A 5 1 11 V &  $], 6 %$',.+# %$q&&  >?# 4P ++#  %$X$' TH #=-# 4N,'  PX49$'#  > 6 ,q\$'_  M ap @L  @,%,.  Y7GYW # #.LM*,' @$ π γ u P  Q $-,%&  map(op, r, A 1 .A2 . . . . .Ak ) , 6 W$. TH+# W$#  &.,. \j 2IWYW # # \S @,.  P opA .A .....A (r) σA .A =5 (r)  J4< +j ',. J$9W$=TH #  -# 4B'p,.  2=

,. b%$-,'B,'+#  $ ^ .,% # $1 ?TH ,-,.  P ?+S ',. QYW+$G H +I,q,.+# 1map  ?,' $ >?# W+  &W @Lj8NdOK,''p, \ 4H>?# _  > 576  ^ $ ?THtempl ,' >., 6   xml 0 L  ,$,,' \I,' $  +j $$ \# 4  > ,%@> $ @$K$'+S&  m  \ 4 , 6 ,%@> > > ,'T!+#=,. P _ Z.2;4,.+#  Y 6 $'V,% 6 @$N #  I4d\S a>? +j ha$-,'&.,.- templ , 6 $ I,'+I,'$q 2?# qt Y 6  & 6  $*, 6 7& ?,' , @L, 6 G%Y9# 4!&-,' # TH ?,@P y 6  #  xml templ , 6  $] $ $'#  > 6 ,.# 4b j- , Lv- ?T A%Y  I& $-,'&.,.  ^ @$  I%$Q @,*&-,'1.Y     ,. ,;4 Y1N$'N , 6 -[LM 1$'=T!+#  &  ,;4"Y9 , 6 I,Z# $xml $ @LQtempl >  '#  ,;4P Q# TH ?,9& $J,.&.,. ? +S,. -_ $1&# $,. [8W:Z J4@,' K\'& 6 %$1 &# $B  TB%4! &# Z& ?TH+# p $ 6  2 > & 6  #  Y 6  #  .,.-$9, 6  %$'&  ?,$N @L \ a >?$P 516  $ $x c $x//b & #=@$$X&+I,.- $ 8N)], 6 'p+%$$ $q b, 6 G&@$'qY 6 b q, 6 G& ?,''p,q#  $J$x ,q $Q \I,%  !LM ?T relative $ ?TH12?  \# N\  > $Pyk  ,', 6 W$'.,9 @L*'p+%$$ $ ^ D   ^ C \S 2b@$ , 6 W$'.,1 L _ , 6  _ & &,',.P ?   ,' +, 6 pS+- $-$' $P  ?4N,JY7 Z'p+%$$ $ 

(3) e e ∈ Q 2  # TH ?,[& ?$J,.-&%,' ?J$b @L, LM 'T # $' \S# ?> $N,' P L  1 6



e1 , e2 \S# >V,. Q P(4) #t#x∈'p+L%$$ exp $W @LF, ∈6 9QLM ?# # YW >!LM 'T \S# >V,. hti{exp}h/ti Q (5) Q LM     Y 6 $x - 1 p1 $x2 p2 . . . ,$x  k pk xq pk+1 θ1 pk+2 ... pm−1 θl pm -%,' q(x1 , x2 , . . . , xk ) Y 6 -

  4 $-,%;,'$9  , 6 7LM- T , 6 [ I @, @L*$' ?THZ I&.U T! , p1 , p 2!,LM.. ?.T , pdk ,2?pk+1 =\,#  . . . , p  m,.- I∈P&    p, i6 0I J4 \j.LM  $' ?TH xl θ1,..;.4, θTHl  4? +Si $  6 -1%Uc,...,.. ,h & $-,'&.,. K# TH ?,$P 516 I J4-D # # $J,.',' $1 ?q$'++S J,'? \'N+-%2 $'# 4ipS+- $-$' $ $\jpS+- $-$' $7& -%$'+S ? >b,. W,. G+@,-,. $P 576 9# >? \' &9-#  $ -1 ,.1$J,.  > 6 ,LM JY1xP 576 ? H+,-,. h$'\S'p+%$$ $H $V,' ? \' &a8N)*, 6 - %$ def 0

Y 6    $B, 6  @,-,.- \I,.Lv- ?T , 6  -#=,'  alg(q) & %$'+S= πID  > ,. (π (f ull(q))) P   & d8N)*, 6 ID - $last # ,'$ ID e ret(q) d LM- ,.# 40 B,' V\jZ.,.- b$'   #   KLv T  P >P ,' V\j$ 6 Y9,' V$'  G$ ,W 0 yk \Z$;2 &  Y7X& $'   , 6  @,-,.- \I,. $ -# # 4N-%,'  , 6 $]$'  , 6 Q,. $#=,. xP C,  πC yk9 Y LM I& $9 V   # ? >    \      Z & M L     . & . ,  B  M L 7   +   ,       %  $ 6 6

  + >B ,. eq .ID f, ull(q) v L  ? T ,     # @  ' ,     ! ,   > ! ,  h  K    S \   2 BLM ?T#=P y 6   6 6 6  ,''$'# @,'  ?2e a# 2 $  $J,.0 @L P?y # $ 6  2 ≺

≺≺

Y 6 #

 $, 6  ,6 

ret(q)  $V +#=& ieaY9.ID , 6 // /



def

f ull(q[text() = c]) = σV =c (f ull(q))

 L ∈ P  q2 ∈ 8N)*, 6 %2 >q,.1 K $-,'+ Y1 6  2

{/,//,∗,[ ]}

 $W-#=,' 2[+, 6 pS+- $-$' $-,. ;,. >Y9 , 6 b& 6  # 



def

f ull(q1[q2 ]) = f ull(q1) B@,' Kret(q $-,. + 1$-),%;//q ,. >b2LM ?T , 6 W I @,  !Lv- ?,7 @L  & - $+j $,. , 6 [mJ$-,  @L P 1-0LM ?T  Y ? Y7N& $'  N# # q-2#=,' 2 e+2.ID , 6 pS+- $-$' $V$-,.J,ZYW , 6 K& 6  # i$J,. +xqP 2 Lq, 6 VmJ$-,[$-,'+f $Z,. 0K $&  , $ 6 ?# k\S ≺ -+#  &  BYW , 6  H, 6 W,''$'# @,' xP ≺≺ Ox., \SG72?=\# G\S ?b,' 7, 6 q%$'# ,X @LSI J4   \jG9-#=,' 2G+, 6 'p+%$$' ? q$x q $J,%J,' >$x YW , 6 b& 6  # K%2 >,. K$-,'+xP 516   

def

f ull($x q) = f ull(q$x) BCe1 .ID ≺ e2 .ID (f ull(q))

Y 6 -  $W, 6   & - $+j  >,.    $W, 6   & - $+j  >,. ret(q$x ) e2 .ID , 6 9,. +P D P++# 4 >!, 6 Z\j 2IpZ$x= #  $ Y7a/∗  ?\Ip,.$y  = $x b p$z = $y e



p$t = $z f _ 

f ull(p$x) = ea BCe .ID≺e.ID e f ull(p$y ) = eb f ull(p$z ) = f ull(p$y ) BCe .ID≺≺ e .ID ed a

$y



1 Y

d

f ull(p$t ) = f ull(p$z ) BCe$z .ID≺≺ ef .ID ef

& $'  7, 6 +, 6 pS+- $-$' $ p1 = $x//c p2 = $y//e

 # $' p,.&%,'BLM ?T , 6 Z;4K  ] >PxD Pyk 6  2

p4 = $t//h

.ID≺ ≺ e .ID ec

p3 = $t[g/text() = 5] 

f ull(p1) = f ull(p$x) BCe$x f ull(p2) = f ull(p$y ) BCe$y .ID≺≺ ee .ID ee c f ull(p3) = f ull(p$t) BB,' N, 6 .,.-K I%$W   H, 6 N\S 2 , 6 9,.'$#=,. $W @L e$y e$z  e$t P 516  e pS+%$$ $$ # 4< ?\I,. Z#  Element constructors





def



alg(hti{q}h ti) = xml(n(alg(q)),htiA1 h ti)

for $x1 in p1 , $x2 in $x1 /p2 ,. . . , $xk in $x1 /pk xq1 where $x1 /pk+1 θ $x1 /pk+2 and . . . $x1 /pm−1 θ $x1 /pm return $x1 /pm+1 , $x1 /pm+2 , . . . , $x1 /pn def

f ull(xq1 ) = σAk+1 θ Ak+2 ,...,Am−1 θ Am (f ull(p1 )BCID1 ≺ID2 f ull(//p2 ) . . . BCID1 ≺IDk f ull(//pk ) n BCn Cn ID1 ≺IDk+1 f ull(//pk+1 )B ID1 ≺IDk+2 . . . BCID1 ≺IDm f ull(//pm ) n n BCID1 ≺IDm+1 f ull(//pm+1 )BCID1 ≺IDm+2 . . . BCn ID1 ≺IDn f ull(//pn ) ) def

alg(xq1 ) = πAm+1 ,Am+2 ,...,An (f ull(xq1 )) Fig. 3. Generic XQuery query with simple return expression.

Y 6 -0, 6 d%$-,K +S,.  +&.$# #9,.+#  $BLM- T   h$ >?# ",'+# "YW , 6 a$' >?#  & # #  &%,'  ,-,. \I,.0?T! n P 516 "$&   J>?THalg(q)  ?,K @L, 6  ?+S ',.  > ,'T!+#=,.   &,. >, 6 ,2?# A%1$G @L , 6 9@,-,.- \I,.W TH  ,. N\S9+&. H   # TH ?,'$P 6  21xml A1 t J, 6 TH 

P f ull(hti{q}h ti) = n(f ull(q)) 576 7,.$'#=,. ?B# %$XLM ?X$& 6 ;4BpS+- $-$' $1 For-where-return I,'#      ] >SP*sf expressions ] >SP]tP  b$'=T!+#  &  ,;4 , 6 %$'K#  $H$'"$ >?#  $-4TV\S #qLM V$' ?TH θ -\ ,. ;4 +j @,' ,.=# # 4K j- , & ?TH+ $ < +j ',. J$P  ?  &[I J4 1 ^   >Ps _ +, 6 pS+%$$  p1  $ (1) \$ ?# Simple I,. Y 6  # !return  # #* @, 6  -$[clauses N #=,. 2V"$J,%J,WLM ?T !Ixq  J402?=\#  P7,,' \I,' & - $+j $,. P 576 9I J4!-%,'$$ T!q2  -=\# %$P?7,,' \I,' $x1 $Q, 6 9,,' \IID ,'9 1 & ret(p %$'+S 1) >N,' 1, 6 ,. + I7 L Lv ]%2 J4b+@, 6 'p+%$ID $' ? i   P fG,ull(//p pi.,.-h\ 4", 6 H# >? \' &,.p i$#=p,.2 ,. $ . . @,Lqp, m ,' \I,' $ i)      ,  ' $ V   6 6

-# @,' 21+, A pS+%$A$ k+2 $G .Lj., 6. ,qA Y 6 m 1& #=$' TH q+ &  $' # 4 , 6 9,,' \I,' $q  6 k+1

P X& 6  $   +j  >H  P 1alg(//p @,.1, 6 @,7k+1 & )

alg(//p  $*k+2  ! )[LM. . ?. ,Q, @alg(//p L$'& 6 9m )#=,. 2Iq+A, 6 k+i pS+- V$-$'  C  $XV\$' p# k+i I,'7pS+- $-$'  , 6 $ // //p k+i ,''$'# @,.\# q,' G, 6 q # > \'P 516 & 6  # N 2 >,. ?$J,. +N& ? &.,. > ['p+%$$' ?  $W&+I,' "\ 4 $ZLM 7Y 4)6  & 6 [- $# ,'$1 @L 6 @$WT!+I,;4B- $# ,@P 57xq6 [4 I,' 3-   $1 $J,.  \j & $b# #j, 6 $x f>? wr($x) f, wr($x) ,. BLv 1N> 2I  TV$J,7\jZ & #  N$ >?#   # T! ,@P   6  Z

$x hai alg(xq ) @> 0,;Y7 !&@$'%$P  L   $ @,Z& $-,.-&%,Z.Y  # TH ?,'$ "$' &  I%$ 4, 6 ,%@> > > ,'T!+#=,. $9$=TH+#  f wr # T! ,@P  L # $' !& ?$J,.-&%,$9$' ?TH # TH ?xq ,'$ 4 , 6  +j ',.  hai f wr  \ # $9[\ > >? Q,%@> > >b,.TH+#=,. \?4H &# $ >  B xml  # TH ?,P alg(xq 2BLM- Tgwr) , 6 [\S 2hai # %$P 576 9,.'4)$#=,.  @L*TH 9& ?TH+# p - %$9& K\S- templ(f Q OF.,$W,.$'#=,.N, 6 bI J4     >P D ^ # $ B & # # , 6 b+, 6 pS+- $-$' $b@,W, 6  Example.  K @L  &.,. ?Hq, 26  

for $z in p$z where p3 return hres3i{ p4 }h/res3i ,''$'# @,' # 9LM ?T ]  >Pt Y7Z \I,.   xq4 

q2

PjOx%,9$1mJ$-,G,.'$#=,.

f ull(q2 ) = f ull(p$z ) BCne$z .ID ≺ e3 .ID f ull(//p3) BC@ne$z .ID=e$z .ID f ull(//p4)

++# 4 >!, 6 

xq4

# Z@>  ,;YW &  LM ?

q

# $7,'



Fig. 5. Sample query performance measure.



 #   >

(∗) f ull(q) = f ull(p$x) × f ull(p$y ) BC@ne$x .ID=e$x .ID f ull(p1) BC@ne$y .ID=e$y .ID f ull(p2)BC@ne$y .ID=e$y .ID f ull(q2) q 

$G,''$'# @,'  7Y  6 %2I

alg(q) = xmltempl (f ull(q))

Y 6 -

$

xmltempl 

hres1ie1 .Chres2ie2 .Ch res3ie3 .Ch/res3ih/res2ih/res1i

Y 6 - W, 6  ,,.- \I,.%$W& %$'+S  >B,' , 6 [+, 6 pS+- $-$' $ , p 2  e1^ , .C,  e3&.C  >0-%,' C "  $ P RZ\$;2IN, 6 ,Z B,%!%$-,.-&%,' >

,' N, 6  I 2?#  ?,7LM ?T

f ull(q)



σ(e$z .ID6=⊥)∨(e$z .ID=⊥∧e2 =⊥) ( f ull(p$x) BC@ne$x .ID≺≺ ec .ID ec × f ull(p$y ) BC@ne$y .ID≺≺ ee .ID ee BC@ne$y .ID≺≺ ed .ID (ed BC@ned .ID≺≺ ef .ID (ef BCnef .ID≺≺ eg .ID σV =5 (eg )BC@ned .ID≺≺ eh .ID eh )))

Y9 , 6 , 6 b @, 6 9$\jpS+- $-$' $b$J,.&.,.# # 4 576  $Z.YW- ,. > 6 $>? +S0,. >?%, 6  ^ , 6 ]3J ? H+- I&.,G\j.LM ?-7f, 6 ull(p -# @,'b,'  $x P  , ) 6 $G# $ Z>?- ?+S  V, 6 7$'\SpU ^ , 6 b#=× +- $-$' $Z$-$x ,'&.,.'# # 4d-# @,'K,. @$-,W_ ,;Y7 B#   $ P  ,9,.-$[ fIull(p ,W, 6 @$y ,W, )6  $b& ?-%$'+S  pj&.,.# 4H,. [, 6 Z # > \' &Z$'TB?,' &%$y $7 @L]+,,'-$  _     >?D P 576 $ V10

V11



alg(q) = xmltempl (σ(e$z .ID6=⊥)∨(e$z .ID=⊥∧e2 =⊥) (V10 × V11 ))

576 

σ

 $1b\?4IUc+ I&., @LF,.$-LM 'TH >b, 6 #  ,;43J  $9 

5

Evaluation

(∗)

 B$J,.&.,.#I3J ? $P

576  $9$&.,. +%$' ?,$\ .LQ%2?# ,' " @L], 6 Z+j JLM 'TH & Z LQ 1;4 - , 6 T 9, 6 *Crq8N0bA C CE Pattern - %$P 516 extraction [ >!,.=T!time 1Y1$W # Y9%4$W# %$$G, 6 hD w rHT!$ ^ Y1[[ Y1Z LQK ++j ;,. ,;4,.  TH+ 2IG, 6  $XLMJ, 6 q\ 4B # =T! ,. >N$' ?THG$'+- $1 \I3J &%,7$'   #  @,' B$-,'+$ P 516  $G& ?Im'T!$ , 6 ,9+,,'-SP y W& ?TH+1, 6 +j JLM 'TH &  @LF, 6 9LM ?# # YW >d b++ ?& 6 P/GT!T\S , 6 ,Q, 6  $.YW- ,. >[ $Q ?# 4!\#  H\ 4V Q#=J>7+,,. -& >? ,. B # > - , 6 T 516 G%2?# ,'  ,'=TH9LM G2?- $9$' %$9 @L*8N0B &T! ,'$-[ + &.,. 0  ] >P w $' >Kb$' >?# 2 .Y # $ ,' BV >B,.=T!mI2I,.=T! $7$ 6 J,'G, 6 " LQ, 6 -2 .Y9$Z-N$' xP 516  $$=TH+# [I J40 $Q3-$J, pj TH+#   & \'   ,'$N I2   ,%@>?B $, 6 ,b ,[ & >?  $#  J>?Z+,,. $Z, 6 h+.2 ?$[Y1 -$ , 6 $[ ,[>? 2 $bTH  ++S J,. ,. %$9LM ?7.YW- ,. >PSR9# >?  , 6 T 6 @$1 >?#  >? \# b 2 6  B, 6 ZI J4?$1 ,1\# %$9&K\ >!=T!+j J,.?,7+j JLM 'TH&\S.m,'$P

References 1. S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995. 2. S. Al-Khalifa, H.V. Jagadish, J.M. Patel, Y. Wu, N. Koudas, and D. Srivastava. Structural joins: A primitive for efficient XML query pattern matching. In ICDE, 2002. 3. S. Amer-Yahia and Y. Kotidis. Web-services architectures for efficient XML data exchange. In ICDE, 2004. 4. A. Arion, V. Benzaken, and I. Manolescu. XML Access Modules: Towards Physical Data Independence in XML Databases. XIME-P Workshop, 2005. 5. A. Arion, V. Benzaken, I. Manolescu, and R. Vijay. ULoad: Choosing the Right Store for your XML Application (demo). In VLDB, 2005. 6. K. Beyer, F. Ozcan, S. Saiprasad, and B. Van der Linden. DB2/XML: designing for evolution. In SIGMOD, 2005. 7. M. Brantner, S. Helmer, C.-C. Kanne, and G.Moerkotte. Full-Fledged Algebraic XPath Processing in Natix. In ICDE, 2005. 8. N. Bruno, N. Koudas, and D. Srivastava. Holistic twig joins: Optimal XML pattern matching. In SIGMOD, 2002. 9. Z. Chen, H.V. Jagadish, L. Lakshmanan, and S. Paparizos. From tree patterns to generalized tree patterns: On efficient evaluation of XQuery. In VLDB, 2003. 10. B. Cooper, N. Sample, M. Franklin, G. Hjaltason, and M. Shadmon. A fast index for semistructured data. In VLDB, 2001. 11. A. Deutsch and V. Tannen. MARS: A system for publishing XML from mixed and redundant storage. In VLDB, 2003. 12. H. V. Jagadish, S. Al-Khalifa, A. Chapman, L. Lakshmanan, A. Nierman, S. Paparizos, J. Patel, D. Srivastava, N. Wiwatwattana, Y. Wu, and C. Yu. Timber: A native XML database. VLDB J., 11(4), 2002. 13. H. Jiang, H. Lu, W. Wang, and J. Xu. XParent: An efficient RDBMS-based XML database system. In ICDE, 2002. 14. R. Kaushik, P. Bohannon, J. Naughton, and H. Korth. Covering indexes for branching path queries. In SIGMOD, 2002. 15. I. Manolescu and Y. Papakonstantinou. An unified tuple-based algebra for XQuery. Available at www-rocq.inria.fr/˜manolesc/PAPERS/algebra.pdf. 16. G. Miklau and D. Suciu. Containment and equivalence for an XPath fragment. In PODS, 2002. 17. F. Neven and T. Schwentick. XPath containment in the presence of disjunction, DTDs, and variables. In ICDT, 2003. 18. P. O’Neil, E. O’Neil, S. Pal, I. Cseri, G. Schaller, and N. Westbury. ORDPATHs: Insert-friendly XML node labels. In SIGMOD, 2004. 19. S. Paparizos, Y. Wu, L. Lakshmanan, and H. Jagadish. Tree logical classes for efficient evaluation of XQuery. In SIGMOD, 2004. 20. C. R´e, J. Sim´eon, and M. Fernandez. A complete and efficient algebraic compiler for XQuery. In ICDE, 2006. 21. XQuery 1.0. www.w3.org/TR/xquery. 22. The XMark benchmark. www.xml-benchmark.org, 2002. 23. W. Xu and M. Ozsoyoglu. Rewriting XPath queries using materialized views. In VLDB, 2005.

Suggest Documents