
ON THE CONTAINMENT PROBLEM FOR QUERIES WITH SAFE NEGATION

Victor Felea

“Al. I. Cuza” University of Iasi, 16 G. Berthelot Street, Iasi, Romania
E-mail: [email protected]

Abstract. We consider the problem of query containment for conjunctive queries with the safe negation property. A special class of queries is specified. For this class an algorithm to test the containment problem is given. The time complexity of this algorithm is presented.

1. Introduction

The problem of query containment is important in many data management applications, including query optimization, checking of integrity constraints, analysis of data sources in data integration, verification of knowledge bases, finding queries independent of updates, and rewriting queries using views. It has attracted many researchers: Cohen [4, 5], Farre et al. [8], Florescu et al. [11], Halevy [12], Huyn [13], Lausen and Wei [14], Leclere and Mugnier [15], Chen [16], Ullman [21], Wei and Lausen [22].

In [21] J. D. Ullman presents an algorithm based on canonical databases, which uses an exponential number of such databases. In [22] F. Wei and G. Lausen propose an algorithm that uses containment mappings defined for the two queries. This algorithm increases the number of positive atoms of the first query of the containment problem.

Many authors study the problem of query containment under constraints. Thus, in [8] C. Farre et al. specify a constructive method to check query containment with constraints. N. Huyn et al. consider the problem of incrementally checking global integrity constraints [13]. Some authors approach the containment problem for applications in Web services: A. Deutsch et al. in [6]. The containment problem for conjunctive queries is investigated by M. Leclere and M. L. Mugnier in [15], who use graph homomorphisms to give necessary or sufficient conditions for query containment. The reduction of the containment problem to equivalence for queries with expandable aggregate functions is realized by S. Cohen et al. in [5]. The query containment problem is used for rewriting queries using views by F. Afrati et al. in [1] and [2]. In a recent paper [9] the author introduces and studies the notion of strong containment, which implies classical containment for two queries in conjunctive form with negation.
Checking containment of conjunctive queries without negation (called positive queries) is an NP-complete problem (A. K. Chandra [3]). It can be solved by testing the existence of a containment mapping between the two queries. For queries with negation, the query containment problem becomes $\Pi_2^p$-complete.
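The containment-mapping test for positive queries mentioned above can be illustrated by a minimal brute-force sketch. It is not the algorithm of this paper. It assumes a hypothetical representation in which a query is a pair (head variables, list of atoms), every atom is a pair (relation name, tuple of variables), and the queries contain no constants. The function name `find_containment_mapping` is chosen here for illustration only. The search enumerates all variable assignments, so it is exponential in the number of variables of the second query.

```python
from itertools import product

def find_containment_mapping(q1, q2):
    """Return a mapping from the variables of q2 to the variables of q1
    that witnesses "q1 is contained in q2", or None if there is none.

    Assumed representation (hypothetical): a query is (head_vars, body),
    body is a list of atoms (relation_name, tuple_of_variables).
    Both queries are assumed positive, safe, and free of constants.
    """
    head1, body1 = q1
    head2, body2 = q2
    if len(head1) != len(head2):
        return None
    atoms1 = set(body1)
    vars1 = sorted({v for _, args in body1 for v in args})
    vars2 = sorted({v for _, args in body2 for v in args})
    # Try every assignment of q1-variables to q2-variables.
    for image in product(vars1, repeat=len(vars2)):
        m = dict(zip(vars2, image))
        # The head of q2 must be mapped onto the head of q1.
        if tuple(m[v] for v in head2) != tuple(head1):
            continue
        # Every atom of q2 must be mapped onto some atom of q1.
        if all((rel, tuple(m[v] for v in args)) in atoms1
               for rel, args in body2):
            return m
    return None

# Q1: H(x) :- R(x, y), R(y, x)      Q2: H(x) :- R(x, z)
q1 = (("x",), [("R", ("x", "y")), ("R", ("y", "x"))])
q2 = (("x",), [("R", ("x", "z"))])
print(find_containment_mapping(q1, q2))  # {'x': 'x', 'z': 'y'}
print(find_containment_mapping(q2, q1))  # None
```

In the example, Q1 is contained in Q2 because z can be mapped to y, while no mapping in the other direction sends both atoms of Q1 into the single atom of Q2.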

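In this paper we introduce and study a special class of queries. The time complexity of the containment problem for queries from this class is discussed. The paper is organized as follows: in Section 2 we give some definitions and notations used in the sequel. In Section 3 we define a special class of queries and point out some properties of this class with respect to the containment problem. In Section 4 some aspects of time complexity for the containment problem associated with the class specified in Section 3 are presented. Finally, a conclusion is presented.

2. Preliminaries

Consider the two queries $Q_1$ and $Q_2$ having the following forms:

$$
Q_1:\ H(\bar{x}) \leftarrow f_1(\bar{x}, \bar{y}_1) \quad \text{and} \quad Q_2:\ H(\bar{x}) \leftarrow f_2(\bar{x}, \bar{y}_2), \text{ where}
$$

$$
\begin{aligned}
f_1(\bar{x}, \bar{y}_1) &= R_1(\bar{w}_1) \wedge \dots \wedge R_h(\bar{w}_h) \wedge \neg R_{h+1}(\bar{w}_{h+1}) \wedge \dots \wedge \neg R_{h+p}(\bar{w}_{h+p}), \\
f_2(\bar{x}, \bar{y}_2) &= S_1(\bar{w}'_1) \wedge \dots \wedge S_k(\bar{w}'_k) \wedge \neg S_{k+1}(\bar{w}'_{k+1}) \wedge \dots \wedge \neg S_{k+n}(\bar{w}'_{k+n}).
\end{aligned}
\tag{1}
$$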

The vector $\bar{x}$ consists of all free variables of the queries $Q_1$ and $Q_2$; $\bar{y}_1$ and $\bar{y}_2$ are vectors that consist of all existentially quantified variables of $Q_1$ and $Q_2$, respectively. The symbols $R_i$ and $S_j$ are relational symbols, the $\bar{w}_i$ are vectors of variables from $\bar{x}$ or from $\bar{y}_1$, and the $\bar{w}'_j$ are vectors of variables from $\bar{x}$ or from $\bar{y}_2$.

We make two assumptions on the variables of the queries $Q_1$ and $Q_2$. The first one is: each variable that occurs in the head also occurs in the body. This property is called safeness of queries. The second one is: each variable that occurs in a negated sub-goal of a query also occurs in its positive part. In the following definition we give the answer of a query for a database.

Definition 1. Let $Q_1$ be a query as in (1), Dom a value domain and $D$ a database defined on Dom. We denote by $Q_1(D)$ the set of all $H(\tau\bar{x})$, where $\tau$ is a substitution for all variables from $\bar{x}$ such that there is an extension of $\tau$ to all variables from $\bar{y}_1$, denoted $\tau_1$, such that $D$ satisfies $f_1(\bar{x}, \bar{y}_1)$ for $\tau_1$. Formally,

$$
Q_1(D) = \{\, H(\tau\bar{x}) \mid \exists\, \tau_1 \text{ an extension of } \tau \text{ so that } D \models \tau_1 f_1(\bar{x}, \bar{y}_1) \,\} \tag{2}
$$

The notation $D \models \tau_1 f_1(\bar{x}, \bar{y}_1)$ means: $\tau_1 R_j(\bar{w}_j) \in D$ for each $j$, $1 \le j \le h$, and $\tau_1 R_{h+i}(\bar{w}_{h+i}) \notin D$ for each $i$, $1 \le i \le p$.
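Definition 1 can be read operationally, as in the following minimal sketch. It is an illustration under assumptions that are not in the paper: a database is a hypothetical dictionary from relation names to sets of tuples, atoms are pairs (relation name, tuple of variables), and the answer is returned as the set of value tuples for the head variables. The function name `answers` is chosen here. The sketch relies on the two assumptions stated above (safeness and safe negation), so every variable is bound once the positive atoms have been matched.

```python
def answers(head, positive, negated, db):
    """Set of head tuples for a conjunctive query with safe negation.

    head:     tuple of head variables
    positive: list of atoms (relation_name, tuple_of_variables)
    negated:  list of atoms that must NOT hold in db
    db:       dict mapping a relation name to a set of tuples (hypothetical)
    """
    def extend(sub, atoms):
        # Extend the substitution `sub` so that all `atoms` are facts of db.
        if not atoms:
            yield sub
            return
        (rel, args), rest = atoms[0], atoms[1:]
        for fact in db.get(rel, set()):
            if len(fact) != len(args):
                continue
            new = dict(sub)
            # setdefault binds a fresh variable or returns its earlier value.
            if all(new.setdefault(v, c) == c for v, c in zip(args, fact)):
                yield from extend(new, rest)

    result = set()
    for sub in extend({}, list(positive)):
        # Safe negation: every variable of a negated atom is already bound.
        if all(tuple(sub[v] for v in args) not in db.get(rel, set())
               for rel, args in negated):
            result.add(tuple(sub[v] for v in head))
    return result

# H(x) :- R(x, y), not S(y)
db = {"R": {(1, 2), (2, 3)}, "S": {(2,)}}
print(answers(("x",), [("R", ("x", "y"))], [("S", ("y",))], db))  # {(2,)}
```

In the example, the substitution x=1, y=2 is rejected because S(2) is in the database, while x=2, y=3 is accepted because S(3) is not.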

Definition 2. We say that the query $Q_1$ is contained in $Q_2$, denoted $Q_1 \subseteq Q_2$, if for each value domain Dom and for each database $D$ on Dom, we have $Q_1(D) \subseteq Q_2(D)$.

Definition 3. A query $Q_1$ as in (1) is satisfiable if there exist a value domain Dom and a database $D$ on Dom such that $Q_1(D) \neq \emptyset$; otherwise it is unsatisfiable.

Proposition 1 [22]. A query $Q_1$ (or $f_1$) having the form as in (1) is unsatisfiable if there exist two atoms, one from the positive part of $f_1$, namely $R_j(\bar{w}_j)$, and the second one from the negated part of $f_1$, $R_{h+i}(\bar{w}_{h+i})$, such that these atoms are identical, that is, $R_j = R_{h+i}$ and $\bar{w}_j = \bar{w}_{h+i}$. □

Let us denote by $f_1(\bar{x}, \bar{y}_1) = \bot$ the case when $f_1$ is unsatisfiable. Since in this case we have $Q_1(D) = \emptyset$, it is sufficient to consider only the case when $f_1(\bar{x}, \bar{y}_1) \neq \bot$.

In the following we need to consider sets of equality relations defined on the set $C = \{x_1, \dots, x_q, c_1, \dots, c_m\}$, where $x_j$, $1 \le j \le q$, are all variables from $\bar{x}$ and $c_i$, $1 \le i \le m$, are all variables from $\bar{y}_1$. Let us denote by $M$ a set of equality relations on $C$. We express the set $M$ as $M = \{(t_1, t_1'), \dots, (t_s, t_s')\}$, where $t_i, t_i' \in C$. Let us denote by $M^*$ the reflexive, symmetric and transitive closure of $M$. It follows that $M^*$ defines a set of equivalence classes on $C$. We denote by $\hat{c}_i$ the class that contains the element $c_i$. We need to specify a total order on $C$; let us consider this order as $x_1 <$ …
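The syntactic test of Proposition 1 and the equivalence classes induced by the closure M* can both be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: the names `has_clashing_atoms` and `equivalence_classes` are chosen here, atoms are assumed to be pairs (relation name, tuple of variables), and the closure is computed with a union-find structure, which is one common way to obtain the classes of a reflexive, symmetric and transitive closure.

```python
def has_clashing_atoms(positive, negated):
    """True if some negated atom is identical to a positive atom,
    which is the sufficient condition for unsatisfiability in Proposition 1."""
    positive_set = set(positive)
    return any(atom in positive_set for atom in negated)

def equivalence_classes(elements, pairs):
    """Equivalence classes on `elements` induced by the reflexive,
    symmetric and transitive closure of the equalities in `pairs`."""
    parent = {e: e for e in elements}

    def find(e):
        # Follow parent links to the representative, halving the path.
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra  # merge the two classes

    classes = {}
    for e in elements:
        classes.setdefault(find(e), set()).add(e)
    return sorted(sorted(c) for c in classes.values())

# R(x, y) together with "not R(x, y)" can never be satisfied.
print(has_clashing_atoms([("R", ("x", "y"))], [("R", ("x", "y"))]))  # True

# C = {x1, x2, c1, c2}, M = {(x1, c1), (c1, c2)}
C = ["x1", "x2", "c1", "c2"]
M = [("x1", "c1"), ("c1", "c2")]
print(equivalence_classes(C, M))  # [['c1', 'c2', 'x1'], ['x2']]
```

In the second example the pair (x1, c2) is never listed in M, but transitivity places x1, c1 and c2 in the same class, while x2 stays alone by reflexivity.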
