Preference Propagation in Database Applications

(extended version including all the proofs)

Paolo Ciaccia # and Riccardo Torlone ∗

# Dip. di Elettronica, Informatica e Sistemistica, Università di Bologna, Italy [email protected]

∗ Dip. di Informatica e Automazione, Università Roma Tre, Italy [email protected]

Abstract—User preferences are a fundamental ingredient in the deployment of personalized database applications, in particular those in which context plays a key role. Given a set of preferences defined in different contexts, in this paper we address the problem of inferring which preferences should be used for answering queries in a given context. For the sake of generality, we work with an abstract context model and two uninterpreted algebraic operators for combining preferences. In this framework, we study how preferences should be combined according to a set of basic propagation principles, among which is the one stating that if a conflict arises, then the more specific context prevails. We investigate the general properties of the operators, define canonical ways to build expressions respecting the propagation principles, and identify syntactical conditions that guarantee the equivalence of all the expressions that are well-formed: these results hold for any interpretation of the operators. Then we consider a specific interpretation, which corresponds to the well-known Pareto and Prioritized composition rules. We study three alternative semantics for this scenario and provide precise containment relationships.

I. INTRODUCTION

The information available in digital form is growing so fast that methods for automatically filtering the accessible data according to the real needs of the users have become a compelling need. In this framework, context awareness [7] and user preferences [11] have emerged as viable solutions to this problem. The former term refers to the ability to select data according to some features of the environment in which the system is used, such as the location, the time, and the device. The latter refers to the ability to evaluate the relevance of data for a given user on the basis of a set of preferences defined on such data. In this paper we consider both aspects together and study the problem of selecting the most relevant data in a situation in which preferences are defined in different contexts and queries are posed in one of them. The scenario we refer to is illustrated in the following example.

Example 1: Assume that we have fixed some contextual preferences for food such as “In Italy, a dish of pasta is preferable to one of beef, but if you are in Naples you should taste the world-famous pizza instead of pasta. In summer, however, a fresh salad is preferable to a hot dish of pasta”. Assume now that it is summer, we are in Naples and we would like to have some suggestion for food. Actually, all of the preferences above should be taken into account since they refer to contexts that are more general than the current one. However, it is evident that the preferences defined in “Naples” and “Italy in summer” should take precedence over those in the more generic context “Italy”. Moreover, a preference in “Naples” should not take precedence over a preference in “Italy in summer”, and vice versa, since, in general, one does not apply to the other. It turns out that, in the current context, pizza and salad are both the best alternatives among the mentioned foods and should be returned first by an information filtering system since, on the basis of the combination of preferences discussed above, no other food is preferable to them. □

As shown in this example, a generalization hierarchy can usually be defined over contexts, and preferences defined in different contexts propagate along this hierarchy. Thus the problem can be rephrased as the investigation of preference propagation in a hierarchy of contexts and its impact on database querying.

Recently, this issue has been studied extensively [1], [19], [22], [24], [25], [26], [28] but always resorting to a rather pragmatic, operational approach. Conversely, in this paper we tackle the problem in a principled way with the goal of providing a solid basis for the issue of context-aware preferences for database querying and identifying some results of general validity.

With this goal in mind, we consider a very general notion of context that only requires that the contexts form a poset, that is, a set with a partial order relationship on its elements. This allows us to investigate the problem independently of the specific model used. Indeed, we introduce a formalism (the CT model) for the representation of contexts to provide concrete examples and to show that certain finiteness conditions are usually satisfied by actual context models.
However, the various results refer to the general notion of context poset.

As a first step towards the foundation of a theory of contextual preferences, we fix three basic principles on the propagation of preferences, which are also implicitly at the basis of earlier approaches to the problem [22], [24], [26] and correspond to the simple observations made in Example 1: the soundness principle states that the preferences over a context c depend only on the preferences over every context that is more generic than c; the fairness principle states that the preferences over a pair of incomparable contexts c1 and c2 do not take precedence over each other; finally, the specificity principle states that the preferences over a context c1 take precedence over the preferences over a context c2 when c1 is more specific than c2.

We then consider a very general scenario in which the combination of preferences in a given context is expressed by means of an algebraic formula, which we call PC (Preference Composition) expression, involving two binary operators: (i) ⊕, which combines preferences defined in two unordered contexts without giving precedence to either of them, in accordance with the fairness principle, and (ii) ⊛, which combines preferences in two ordered contexts by giving precedence to the more specific one, in accordance with the specificity principle.
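The three principles can be read as simple tests on the context poset. The following is a minimal sketch, assuming the four contexts of Example 1 and a hand-written generalization relation; the names `COVERS`, `leq`, `relevant_contexts` and `operator_for` are hypothetical and are not taken from the paper. It also assumes that a context contributes its own preferences alongside those of the more generic contexts.

```python
# Hypothetical encoding of the contexts of Example 1: each pair
# (specific, generic) says the first context directly specializes the second.
COVERS = {
    ("Naples,summer", "Naples"),
    ("Naples,summer", "Italy,summer"),
    ("Naples", "Italy"),
    ("Italy,summer", "Italy"),
}
CONTEXTS = {c for pair in COVERS for c in pair}


def leq(c1, c2):
    """True when c1 equals c2 or is more specific than c2.

    Computes the reflexive-transitive closure of COVERS by recursion;
    this terminates because the hierarchy above is acyclic.
    """
    if c1 == c2:
        return True
    return any(a == c1 and leq(b, c2) for a, b in COVERS)


def relevant_contexts(c):
    """Soundness: only c and the contexts more generic than c contribute."""
    return {d for d in CONTEXTS if leq(c, d)}


def operator_for(c1, c2):
    """Pick the operator for two distinct contexts.

    Comparable contexts are combined with ⊛ (specificity),
    incomparable ones with ⊕ (fairness).
    """
    if leq(c1, c2) or leq(c2, c1):
        return "⊛"
    return "⊕"


print(sorted(relevant_contexts("Naples")))
print(operator_for("Naples", "Italy,summer"))
print(operator_for("Naples", "Italy"))
```

In this sketch "Naples" and "Italy in summer" are incomparable, so they would be combined with ⊕, while each of them is comparable with "Italy" and would be combined with it through ⊛.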

Example 2: An example of PC-expression computing the combination of preferences discussed in Example 1 for the context “Naples in summer” is the following:

≻Naples,summer ⊛ ((≻Naples ⊕ ≻Italy,summer) ⊛ ≻Italy)

where ≻c denotes a set of preferences defined in the context c and, e.g., ≻Italy = {pasta ≻ beef}. In this expression the preferences in “Naples” and those in “Italy in summer” are combined with the ⊕ operator, since the corresponding contexts are unordered. The result is combined with the preferences in “Italy” using the ⊛ operator, since this context is more generic than both “Naples” and “Italy in summer”. Finally, the result is combined with the preferences defined in “Naples in summer” using the ⊛ operator, since this is the most specific context. □

We characterize a PC-expression in terms of the basic principles by introducing the syntactical property of SFS (Soundness, Fairness, and Specificity) for PC-expressions and show two general results, which are independent of the specific semantics of ⊕ and ⊛: (1) SFS expressions can always be expressed in a canonical form, provided that the context poset satisfies some finiteness conditions, which are fulfilled in practical cases, and (2) under the same hypothesis, any SFS expression is equivalent to the canonical form, provided that certain algebraic properties are satisfied by the composition operators (namely, ⊕ and ⊛ must form an idempotent semiring). We also show that, under some restriction on the context poset, another, more compact, normal form can be defined for PC-expressions.

We then consider a specific semantics for the ⊛ and ⊕ operators, thus providing a concrete way to evaluate PC-expressions and propagating preferences in a context poset. Specifically, we focus on two well-known methods for combining preferences: Prioritized and Pareto composition [11], [16]. Since these operators satisfy the required algebraic properties only to a limited extent (specifically, they form a near-ring), under this interpretation not all the SFS PC-expressions are equivalent. We then identify a “reference” (canonical) form of Prioritized-Pareto PC-expression and investigate some counter-intuitive behaviors of its application, which leads us to introduce two alternative semantics.

In order to show the generality and flexibility of our approach, we also investigate further, alternative interpretations of the ⊕ and ⊛ operators. In particular, we consider an interpretation based on the association of a score to each tuple according to its relevance in a given context.

We finally discuss a number of implementation issues showing that, given its algebraic nature, our framework can be implemented in a very natural way. In particular, we detail the basic procedures that are needed for propagating preferences and computing the best alternatives in a certain context, and characterize their complexity.

In sum, the contributions of this paper are the following:





• the identification of the basic principles underlying the combination of contextual preferences: these principles refer to a very general notion of context model and this guarantees the generality of the approach;

• the definition of an algebraic representation of preference propagation in terms of PC-expressions and a syntactic characterization of satisfaction of the general principles: the characterization is independent of the semantics of the operators of the algebra;

• the definition of a canonical form for PC-expressions and the identification of the algebraic properties that guarantee the equivalence of PC-expressions: this result does not depend on the specific interpretation of the operators of the algebra;

• the investigation of a natural interpretation of the algebraic operators that relies on Prioritized and Pareto composition and the definition of different semantics aimed at overcoming certain conceptual limitations: this provides a concrete implementation of contextual preference combination;

• the description of two other meaningful interpretations of the algebraic operators: this shows the generality and flexibility of our approach;

• the analysis of the fundamental issues related to the implementation of the framework we have proposed: we detail the main operations that need to be performed for evaluating preference queries and the complexity of their execution.
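To make the PC-expression of Example 2 concrete, the sketch below evaluates it on the foods of Example 1. It assumes the usual textbook readings of the two composition rules, with indifference taken as "neither tuple is preferred to the other": Pareto composition for ⊕ and Prioritized composition for ⊛, with the left operand taking priority. The paper's own formal definitions are given in later sections and may differ in detail; the function names, the preference sets for "Naples" and "Italy in summer", and the empty set for "Naples in summer" are assumptions read off Example 1.

```python
from itertools import product

FOODS = {"pizza", "pasta", "beef", "salad"}

# A preference relation is a set of (better, worse) pairs.
ITALY = {("pasta", "beef")}
NAPLES = {("pizza", "pasta")}          # assumed from Example 1
ITALY_SUMMER = {("salad", "pasta")}    # assumed from Example 1
NAPLES_SUMMER = set()                  # no preference is stated for this context


def indifferent(rel, a, b):
    """a and b are indifferent when neither is preferred to the other."""
    return (a, b) not in rel and (b, a) not in rel


def pareto(r1, r2, domain):
    """Assumed reading of ⊕: a beats b if it is at least as good in both
    relations and strictly better in at least one of them."""
    out = set()
    for a, b in product(domain, repeat=2):
        if a == b:
            continue
        weak1 = (a, b) in r1 or indifferent(r1, a, b)
        weak2 = (a, b) in r2 or indifferent(r2, a, b)
        if weak1 and weak2 and ((a, b) in r1 or (a, b) in r2):
            out.add((a, b))
    return out


def prioritized(r1, r2, domain):
    """Assumed reading of ⊛: r1 decides first; r2 is consulted only for
    pairs that r1 leaves indifferent."""
    out = set(r1)
    for a, b in product(domain, repeat=2):
        if a != b and indifferent(r1, a, b) and (a, b) in r2:
            out.add((a, b))
    return out


def best(rel, domain):
    """Tuples that no other tuple is preferred to."""
    return {t for t in domain if not any((u, t) in rel for u in domain)}


# ≻Naples,summer ⊛ ((≻Naples ⊕ ≻Italy,summer) ⊛ ≻Italy)
inner = prioritized(pareto(NAPLES, ITALY_SUMMER, FOODS), ITALY, FOODS)
combined = prioritized(NAPLES_SUMMER, inner, FOODS)

print(sorted(best(combined, FOODS)))  # → ['pizza', 'salad']
```

Under these assumptions the undominated foods are pizza and salad, in agreement with the discussion in Example 1. Note that the combined relation in this sketch relates pizza to pasta and pasta to beef but not pizza to beef, so it need not be transitive.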

The rest of the paper is organized as follows. In Section II we recall some notions on posets and preference relations. Section III introduces the general notion of context to which our results apply and provides as an example a specific context model. In Section IV we investigate the operators for combining preferences and, in Section V, we consider a specific interpretation of them. In order to demonstrate the flexibility of our approach, Section VI briefly details two alternative interpretations of the composition operators. In Section VII we discuss implementation issues and, in Section VIII, we compare our work with the literature. Finally, in Section IX, we draw some conclusions and sketch future work on this topic.

tuple domain of R, that is, over the set $\prod_{i=1}^{k} D_i$, where $D_i$ is the set of values associated with the attribute $A_i$ of R.

Definition 1: A preference relation ≻ over a scheme R(X) is a strict partial order on the tuple domain of R. If t1 ≻ t2 then t1 is preferable to t2. If neither t1 ≻ t2 nor t2 ≻ t1 holds, then t1 and t2 are indifferent, denoted t1 ∼ t2.
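Definition 1 can be checked mechanically on a finite set of tuples. The following is a minimal sketch, assuming a preference relation is stored as a set of (better, worse) pairs and using the standard characterization of a strict partial order as an irreflexive and transitive relation; the function names and the sample tuples are hypothetical.

```python
from itertools import product


def is_strict_partial_order(rel):
    """Check irreflexivity and transitivity of a set of (better, worse) pairs.

    Asymmetry follows from these two properties, so it is not tested
    separately.
    """
    irreflexive = all(a != b for a, b in rel)
    transitive = all(
        (a, d) in rel
        for (a, b), (c, d) in product(rel, repeat=2)
        if b == c
    )
    return irreflexive and transitive


def indifferent(rel, t1, t2):
    """t1 ∼ t2: neither tuple is preferred to the other."""
    return (t1, t2) not in rel and (t2, t1) not in rel


# Hypothetical tuples over a scheme with attributes (dish, price).
pizza, pasta, beef, salad = ("pizza", 8), ("pasta", 9), ("beef", 15), ("salad", 7)
prefs = {(pizza, pasta), (pasta, beef), (pizza, beef)}

print(is_strict_partial_order(prefs))                    # True
print(is_strict_partial_order(prefs - {(pizza, beef)}))  # False: not transitive
print(indifferent(prefs, pizza, salad))                  # True
```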

II. PRELIMINARIES

A. Partial orders

For what follows some basic notions on partial orders and posets are needed. A (weak) partial order ≤ on a domain V is a subset of V × V, whose elements are denoted by v1 ≤ v2, that is: reflexive (v ≤ v for all v ∈ V), antisymmetric (if v1 ≤ v2 and v2 ≤ v1 then v1 = v2), and transitive (if v1 ≤ v2 and v2 ≤ v3 then v1 ≤ v3) [9]. A strict partial order, denoted by