Modelling Data Warehouses and OLAP Applications by Means of Dialogue Objects

Jana Lewerenz (1, 3), Klaus-Dieter Schewe (2), Bernhard Thalheim (1)

(1) Brandenburg Technical University of Cottbus, Department of Computer Science
(2) Technical University of Clausthal, Computer Science Institute
(3) Supported by the German Research Society (DFG grant no. GRK 316)

Abstract. The idea of data warehouses is to provide condensed information in order to support managers in the analysis of business facts such as sales, costs, profits, etc. along various dimensions such as geography, organisation, time, etc. The analysis should allow fast switches between different selected multiple dimensions at different granularity. The task itself is usually called on-line analytical processing (OLAP). We show in this paper how to model data warehouses with OLAP functionality by means of dialogue objects. These are extended and possibly materialised views on collections of operative databases and couple structural and behavioural aspects of application units.

1 Introduction

Most information systems in practice have been designed and developed for the purpose of supporting operative tasks. Users invoke actions on selected data which lead to transactions on underlying databases. Thus, from a database point of view a central issue of such systems is on-line transaction processing (OLTP), although this term does not capture quite correctly the user's perspective [3]. Technically, OLTP is reflected in database design by putting emphasis on desirable properties such as non-redundancy and anomaly avoidance as realised by various normal forms.

For the design of management support systems that often involve on-line analytical processing (OLAP), these guidelines do not apply at all since the dominant aspect of managerial work is to ask for a condensed overview of business progress. Most access is read-only, involves large amounts of data and must be repeatable. User needs play a decisive role for information system design [14]. Nevertheless, OLAP applications can be considered as information systems requiring specialised functionality. This functionality can be provided by the combination of existing technologies.

1.1 Stars and Snowflakes

More precisely, a manager may ask for a fast overview of mostly numerical business facts such as sales, costs, profits, etc. over some period of time, e.g., a week, a month or a year. These facts should be organised along different multiple dimensions such as geography (country, state, city, etc.), organisation structure (headquarters, agency, department, etc.), product structure (individual, group, kind, etc.) and many others.

The idea of a data warehouse is to provide a separate database to store these data. Thus, its structure can be described by a very simple entity-relationship schema with entity types for each dimension and a single n-ary relationship type for the facts. Such a schema is usually called a star schema [4, 7, 10]. Besides various other arguments, the use of star schemata underlines the conceptual need for multi-arity relations [17]. Due to the dominance of read-only access there is no need to consider normalisation.

At a more detailed level a dimension can be fanned out into a hierarchy. For example, a product dimension may include category, brand, package size, etc. In terms of the higher-order entity-relationship model (HERM) [17] only the largest of these `new' dimensions will lead to an entity type; all others will be turned into (possibly unary) relationship types. Accordingly, the fact relationship type will now relate relationship types, i.e., it is a higher-order type. The resulting schema is usually called a snowflake schema [4, 10]. Besides other arguments, snowflakes underline the advantages of higher-order relationship types [17]. Note that some authors, e.g., Kimball [7], negate the necessity to model snowflakes.

The data in instances of a warehouse schema, be it a star or a snowflake, stem from different operative databases. We may regard the collection of involved databases as a non-integrated virtual multi-database containing homonyms, synonyms and redundant data possibly organised in several different ways. There is, however, a transformation which maps an instance of this multi-database to an instance of our warehouse schema so that the warehouse turns out to be a view (or superview) [9, 11] in the most general sense. Even if we assume a copy semantics, i.e., data in operative databases will be time-stamped and added to the existing data in the view, this will not cause a significant change.

The basic assumption of data warehouses being organised as separate databases then turns into the call for view materialisation. Whether this demand is generally justified or not is a matter of debate. It definitely depends on the time complexity of building the view and the frequency of access. Furthermore, note that the intended OLAP applications will almost always access small portions of the warehouse data that are themselves aggregated. This leads to views over a view which by compositionality are again views.

1.2 OLAP Functionality

The warehouse only provides the data to be processed. The major intention is, however, the support of managerial work by means of OLAP functionality [6, 12, 19]. At its heart, this means providing functions

- to present aggregated numerical data in various ways,
- to quickly switch between different presentations or to keep several presentations at a time,
- to quickly switch between different sections of data by means of changes in granularity (drill down or roll up), selected dimensions, selected facts, etc.,
- to process hypothetical changes, or
- to provide analytical models for the evaluation of data.

Of course, this list is not complete. It depends on the requirements of the management support system.

It is often claimed that database technology does not provide the functionality required by OLAP applications [7]. We will demonstrate below that the entity-relationship model provides the complete functionality and support necessary for OLAP applications. We observe that OLAP schemata can be regarded as external ER views with or without materialisation. OLAP schemata can be defined on the basis of QBE or SQL extensions for ER models such as HERM [18]. The presentation of data can be varied according to presentation rules of dialogue objects.

In fact, the difference between managerial and high-skilled clerical work is not as large as it appears in the literature [7, 10]. Both applications share the use of underlying external views as emphasised above as well as the user-driven invocation of actions on the view data. In both cases it is, therefore, desirable to couple views with operations on them. This is the core idea of dialogue objects, a notion coined in the context of cooperative information system development [13] that has been thoroughly formalised [15]. Dialogue objects provide flexibly manageable units for on-line processing.

In both clerical and managerial work fundamental decisions on selecting work steps are left to the user. The major difference is that in the first case actions usually lead to queries or transactions on some operative database, whereas in the latter case they almost always lead to queries on the warehouse view. A challenge in OLAP is that even parameterisation of these queries should be supported. This may also have an impact on operative dialogue-oriented systems.

In the remainder of this paper we shall present some more details to support our argumentation. Methodologically, this implies a change of focus in the design of management support systems providing OLAP functionality. The central question concerns the dialogue units required in managerial work. Structurally, this involves the definition of external views. Behaviourally, it provides the functions sketched above.

2 Modelling the Warehouse

We start with a more detailed discussion of structural aspects but keep the presentation rather informal. Throughout this section we use a simplified version of the grocery store example from Kimball [7] to illustrate our ideas. We shall first see how to model the application star schema using the entity-relationship model. We demonstrate that ER is well-suited for warehouse modelling, contradicting statements made in some warehouse publications [7]. Then we show how to obtain the star schema as a view in the case of a single operative database. The more realistic case of several operative databases appears as a simple generalisation. Finally, we handle the case of a snowflake schema using dimension hierarchies. Efficient algorithms for the translation of HERM views into bundles of relational views exist [18] and can then be used for a realisation of those views.

2.1 The Grocery Store Example

An OLAP application. Suppose we want to analyse the development of sales in a grocery chain. We might, e.g., want to know how many items of certain products have been sold in certain stores or regions over some period of time, what the corresponding sale was and how much profit we made. So the facts we are interested in are Quantity, Money sales and Profit. The dimensions are Time, Product, Customer and Shop.

Then the facts Quantity, Money sales and Profit appear as attributes of the relationship type Purchase. In addition, we may have attributes PID, Description, Category for the entity type Product, attributes SID, Town, Region, State, etc. for the entity type Shop, attributes TID, Weekday, Month, Quarter, etc. for the entity type Time and attributes Name, Address, Category, etc. for the entity type Customer. Note that a relational transformation usually will turn key attributes of the dimension types (xID) into foreign keys for a fact table. Figure 1 shows the corresponding (simplified) ER schema in HERM notation [18].

[Figure 1 diagram: relationship type Purchase with attributes Quantity, Money sales and Profit, connected to the entity types Time (TID, ...), Customer (CID, Name(First,Surname), Address(Addr1,Addr2,Addr3,Town,State,Zip), Category), Shop (SID, Name, Town, State, Region, Phone) and Product (PID, Description, Category).]

Fig. 1. Grocery Example: Star Schema on Purchases

The operational database. The data to be stored in instances of the warehouse schema mostly stem from operative databases. So assume that each purchase in a store will be registered together with the date and the number of items sold. In addition we keep a price relation also depending on the date. For the moment we leave aside any further decomposition for products and stores.

The date can be assumed to be composed of day, month and year. This is usually sufficient for operative purposes. In order to relate a date to the more complex data needed in the entity type Time in the star schema, we may assume a general pre-determined time table containing all the time attributes needed there. Of course, (day, month, year) will appear as a composed key.

Figure 2 shows an ER schema for such a simple sales database (omitting some attributes). Note that this schema is not yet normalised, but for the sake of simplicity we abstain from further decomposition.

[Figure 2 diagram: entity types Part, Person and Store, and relationship types Price and Buys. Attributes legible in the source: PID, Kind, Description, CID, Name(First,Surname), Category, Address(Addr1,Addr2,Addr3,Town,State,Zip), Date, Price, Cost, Quantity, Time.]

Fig. 2. Grocery Store: Operative Sales Schema

2.2 External Views

The central claim made in the introduction is that the star schema occurs as an external view on the operative sales schema (enriched by the assumed time table). In general, a view is nothing but a stored query, hence consisting of an input schema $S_{in}$, an output schema $S_{out}$ and a database transformation mapping instances of $S_{in}$ to instances of $S_{out}$. Additionally, representation, formation and wrapping rules [2] can be added to the view depending on the dialogue object.

In our example $S_{in}$ is the (enriched) sales schema (Figure 2) and $S_{out}$ is the star schema (Figure 1). Since ER schemata, even higher-order schemata, are basically hierarchical in the sense that relationship types are based on entity types, it is sufficient to employ sequences of SELECT-FROM-WHERE statements to define the mapping.

Different views can be defined for the application sketched above. The view displayed in Figure 1 is a star schema based on a relationship type. This star schema can be obtained via the following view definition:

    create view StarSchema as
      Product  : select PID : p.PID , Category : p.Kind , Description : p.Description
                 from Part p ,
      Time     : select Time
                 from Buys
      Customer : select Name : c.Name, CID : c.CID, Address : c.Address, Category : c.Category
                 from Person c
                 where ... Category ...
      Shop     : select ...
                 from Store s
      Purchase : select Customer(CID) : ..., Product(PID) : ..., Time(TID) : ..., ..., Shop(SID) : ...,
                        Money sales : ..., Quantity : ..., Profit : ...
                 from Buys(Part, Person, Store), Price pr
                 where ... ;
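As a rough relational illustration of the view definition above, the following sketch builds a cut-down operative schema in SQLite and derives part of the star schema from it as plain SQL views. It is not the ER-SQL of [18]. The table and column names in lower case (`part`, `store`, `price`, `buys`, `buy_date`, `price_date`) are hypothetical, the Customer and Time dimensions are left out for brevity, and the formulas for Money sales and Profit are assumptions, since the view definition elides them.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create a tiny operative database plus star-schema views on top of it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    -- Operative sales schema (loosely following Figure 2, heavily simplified).
    CREATE TABLE part  (pid INTEGER PRIMARY KEY, kind TEXT, description TEXT);
    CREATE TABLE store (sid INTEGER PRIMARY KEY, town TEXT, region TEXT);
    CREATE TABLE price (pid INTEGER, price_date TEXT, price REAL, cost REAL);
    CREATE TABLE buys  (pid INTEGER, cid INTEGER, sid INTEGER,
                        buy_date TEXT, quantity INTEGER);

    -- Dimension views of the star schema.
    CREATE VIEW product AS
      SELECT pid, kind AS category, description FROM part;
    CREATE VIEW shop AS
      SELECT sid, town, region FROM store;

    -- Fact view. ASSUMPTION: money sales = quantity * price and
    -- profit = quantity * (price - cost); the paper leaves these open.
    CREATE VIEW purchase AS
      SELECT b.cid, b.pid, b.sid, b.buy_date AS tid,
             b.quantity,
             b.quantity * pr.price             AS money_sales,
             b.quantity * (pr.price - pr.cost) AS profit
      FROM buys b
      JOIN price pr ON pr.pid = b.pid AND pr.price_date = b.buy_date;
    """)
    conn.execute("INSERT INTO part VALUES (1, 'dairy', 'milk 1l')")
    conn.executemany("INSERT INTO store VALUES (?, ?, ?)",
                     [(10, "Cottbus", "East"), (11, "Clausthal", "West")])
    conn.execute("INSERT INTO price VALUES (1, '1998-01-05', 2.0, 1.5)")
    conn.executemany("INSERT INTO buys VALUES (?, ?, ?, ?, ?)",
                     [(1, 100, 10, "1998-01-05", 3),
                      (1, 101, 11, "1998-01-05", 5),
                      (1, 100, 10, "1998-01-05", 1)])
    return conn


def facts_per_region(conn: sqlite3.Connection) -> list:
    """A view over the warehouse view: facts aggregated per region."""
    return conn.execute("""
        SELECT s.region, SUM(p.quantity), SUM(p.money_sales), SUM(p.profit)
        FROM purchase p JOIN shop s ON s.sid = p.sid
        GROUP BY s.region ORDER BY s.region
    """).fetchall()


conn = build_demo()
for row in facts_per_region(conn):
    print(row)
```

Because `purchase` is an ordinary (non-materialised) view here, the aggregate query is itself a view over a view on the operative tables, which is the compositionality argument made in Section 1.1. Whether to materialise it is a separate decision.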

In the same fashion the snowflake schema displayed in Figure 3 (partially without attributes) can be generated on the schema discussed in Section 2.1.

[Figure 3 diagram: central relationship type Purchase over Customer, Time, Product and Shop, with the further types Customer Category, Selling Period, Promotion Period, Production, Category and Region attached through the relationship types Of, In, During, POfP, PInC and SInReg.]

Fig. 3. Grocery Store: Snowflake Schema on Purchases

2.3 Dimension Hierarchies

In many cases it is desirable to consider more complex warehouse schemata than the flat star schema. This situation usually occurs if dimensions are organised hierarchically. E.g., products can be grouped into Brand and shops into Region. Using the higher-order ER model, Brand and Region will now occur as entity types and the original types will be turned into unary relationship types. For the case of our grocery store example, this situation is illustrated by the snowflake schema in Figure 4, omitting all attributes.

[Figure 4 diagram: relationship type Purchase over Time and the unary relationship types Product and Shop, which are based on the entity types Brand and Region respectively.]

Fig. 4. Grocery Store: Snowflake Schema with Dimension Hierarchies

In a number of OLAP applications similar hierarchies for the presentation of time, products, people, organisations, addresses, etc. are used. We can define ER schemata for hierarchies. The corresponding databases may be instantiated automatically. The schemata can be used for the generation of hierarchic view sets. OLAP schemata are then views defined on the basis of the application ER schema and on ER schemata for hierarchies. In this case, the snowflake schema displayed in Figure 3 is generated on the basis of the application discussed in Section 2.1 and on view types which are defined on hierarchy schemata for time, addresses, customer categories and product categories.

Time is modelled in OLAP applications on the basis of universal relation assumptions. The representation of time defined for the higher-order ER model uses several relationship types. The OLAP representation in Figure 5 is based on the universal relation approach. The OLAP type has 25 attributes. Additionally, an identifier attribute may be introduced. A large number of integrity constraints must be considered.

The time dimension is necessary in OLAP applications because it facilitates slicing of data by workdays, holidays, fiscal periods, seasons and by major events. The time dimension is guaranteed to be present in every warehouse application, because virtually every data warehouse depends on a time series. Time is usually the first dimension for sorting.

The representation of time (where some information can be derived but is still explicitly contained) shows that OLAP schemata are redundant. Denormalisation is common in OLAP schemata. Understanding OLAP schemata as views with redundant representation and denormalisation does not create any problem.

[Figure 5 diagram. HERM part: types Year, Quarter, Month, Week, Day, Weekday and Holiday with attributes such as No, Name and Kind, a type WeekInM, and cardinality constraints (3,3), (4,4), (13,14), (28,31), (7,7), (1,1) and (0,1). OLAP part: a single type Time with attributes including WeekInQuarter, Week, DayInQuarter, DayInYear, Weekday, MonthDay, Day, Holiday, FinancialYear, FinancialMonth, FinancialWeekOfYear, FinancialWeekOfQuarter, FinancialWeek, FinancialDay, FinancialDayOfYear, ...]

Fig. 5. Extended HERM Schema and OLAP Representation of Time

Based on the roll-up and drill-down functions discussed below it is possible to display data in different granularity. This approach can be simulated by families of views. The family is generated in dependence of a given generalisation hierarchy. For instance, given the hierarchies Product ⪯ Brand ⪯ Manufacturer and Shop ⪯ Area ⪯ Region, we can create a query on shops and products. This query can be contracted according to the hierarchies to queries on brands, regions, etc.

With regard to being defined as views on operative database schemata, snowflake schemata do not add additional complexity since they simply correspond to some kind of normalisation along functional or multi-valued dependencies. The same normalisation can be expected in the underlying operative databases.
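The remark that hierarchy databases may be instantiated automatically can be made concrete for the time dimension. The sketch below generates a small, deliberately redundant Time table from the calendar alone. It covers only a handful of the attributes named in Figure 5, the dictionary layout and the function name `time_dimension` are hypothetical, and the use of ISO week numbers for Week is an assumption, since the source does not define the week numbering.

```python
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def time_dimension(start: date, end: date) -> list[dict]:
    """Return one row per calendar day between start and end (inclusive)."""
    rows = []
    day = start
    while day <= end:
        rows.append({
            "TID": day.isoformat(),                   # identifier attribute
            "Day": day.day,
            "Weekday": WEEKDAYS[day.weekday()],
            "Week": day.isocalendar()[1],             # ASSUMPTION: ISO weeks
            "Month": day.month,
            "Quarter": (day.month - 1) // 3 + 1,      # derivable, stored anyway
            "Year": day.year,
            "DayInYear": day.timetuple().tm_yday,     # derivable, stored anyway
        })
        day += timedelta(days=1)
    return rows


rows_1998 = time_dimension(date(1998, 1, 1), date(1998, 12, 31))
print(len(rows_1998), rows_1998[0])
```

Quarter and DayInYear are derivable from the date but are stored explicitly, which is the kind of redundancy the universal-relation representation of time accepts in exchange for simple slicing.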

3 Modelling OLAP Applications

Now we turn to the problem of how to realise OLAP functionality in management support systems. Since OLAP applications constitute particular dialogue-oriented information systems we should preserve the technology developed for such information systems [2, 13-15] as much as possible.

We start by looking at the presentation of facts in sections of the data warehouse, followed by an investigation of typical actions involved in such presentations. This will lead us to a short discussion of dialogue objects and their suitability for OLAP. Finally, we discuss the tuning of presentation granularity which will constitute further operations.

3.1 Presentation of Facts

The typical OLAP scenario assumes a manager querying the warehouse. Since familiarity with sophisticated query languages cannot be presumed, such a query should be expressible in terms understandable to the user, i.e.:

- The manager selects the facts (s)he wants to consider. E.g., in our grocery store example (s)he may choose to look only at money sales, perhaps with a percentage-based comparison to an earlier period.
- The manager selects the dimensions and their granularity along which the selected facts should be presented. In our grocery store example this can be time, e.g., on the basis of last year's quarters, individual products and regions.
- In addition to the dimensions a specification of their relevant descriptive attributes is needed. E.g., it may be sufficient to get the product name and the abbreviations for the region and quarter.
- The manager states additional selection criteria to restrict the data to be presented. E.g., only stores that belong to a certain category (cheap or expensive equipment, etc.) or products offered in the most recent promotion campaign should be considered.
- Criteria for the grouping and ordering of the facts and dimension attributes should be given.
- Finally, the presentation of the result should be specified. E.g., tabular or graphical presentations such as scroll lists, bar charts or pie charts should be available.

The data to be presented constitute a view on the warehouse. Since we argued that the warehouse itself can be realised as a view on the operative multi-database, each data selection in an OLAP application constitutes also a view on the operative databases. Whether the intermediate warehouse level and maybe also other views are materialised or not depends on technical aspects.

The manager need not be aware of any connection to the underlying operative databases. (S)he may not even be aware of snowflake schemata if these have been designed, since the star schema is also a view on the snowflake schema. Finally, whether the star schema is presented to managers directly by the entity-relationship schema or simply by the list of available facts, dimensions, larger dimensions and dimension attributes, can be left to the manager's preferences.

Since the queries underlying the views are conjunctive, a simple QBE-like entry form will be sufficient. Thus, it is recommended to couple the view with actions to select facts, dimensions, dimension attributes, to add restricting conditions and to choose presentation preferences. We may therefore consider the presented data together with the actions on them as describing an object, namely a dialogue object (d-object).

More formally, a dialogue object (d-object) consists of a unique abstract identifier, a set of values $v_1, \dots, v_n$ in associated fields $F_1, \dots, F_n$ which correspond to describing values of d-objects, a set of references to other d-objects in order to allow quick navigational access, a set of actions to change the data and to control the dialogue, and a state with the possible values `active' and `inactive'.

This means that d-objects only exist as long as they are visible on the screen. If a window is closed the corresponding d-object is deleted. The identifier serves the purpose of administrating d-objects. It is not known to the user, cannot be used by the user and is not visible. Only the active d-object allows manipulations of the represented data and only its actions can be invoked.

Users invoke actions to change selection criteria, to navigate to another (possibly new) dialogue object or to a modified presentation of the same dialogue object. This kind of system usage constitutes the basic object-action principle in dialogue systems. Users enter or select values on the screen and invoke actions on them, usually grouped in menus. The dialogue system reacts by offering other data or by activating and deactivating entries in selection lists or possible actions in the action bar.

In graphical user interfaces data are normally presented in a window. Once we know that the data in a d-object may be used in the next dialogue step, it may be helpful to provide also hidden data. Their presence may quicken the access to the database or the navigation to other d-objects. Depending on selections or entries made in a d-object only some of the possible actions may be allowed. The processing of an action may require further preconditions depending on the state of the dialogue system, especially on other users' d-objects.

Note that the selection of grouping, ordering and screen presentations does not affect the dialogue object itself. Thus, presentational aspects can be treated separately from the core of data processing.
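The components of a d-object listed above map quite directly onto a small record type. The following is a minimal sketch under that reading, not the formalisation of [15]: the class name `DObject`, its attribute names and the example action `add_quantity` are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Any, Callable

_next_id = count(1)  # source of abstract identifiers, never shown to the user


@dataclass
class DObject:
    values: dict[str, Any]                                   # field F_i -> value v_i
    references: list["DObject"] = field(default_factory=list)
    actions: dict[str, Callable[["DObject"], None]] = field(default_factory=dict)
    state: str = "inactive"                                  # 'active' or 'inactive'
    _oid: int = field(default_factory=lambda: next(_next_id),
                      init=False, repr=False)

    def invoke(self, name: str) -> None:
        """Run an action; only the active d-object accepts invocations."""
        if self.state != "active":
            raise RuntimeError("only the active d-object accepts actions")
        self.actions[name](self)


def add_quantity(obj: DObject) -> None:
    """Hypothetical action: extend the fact list by Quantity."""
    obj.values["facts"] = obj.values["facts"] + ["Quantity"]


sales = DObject(
    values={"facts": ["Money sales"], "dimensions": ["Region", "Quarter"]},
    actions={"add_quantity": add_quantity},
)
```

Keeping the identifier out of the constructor and out of `repr` mirrors the statement that it serves administration only and is not visible to the user. Presentation is deliberately absent from the class, in line with the separation argued for above.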

3.2 Dialogue Types

Conceptually, we are not interested in individual d-objects but want to classify them into types. We shall, therefore, talk about dialogue types (d-types). Dialogue types unify structural, behavioural and presentational aspects of an application by combining view definitions, action specifications and rules for presenting a dialogue object's contents to the user [16].

Structural aspects. At the heart of such a d-type we provide a view consisting of a (higher-order) ER schema and a defining query. The schema is the output schema $S_{out}$ mentioned in the previous section; the defining query is the sequence of query language statements which create instances of $S_{out}$. The input schema $S_{in}$ can be omitted. It is generally assumed that the warehouse schema (star or snowflake schema) is taken for this purpose. The view definition may be parameterised. Parameters are either specified as defaults in the d-type definition or can be modified by the user during interaction.

In addition, as indicated above, we may choose a subset of the query result as the data to be actually presented while keeping the rest for fast support of operation. This constitutes the visual schema $VS_{out}$ as a subschema of $S_{out}$. The view definition of the visual schema may again be parameterised. Visibility functions determine the respective parameters on the basis of default definitions, actions invoked by the user and/or explicit user specifications.

Building upon the grocery store warehouse from Figure 1 the d-type may, e.g., involve money sales per shop and region, but only the money sales per region will be presented to the user. Then the output schema $S_{out}$ could be as presented in Figure 6 and the visual schema $VS_{out}$ would be the subschema in which the relationship type Sales per Shop and the entity type Shop have been omitted (cf. Figure 6). We abstain from a detailed description of the defining query which would use the star schema in Figure 1 as the input schema $S_{in}$.

[Figure 6 diagram: entity types Region, Shop (SID, Town, Region) and Time (Year, Quarter); relationship types Sales per Region and Sales per Shop, carrying the attribute Money sales.]

Fig. 6. Grocery Store: Schema underlying a D-Type

Behavioural aspects. Attached to the view we provide a collection of operations such as those discussed in Section 3.3. Such actions allow the user to navigate within the view (e.g., to formerly hidden parts), to navigate between views and to switch between different presentations of data. In addition to typical OLAP actions as described below, help functions can be made available, too.

Similar to the structural part that contains both visible and hidden data, dialogue actions are not necessarily accessible at all times. Availability of actions is controlled by action access functions which determine availability on the basis of the user's access rights, etc. For instance, based on the example in Figure 6, adding the fact Money sales from the relationship type Sales per Shop might only be possible for managers on a high enough level of the enterprise's hierarchy. Similar restrictions could exist for drill-down operations in certain dimensions. To guarantee stable performance of the application, exceptions are defined for the case that the use of a dialogue object causes problems.

Presentational aspects. Both data and actions, as long as they are not hidden, must be made available to the user. Presentation rules are generally defined globally for the overall application, independently from individual dialogue types. Such rules determine, e.g., graphical widgets to be used for visualisation (buttons or menus for actions, text fields for textual data, etc.). Furthermore, they can control the layout of visualisations based, e.g., on the space available on the screen and on the information to be conveyed.

D-types will, however, provide parameters for such presentation rules. Parameters are, e.g., labels that are to be used for visualising dialogue objects (title bar) and individual data or actions. Other parameters are diverse semantical properties of data such as emphasis, priority, adhesion, etc. [8]. If statistical anomalies are discovered during the execution of a dialogue object (e.g., a shop whose sales significantly exceed the average), local presentation rules can, e.g., characterise such anomalies as important. Global presentation rules will then use the parameter important to choose red colour, flashing font, etc.

Conceptually, presentation design of d-types is therefore concerned with a description of the data and actions involved and/or with the specification of rules that automatically compute such descriptions, rather than with an explicit assignment of presentations. We specify labels, semantical properties, adhesions between data items, rules, etc. but no concrete widgets, colours or fonts.

At the implementational level, presentation rules use these descriptions to create physical presentations of dialogue objects. There can be several presentation rules available for one concept which can, e.g., create a graphical or a forms-based presentation for the same d-type. Presentation rules are selected at system run-time depending on the user's preferences or the actions invoked.
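The division of labour between local and global presentation rules can be illustrated with the anomaly example from the text. In this sketch the local rule only attaches the semantic property `important`, and a separate global rule decides what that property looks like. The function names, the threshold factor of 1.5 and the style dictionary are assumptions made for illustration; the source does not say how "significantly exceed the average" is measured.

```python
from statistics import mean


def local_rule(sales_per_shop: dict[str, float],
               factor: float = 1.5) -> dict[str, set[str]]:
    """Local (d-type) rule: describe the data, do not style it.

    ASSUMPTION: a shop is an anomaly if its sales exceed factor * average.
    """
    average = mean(sales_per_shop.values())
    return {shop: ({"important"} if value > factor * average else set())
            for shop, value in sales_per_shop.items()}


# Global (application-wide) mapping from semantic properties to styling.
GLOBAL_STYLE = {"important": {"colour": "red"}}


def global_rule(properties: set[str]) -> dict[str, str]:
    """Global rule: turn semantic properties into a concrete presentation."""
    style = {"colour": "black"}
    for prop in sorted(properties):
        style.update(GLOBAL_STYLE.get(prop, {}))
    return style


sales = {"Shop A": 100.0, "Shop B": 110.0, "Shop C": 400.0}
properties = local_rule(sales)
for shop, props in properties.items():
    print(shop, global_rule(props))
```

Swapping `GLOBAL_STYLE` for another mapping changes every presentation at once without touching any d-type, which is the point of keeping concrete colours and fonts out of the d-type description.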

3.3 OLAP Actions

In OLAP applications, roll-up operations and drill-down operations are used for generalisation and specialisation of fact tables. Aggregation of detailed data to create summary data is called roll-up. Usually, roll-up is based on two operations: grouping of data according to some characteristics (e.g., total sales by city and product) and navigation through an attribute hierarchy (e.g., from sales by city towards sales by state and sales by country). Navigation to detailed data from summaries is called drill-down. It provides the data set that was aggregated (e.g., displaying the `base' data for the total sales figure for the state CA). Selection, which is called slicing in OLAP applications, defines a subdatabase (e.g., sales where city = "Berkeley", or reducing the relationship dimensions by specifying coordinates of remaining dimensions).

The last observation shows that minimum and maximum functions yield the same result after application of roll-up operations or drill-down operations. The same property is valid for pull, pivoting and push operations [1]. These functions can be generalised to reordering or rearrangement functions.

OLAP operations can be completely based on operations defined for ER models [18].

{ Classical database operations are also applicable to views. For this reason, the following functions are de ned for views:  Selection on relationship types is the ER expression for dice .  The slice operation is expressed by projection on relationship types.  The set-theoretic operations union, intersection, cartesian product and di erence and component renaming are elements of the ER algebra.

{ Calculations within one component type or across component types are ex-

pressed by algebraic functions. Ranking functions are based on the computation of sets, ordering and creating supporting views. Extensions such as tertile, quartiles, ratio to report, cume, moving average, moving sum are expressible in ER-SQL [18]. { Visualisation functions do not produce another view. They are used for reordering the schema. Functions such as nest and rotate can be represented by ER operations. Ranking functions are expressed by order by constructs. Dynamic breakpoints are used to start a new view for computation. They require the utilisation of dynamic SQL. Extending ER-SQL to dynamic ERSQL can be performed in the fashion known for SQL. { Aggregation can be expressed by view de nition in database languages.  The roll-up function (also called drill-up) is used for dimension reduction on the basis of aggregation functions. Navigation through an attribute hierarchy can be expressed by escorting queries. The complex cube operation is a set of roll-up operations and expressible in ER-SQL.  The drill-down function is just the inverse operation of roll-up.  The aggregation functions min, max, sum, avg, count can be expressed by summarisation. This function generalises the aggregation functions. { Schema restructuring has been generalised by Gyssens and Lakshmanan [5]. Classi cation, fold and unfold can also be generalised in the ER algebra. Unfold de nes a new view on a relationship type and a set of components of the relationship type by introducing a new type on the set and a new relationship type with the remaining components of the rst type and the set component. Unfold generalises the unnest operation. Fold restructures a schema to a new schema for a component of a choosen relationship type. Fold generalises the nesting operation. Classi cation is a speci c grouping operation. Schema restructuring operations are operations which can be expressed by graph grammar rules. 
- OLAP applications are also based on analytical functions which are defined on the basis of mathematical models. Since OLAP data can be understood as derived data which can be materialised, analytical functions can also be defined for databases.

From the user's point of view, typical actions invoked (besides choosing a completely new view in the sense presented above) are devoted to achieving finer or coarser tuning of the presented information:

- A user may choose to extend or restrict the fact list. In the example sketched in Section 3.2, the facts Quantity and Profit may be added. Note that the evaluation of the corresponding query may exploit data already presented as an intermediate result.
- A user may choose to add or remove a dimension. For example, a manager looking at the data selection indicated above may remove the product dimension and simply look at Money sales in different regions. Again, query evaluation is simplified by the existing data contents of the dialogue object.
- A user may add or remove dimension attributes. The required processing is analogous to changes in the fact list.
- The user may change dimension granularity. Such drill-down or roll-up operations switch to a finer or coarser level of a dimension. For example, the user may replace Region by Shop or Product by Brand, walking along the hierarchies in the snowflake schema.
- The selection criteria may be weakened or strengthened, leading to a larger or smaller result. Here again, the reuse of the existing query result may help to speed up query processing.
- The user may mark a section of the presented data and invoke an aggregate function on it, such as summation, average computation or normalisation to a given index of 100. In our grocery store example, the money sales per shop could be considered.
- Grouping, ordering and presentation style may be changed. As already stated above, this does not lead to a new query and only affects general presentational issues. It is possible to achieve this functionality by using different presentation rules.
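The roll-up and dimension-removal actions listed above can be illustrated by a small Python sketch. It is only a minimal illustration under stated assumptions, not the ER-SQL formulation of [18]: the sample rows, the shop-to-region mapping and all function names are hypothetical and merely borrow the vocabulary of the grocery store example (Shop, Region, Product, Money sales). The sketch shows a roll-up from Shop to Region and then derives the view without the product dimension from the data already held, rather than from the base facts.

```python
from collections import defaultdict

# Hypothetical fact rows at the finest granularity: (shop, product, money_sales).
SALES = [
    ("Shop A", "Milk", 120.0),
    ("Shop A", "Bread", 80.0),
    ("Shop B", "Milk", 50.0),
    ("Shop C", "Bread", 70.0),
]

# Hypothetical dimension hierarchy: each shop belongs to exactly one region.
SHOP_TO_REGION = {"Shop A": "North", "Shop B": "North", "Shop C": "South"}


def aggregate(rows, key_of):
    """Sum the last component of each row, grouped by the tuple key_of(row)."""
    totals = defaultdict(float)
    for row in rows:
        totals[key_of(row)] += row[-1]
    # Return rows in a deterministic order: (key components..., total).
    return [key + (total,) for key, total in sorted(totals.items())]


def roll_up_shop_to_region(rows):
    """Roll-up: replace Shop by Region and re-aggregate Money sales."""
    return aggregate(rows, lambda r: (SHOP_TO_REGION[r[0]], r[1]))


def remove_product_dimension(rows):
    """Dimension removal: keep only the first column and re-aggregate."""
    return aggregate(rows, lambda r: (r[0],))


# View currently presented to the user: Money sales per (region, product).
by_region_product = roll_up_shop_to_region(SALES)

# The coarser view is computed from the presented data, not from SALES again.
by_region = remove_product_dimension(by_region_product)

print(by_region_product)
print(by_region)
```

Reusing the presented result works here because summation is distributive: partial sums can be summed again. The same holds for min, max and count (where partial counts are added), whereas an average has to be carried as a pair of sum and count to be recomputed at a coarser level.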

4 Conclusion

In this paper we argued that a conceptual view of data warehouses and management support information systems with OLAP functionality can be achieved by exploiting the idea of dialogue objects which was coined in the context of dialogue information systems supporting high-skilled clerical work. Firstly, the warehouse schemata as well as the data presented to managers can be derived as views on the operative database(s). Secondly, the functions required in OLAP applications can be coupled with these views.

We remark that the idea of extended views has also been fruitfully expanded and applied in theory and practice to support the design of database-backed information services on the WWW [2]. The fact that users of information services are by no means better trained in computer usage than managers underlines the appropriateness of the suggested approach.

How to realise the underlying views to enable fast access is left as a technical question of which views or intermediate views should be materialised or not. Currently, it is assumed that the data warehouse itself is materialised as a first large view but views on the warehouse are not. We would like to emphasise that this is only one possible solution.

Similarly, from dialogue objects we inherit the methodological separation of data contents on the one side and its presentation on the other. Hence, our suggested approach allows a threefold split in the design of OLAP applications:

- a conceptual task oriented towards the design of dialogue structures, i.e., views, visibility of views, actions and availability of actions;
- a more technical task to define suitable view realisation strategies;
- a general design task to specify presentation rules for the creation of suitable screen representations for view data and actions.

Naturally, the individual design tasks can be performed by different designers who are experts in the respective domains, i.e., application design, database management and interface design.

References

1. R. Agrawal, A. Gupta, and S. Sarawagi. Modeling multidimensional database. In Proc. Data Engineering Conference, Birmingham, pages 232-243, 1997.
2. T. Feyer, K.-D. Schewe, and B. Thalheim. Conceptual design and development of information services. In T.W. Ling, S. Ram, and M.L. Lee, editors, Conceptual Modeling - ER '98, LNCS 1507, pages 7-20. Springer, 1998.
3. C. Floyd, F.-M. Reisin, and G. Schmidt. STEPS to software development with users. In C. Ghezzi and J.A. McDermid, editors, ESEC'89, LNCS 387, pages 48-64. Springer, Berlin, 1989.
4. P. Gluchowski. Data warehouse. Informatik Spektrum, 20(1):48-49, 1997.
5. M. Gyssens and L.V.S. Lakshmanan. A foundation for multidimensional databases. In Proc. 22nd VLDB Conference, Mumbai (Bombay), India, 1996.
6. B. Jahnke, H.-D. Groffmann, and S. Kruppa. On-line analytical processing (OLAP). Wirtschaftsinformatik, 38(3):321-324, 1996.
7. R. Kimball. The Data Warehouse Toolkit. John Wiley & Sons, 1996.
8. J. Lewerenz. On the use of natural language concepts for the conceptual modeling of interaction in information systems. In Proc. of NLDB'99, Klagenfurt, 1999.
9. A. Motro. Superviews: Virtual integration of multiple databases. IEEE ToSE, 13(7), 1987.
10. H. Mucksch, J. Holthuis, and M. Reiser. Das Data-Warehouse-Konzept - Ein Überblick. Wirtschaftsinformatik, 38(4):421-433, 1996.
11. S.B. Navathe, R. Elmasri, and J.A. Larson. Integrating user views in database design. IEEE Computer, 19(1):50-62, 1986.
12. N. Pendse. The OLAP Report. Available through http://www.olapreport.com/.
13. B. Schewe. Kooperative Softwareentwicklung - Ein objektorientierter Ansatz. Deutscher Universitätsverlag, Leverkusen, 1996.
14. B. Schewe and K.-D. Schewe. A user-centered method for the development of data-intensive dialogue systems - an object oriented approach. In E.D. Falkenberg, W. Hesse, and A. Olive, editors, Information System Concepts, pages 88-103. Chapman & Hall, 1995.
15. K.-D. Schewe and B. Schewe. View centered conceptual modelling - an object oriented approach. In B. Thalheim, editor, Conceptual Modeling - ER '96, LNCS 1157, pages 357-371. Springer, 1996.
16. K.-D. Schewe and B. Schewe. Integrating database and dialogue design. Knowledge and Information Systems, 1999. To appear.
17. B. Thalheim. Foundations of entity-relationship modeling. Annals of Mathematics and Artificial Intelligence, 7:197-256, 1993.
18. B. Thalheim. Fundamentals of entity-relationship modeling. Springer, Heidelberg, 1999.
19. E. Thomson. OLAP Solutions: Building Multidimensional Information Systems. John Wiley & Sons, New York, 1997.
