Temporal data modelling: Conventional and temporal table
Michal Kvet, Anton Lieskovský, Karol Matiaško
Department of Informatics, Faculty of Management Science and Informatics
University of Žilina
Žilina, Slovakia
{Michal.Kvet, Anton.Lieskovsky, Karol.Matiasko}@fri.uniza.sk
Abstract: Today it is necessary to keep not only current data, that is, data valid at this moment, but also historical data, with which the progress and frequency of changes can be monitored and future prognoses can be created. In contrast to conventional tables, temporal tables (mostly modelled as uni-temporal and bi-temporal tables) can process and retain information about the past. Using procedures and functions, a snapshot of the database at any point in the past can be reconstructed easily. Thus, each database object is represented not by a single row but by a set of rows with limited validity.
Conventional table; temporal table; valid time; temporal data modeling; snapshot of the database
I. INTRODUCTION
Many things are important to us, and data about them can be stored in databases. An object in a conventional (non-temporal) table is represented by one row holding its current state. A change of an attribute value causes an update of the row, and the old data are lost. However, each value has its own history, its progress over time, and it can be useful to store and monitor historical values. That is the reason for creating a new database concept. Historical data used to be regarded merely as backups and log files, and users tended to ignore them. If they had to use them, it meant that an error had occurred and the last functional state of the database (snapshot) had to be loaded.
The temporal database concept was first presented in the early 1980s. Research in this area, however, was rated as unsatisfactory because of its time consumption and the large volume of disc storage it required. The current situation in IT is different. Disc storage is cheap and offers large volumes, and data warehouses are widespread. The main reason, however, is the need to process historical data and to analyse them as a basis for optimization and cost reduction. Recorded history can also support important business decisions (e.g. products with a lot of spoilage will not be bought in the future, or a fired employee will not be hired again). [2]
II. EASE OF USE
A. Conventional table
The conventional database system is currently the most widely used type. It consists of tables storing only current data. Thus, each currently existing object is represented by one row in the table. The database does not reflect changes of attributes; it cannot be said when (or whether) the data will be changed or the object deleted. Consequently, once deletion has been requested, the user has no information that the object ever existed in the database. Monitoring archives or keeping a table of manipulation data can solve the problem only partially. Figure 1 shows the standard conventional table; "ID" represents the primary key, and the remaining columns are grouped into a common field named "data". [4] [6]
Figure 1. Conventional (non-temporal) database table model
B. Versioned table
The versioned table keeps the advantages of the standard approach but also eliminates the problems with the time validity of objects. It makes it possible not only to monitor changes but also to create future prognoses. In a conventional table, an object is represented by a single row throughout its existence, whereas a versioned table uses a varying number of rows for one object during its life cycle.
Of the two possible models of time (discrete and continuous), the discrete model is generally agreed to be suitable and adequate. The atomic clock tick (chronon) represents the smallest interval of time recognized by the database system, that is, the smallest time that must elapse between two states of an object (two physical modifications of the database). The clock tick can be defined at any level of granularity. A typical example of a clock tick is a timestamp. A clock tick of one day means that the database is updated from the collection of transactions once a day. The question now is when that day starts. When does 2-12-2012 start? At 2-12-2012-12:00:00.000 or 2-12-2012-12:00:00.001? Or even at 2-11-2012-11:59:59.999? The simplest solution is to let the DBMS determine the mapping for us. [2] [6]
Time in the database can be represented as a sequence of chronons, but also by a pair consisting of the starting and the terminating chronon. Each object must have only one representation of its attributes at any given time. The representation that lists all chronons is shown in the next figure: the object identified by ID=3 has been inserted at T4, and T5 causes a problem because of an incorrect (duplicate) state of the same object. [6]
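As an illustration of the two representations described above, the following Python sketch stores one row per chronon, detects an object that has more than one state at the same chronon, and collapses consecutive chronons into pairs of starting and terminating chronons. The tuple layout, the integer encoding of chronons (T1 as 1, T2 as 2, and so on), the data values and the function names are assumptions made for this example; they are not taken from the paper.

```python
from collections import Counter

# Hypothetical chronon-per-row representation: (object_id, chronon, data).
# Chronons are modelled as integers (T1 -> 1, T2 -> 2, ...); all values are invented.
rows = [
    (1, 1, "a"), (1, 2, "a"), (1, 3, "b"),
    (2, 2, "c"), (2, 3, "c"),
    (3, 4, "d"),
    (3, 5, "e"), (3, 5, "f"),  # two states of object 3 at chronon 5
]


def find_conflicts(rows):
    """Return the sorted (object_id, chronon) pairs that carry more than one state."""
    counts = Counter((object_id, chronon) for object_id, chronon, _ in rows)
    return sorted(key for key, n in counts.items() if n > 1)


def to_periods(rows):
    """Collapse consecutive chronons with equal data into
    (object_id, start_chronon, end_chronon, data), both bounds inclusive.
    Assumes the input is free of conflicts."""
    periods = []
    for object_id, chronon, data in sorted(rows):
        last = periods[-1] if periods else None
        if (last and last[0] == object_id and last[3] == data
                and last[2] + 1 == chronon):
            # Same object, same state, next chronon: extend the current period.
            periods[-1] = (object_id, last[1], chronon, data)
        else:
            periods.append((object_id, chronon, chronon, data))
    return periods


conflicts = find_conflicts(rows)
clean_rows = [r for r in rows if (r[0], r[1]) not in conflicts]
periods = to_periods(clean_rows)
```

The conflict check corresponds to the rule that an object may have only one representation at a time, while `to_periods` shows how the sequence-of-chronons form can be reduced to a start and end pair once the data are consistent.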
The biggest advantage of this approach is that it records history and covers reality. However, it also brings many problems, such as poor internal representation, larger databases and a large amount of non-current data. In the past, temporal databases took an extreme position: a value, once inserted, should never be deleted.
B. Closed-open representation
The closed-open representation is a convention for using a pair of clock ticks to designate a time period, in which the earlier clock tick (BD) is the first clock tick of the validity period and the later one (ED) is the first clock tick after the last clock tick of that period. [4] The query is defined as follows: SELECT data FROM table_name WHERE BD