ITS322 - Database Management Systems
Semester I/2008
Lecture 0
Schedule of ITS322 Database Management Systems
Dr. Thanaruk Theeramunkong
Sirindhorn International Institute of Technology
Thammasat University
T. Connolly and C. Begg, Database Systems: A Practical Approach to Design, Implementation, and Management, 4th edition, Addison-Wesley, 2004, ISBN: 0-321-21025-5.
R. Elmasri and S. B. Navathe, Fundamentals of Database Systems, 5th edition, Pearson, 2007, ISBN: 0-321-41506-X.
ITS322 - DBMSs
Lecture 1: Introduction to DBs and DB Env.
1
Textbooks
Database Systems: A Practical Approach to Design, Implementation, and Management. By T. Connolly and C. Begg, 4th edition, Addison-Wesley, 2004, ISBN: 0-321-21025-5.
Fundamentals of Database Systems. By R. Elmasri and S. B. Navathe, 5th edition, Pearson (Addison-Wesley), 2007, ISBN: 0-321-41506-X.
Office Hours and Grading
Lecture Time:
  Section 1: Thursday 9:00-10:20 (BKD2401), Friday 13:00-14:20 (BKD2401)
  Section 2: Thursday 10:40-12:00 (BKD2602), Friday 14:40-16:00 (BKD2602)
Office hours: Thursday (13:00-16:00) & Friday (9:00-12:00)
Grading:
  Quiz/Attendance (10%)
  Project (20%)
  Midterm Examination (30%) (31 July 2008, 9:00-12:00)
  Final Examination (40%) (9 October 2008, 9:00-12:00)
Course Outline
No. | Date | Topics | Chapters
1 | 12, 13 June | Introduction to DBMS & SQL Data Manipulation (I) | 1, 2, 13
2 | 19, 20 June | Introduction to DBMS & SQL Data Manipulation (II) | 1, 2, 14
3 | 26, 27 June | SQL Data Manipulation (III) | 14
4 | 3, 4 July | Data Definition & Relational Algebra | 3
5 | 10, 11 July | Entity-Relationship Modeling | 5
6 | 17 July | Enhanced Entity-Relationship Modeling | 5
7 | 24, 25 July | Normalization, Review and QA | 6
8 | 31 July | Midterm Examination | 1, 2, 3, 5, 6, 13, 14
9 | 7, 8 August | DB Planning, Design and Administration | 4
10 | 14, 15 August | SQL Programming: Stored Procedure, Trigger | Extra 24 (Elmasri)
11 | 21, 22 August | Project presentation | Group work
12 | 28, 29 August | Transaction Processing | 17
13 | 4, 5 September | Query Processing | 18
14 | 11, 12 September | Disk Size Estimation, Disk Storage and Indexing | 13, 14 (Elmasri)
15 | 18, 19 September | Project presentation | Group work
16 | 25, 26 September | Review and QA | 21
17 | 9 October | Final Examination | 4, 17, 18, 21
ITS322 - Database Management Systems
Semester I/2008
Lecture 1
Introduction to Database Systems
Dr. Thanaruk Theeramunkong
Sirindhorn International Institute of Technology
Thammasat University
R. Elmasri and S. B. Navathe, Fundamentals of Database Systems, 5th edition, Pearson, 2007, ISBN: 0-321-41506-X.
T. Connolly and C. Begg, Database Systems: A Practical Approach to Design, Implementation, and Management, 4th edition, Addison-Wesley, 2004, ISBN: 0-321-21025-5.
Objectives
Some common uses of database systems
The characteristics of file-based systems
The problems with the file-based approach
The benefits of the database approach
The meaning of the terms database, database system, and database management system (DBMS)
The typical functions of a DBMS
The advantages and disadvantages of DBMSs
Objectives
The major components of the DBMS environment
The personnel involved in the DBMS environment
Difference between data administration and database administration
Types of database systems
System Catalog and Information Resource Dictionary System (IRDS)
Purposes and the origin of the 3-level database architecture
Concepts and types of data models
Functions and components of a DBMS
Data Versus Information
Data constitute building blocks of information
Information produced by processing data
Information reveals meaning of data
Good, timely, relevant information key to decision making
Good decision making key to organizational survival
Where is Database?
The database (DB) is now such an integral part of our day-to-day life that often we are not aware we are using one.
  Ex: supermarket, credit card, travel agent, library, insurance, security systems, university.
First applications focused on clerical tasks
Requests for information quickly followed
File systems developed to address needs
  Data organized according to expected use
  Data Processing (DP) specialists computerized manual file systems
Types of Databases and DB Applications
Traditional Applications:
  Numeric and Textual Databases
More Recent Applications:
  Multimedia Databases
  Geographic Information Systems (GIS)
  Data Warehouses
  Real-time and Active Databases
  Many other applications
File-based Systems
The file-based system is the predecessor of the database system. -> Decentralized
A collection of application programs that perform services for the end users (e.g. reports).
Each program defines and manages its own data.
File-based systems were an early attempt to computerize the manual filing system.
The related topics: storage, security, indexing, cross-reference, processing
Simple File-based System
File-based Processing
File-based System Critique (I)
File-based System Data Management
  Requires extensive programming in third-generation language (3GL)
  Time consuming
  Makes ad hoc queries impossible
  Leads to islands of information
Data: Raw facts
Field: Group of characters with specific meaning
Record: Logically connected fields that describe a person, place, or thing
File: Collection of related records
File-based System Critique (II)
Data Dependence
  File structure is defined in the program code.
  Change in file's data characteristics requires modification of data access programs
  Must tell program what to do and how
  Makes file systems cumbersome from programming and data management views
Structural Dependence
  Change in file structure requires modification of related programs
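A minimal sketch of structural dependence, assuming a hypothetical fixed-width customer record. The layout strings, field sizes, and function names are illustrative only and are not taken from the lecture.

```python
import struct

# Hypothetical record layout hard-coded in the program:
# a 10-byte name followed by a 4-byte integer balance.
OLD_LAYOUT = "<10si"

def write_record(layout, *fields):
    """Pack one record using the layout the writing program assumes."""
    return struct.pack(layout, *fields)

def read_balance(raw):
    """A reading program that assumes OLD_LAYOUT."""
    _name, balance = struct.unpack(OLD_LAYOUT, raw)
    return balance

old_record = write_record(OLD_LAYOUT, b"Alice", 250)
old_balance = read_balance(old_record)

# Structural change: a 12-byte phone field is added between name and balance.
NEW_LAYOUT = "<10s12si"
new_record = write_record(NEW_LAYOUT, b"Alice", b"021234567", 250)

try:
    read_balance(new_record)
    reader_still_works = True
except struct.error:
    # The unchanged reader cannot parse a record of the new size.
    reader_still_works = False
```

Because the layout lives in the program code, every program that reads the file has to be modified when the file structure changes.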
File-based System Critique (III)
Field Definitions and Naming Conventions
  Flexible record definition anticipates reporting requirements
  Selection of proper field names important
  Attention to length of field names
  Use of unique record identifiers
Data Redundancy
  Different and conflicting versions of same data
  Results of uncontrolled data redundancy
    Data anomalies
    Data inconsistency
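As a rough illustration of how uncontrolled redundancy leads to inconsistency, the sketch below assumes two hypothetical departmental files that each keep their own copy of a customer's address. All names and values are made up for the example.

```python
# Two hypothetical departmental files, each keeping its own copy of the address.
sales_file = {"C001": {"name": "Somchai", "address": "12 Rangsit Rd"}}
billing_file = {"C001": {"name": "Somchai", "address": "12 Rangsit Rd"}}

# The sales program records a change of address;
# the billing program's copy is not updated.
sales_file["C001"]["address"] = "99 Phahonyothin Rd"

def find_conflicts(file_a, file_b, field):
    """Return the keys whose copies of `field` disagree between the two files."""
    return [key for key in file_a
            if key in file_b and file_a[key][field] != file_b[key][field]]

conflicts = find_conflicts(sales_file, billing_file, "address")
```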
File-based System Critique (IV)
Separation and isolation of data
  Each program maintains its own set of data.
  Users of one program may be unaware of potentially useful data held by other programs.
Incompatible file formats
  Programs are written in different languages, and so cannot easily access each other's files.
Fixed Queries/Proliferation of application programs
  Programs are written to satisfy particular functions.
  Any new requirement needs a new program.
Database Approach
Arose because:
  Definition of data was embedded in application programs, rather than being stored separately and independently.
  No control over access and manipulation of data beyond that imposed by application programs.
Result - the database and Database Management System (DBMS).
Database Management
Database is shared, integrated computer structure housing:
  End user data
  Metadata
Database Management System (DBMS)
  Manages database structure
  Controls access to data
  Contains query language
Database
A shared collection of logically related data (and a description of this data), designed to meet the information needs of an organization.
System catalog (data dictionary or metadata) provides the description of the data to enable program-data independence.
Logically related data comprises entities, attributes, and relationships of an organization's information.
Database Systems & DBMS
Database System
  A system that uses a database as its basic storage
  Provides the following advantages over file-based systems
    Eliminates inconsistency, data anomalies, data dependency, and structural dependency problems
    Stores data structures, relationships, and access paths
Database Management Systems (DBMS)
  A software system that enables users to define, create, and maintain the database and which provides controlled access to this database.
Simplified Database System Environment
Database vs. File Systems
DBMS Manages Interaction
Database Management System (DBMS)
Typical DBMS Functionality
Define a particular database in terms of its data types, structures, and constraints
Construct or Load the initial database contents on a secondary storage medium
Manipulating the database:
  Retrieval: Querying, generating reports
  Modification: Insertions, deletions and updates to its content
  Accessing the database through Web applications
Processing and Sharing by a set of concurrent users and application programs, yet keeping all data valid and consistent
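The define, construct, and manipulate steps can be sketched with Python's built-in sqlite3 module. SQLite, the in-memory database, and the student table with its columns are assumptions chosen for illustration; the course does not prescribe them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, used only for illustration

# Define: data types, structure, and constraints.
conn.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " year INTEGER CHECK (year BETWEEN 1 AND 4))"
)

# Construct: load the initial contents.
conn.executemany(
    "INSERT INTO student (id, name, year) VALUES (?, ?, ?)",
    [(1, "Anan", 1), (2, "Benja", 2), (3, "Chai", 2)],
)

# Manipulate (modification): update and delete.
conn.execute("UPDATE student SET year = 3 WHERE id = 2")
conn.execute("DELETE FROM student WHERE id = 3")
conn.commit()

# Manipulate (retrieval): a query.
rows = conn.execute("SELECT id, name, year FROM student ORDER BY id").fetchall()
```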
Typical DBMS Functionality
Other features:
  Protection or Security measures to prevent unauthorized access
  "Active" processing to take internal actions on data
  Presentation and Visualization of data
  Maintaining the database and associated programs over the lifetime of the database application
    Called database, software, and system maintenance
Functions of a DBMS (I)
Data Storage, Retrieval and Update.
  Must furnish users with the ability to store, retrieve, and update data in the database.
A User-Accessible Catalog.
  Must furnish a catalog in which descriptions of data items are stored and which is accessible to users.
Transaction Support
  Must furnish a mechanism to ensure that either all the updates corresponding to a given transaction are made or that none of them are made.
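The all-or-none behaviour of a transaction can be sketched with Python's sqlite3 module. The account table, the CHECK constraint, and the transfer function are hypothetical and serve only to illustrate the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts; either both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, 1, 2, 30)       # both updates are applied
failed = transfer(conn, 1, 2, 500)  # second update violates the CHECK, so the first is undone
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

In the failed transfer the credit to account 2 is executed first and then rolled back, so no partial update remains visible.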
Functions of a DBMS (II)
Concurrency Control Services
  Must furnish a mechanism to ensure that database is updated correctly when multiple users are updating the database concurrently.
Recovery Services
  Must furnish a mechanism for recovering the database in the event that the database is damaged in any way.
Authorization Services & Security management
  Must furnish a mechanism to ensure that only authorized users can access the database.
Functions of a DBMS (III)
Support for Data Communication
  Must be capable of integrating with communication software.
Integrity Services & Security management
  Must furnish a means to ensure that both the data in the database and changes to the data follow certain rules.
Services to Promote Data Independence
  Must include facilities to support the independence of programs from the actual structure of the database.
Utility Services
  Should provide a set of utility services.
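A small sketch of integrity services, assuming SQLite through Python's sqlite3 module. The tables, columns, and the try_insert helper are hypothetical. The DBMS itself rejects changes that break the declared rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (code TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " major TEXT REFERENCES department(code))"
)
conn.execute("INSERT INTO department VALUES ('IT')")

def try_insert(row):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO student (id, name, major) VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert((1, "Anan", "IT")),  # satisfies every rule
    try_insert((2, None, "IT")),    # violates NOT NULL
    try_insert((3, "Chai", "XX")),  # violates the foreign key
    try_insert((1, "Dao", "IT")),   # violates the primary key
]
```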
Functions of a DBMS (IV)
Data transformation and presentation
Backup and recovery management
Database language and application programming interfaces
A view mechanism.
  Provides users with only the data they want or need to use.
Components of a DBMS
1. Query processor
2. Database manager (DM)
3. File manager
4. DML preprocessor
5. DDL compiler
6. Catalog manager
Components of Database Manager (DM)
1. Authorization control
2. Command processor
3. Integrity checker
4. Query optimizer
5. Transaction manager
6. Scheduler
7. Recovery manager
8. Buffer manager
Advantages of Using DB Approach (I)
Controlling redundancy in data storage and in development and maintenance efforts.
  Sharing of data among multiple users.
Restricting unauthorized access to data.
Providing persistent storage for program Objects
  Object-oriented DBMSs
Providing Storage Structures (e.g. indexes) for efficient Query Processing
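To make the role of storage structures concrete, the sketch below asks SQLite how it would run the same query before and after an index is created. SQLite, the student table, and the index name are assumptions for illustration. The exact wording of the plan text differs between SQLite versions, so the sketch only checks whether the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO student (name, major) VALUES (?, ?)",
    [(f"student{i}", "IT" if i % 2 else "CS") for i in range(1000)],
)

QUERY = "SELECT id, major FROM student WHERE name = ?"

def plan(conn):
    """Return SQLite's description of how it would run QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("student42",)).fetchall()
    return " ".join(row[-1] for row in rows)  # the last column holds the plan text

plan_without_index = plan(conn)
conn.execute("CREATE INDEX idx_student_name ON student(name)")
plan_with_index = plan(conn)

# The query text and its answer are unchanged; only the access path differs.
answer = conn.execute(QUERY, ("student42",)).fetchall()
uses_index_before = "idx_student_name" in plan_without_index
uses_index_after = "idx_student_name" in plan_with_index
```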
Advantages of Using DB Approach (II)
Providing backup and recovery services.
Providing multiple interfaces to different classes of users.
Representing complex relationships among data.
Enforcing integrity constraints on the database.
Drawing inferences and actions from the stored data using deductive and active rules
Additional Implications of DB Approach (I)
Potential for enforcing standards:
  This is very crucial for the success of database applications in large organizations.
  Standards refer to data item names, display formats, screens, report structures, meta-data (description of data), Web page layouts, etc.
Reduced application development time:
  Incremental time to add each new application is reduced.
Additional Implications of DB Approach (II)
Flexibility to change data structures:
  Database structure may evolve as new requirements are defined.
Availability of current information:
  Extremely important for on-line transaction systems such as airline, hotel, car reservations.
Economies of scale:
  Wasteful overlap of resources and personnel can be avoided by consolidating data and applications across departments.
Disadvantages of DBMS
Complexity
Size
Cost of DBMS
Additional hardware costs
Cost of conversion
Performance
Higher impact of a failure
When not to use a DBMS
Main inhibitors (costs) of using a DBMS:
  High initial investment and possible need for additional hardware.
  Overhead for providing generality, security, concurrency control, recovery, and integrity functions.
When a DBMS may be unnecessary:
  If the database and applications are simple, well defined, and not expected to change.
  If there are stringent real-time requirements that may not be met because of DBMS overhead.
  If access to data by multiple users is not required.
  If the database system is not able to handle the complexity of data because of modeling limitations
  If the DB users need special operations not supported by the DBMS.
Example of a DB
Some mini-world relationships:
  SECTIONs are of specific COURSEs
  STUDENTs take SECTIONs
  COURSEs have prerequisite COURSEs
  INSTRUCTORs teach SECTIONs
  COURSEs are offered by DEPARTMENTs
  STUDENTs major in DEPARTMENTs
Note: The above entities and relationships are typically expressed in a conceptual data model, such as the ENTITY-RELATIONSHIP data model (learn more later)
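One possible relational rendering of the mini-world relationships is sketched below with Python's sqlite3 module. Every table and column name is an assumption made for illustration; the slides name only the entities and relationships, and other designs are equally valid.

```python
import sqlite3

SCHEMA = """
CREATE TABLE department (dept_code TEXT PRIMARY KEY, name TEXT);
CREATE TABLE instructor (instructor_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name TEXT,
    major_dept TEXT REFERENCES department(dept_code)   -- STUDENTs major in DEPARTMENTs
);
CREATE TABLE course (
    course_no TEXT PRIMARY KEY,
    title TEXT,
    dept_code TEXT REFERENCES department(dept_code)    -- COURSEs are offered by DEPARTMENTs
);
CREATE TABLE prerequisite (                             -- COURSEs have prerequisite COURSEs
    course_no TEXT REFERENCES course(course_no),
    prereq_no TEXT REFERENCES course(course_no),
    PRIMARY KEY (course_no, prereq_no)
);
CREATE TABLE section (
    section_id INTEGER PRIMARY KEY,
    course_no TEXT REFERENCES course(course_no),        -- SECTIONs are of specific COURSEs
    instructor_id INTEGER REFERENCES instructor(instructor_id)  -- INSTRUCTORs teach SECTIONs
);
CREATE TABLE enrollment (                               -- STUDENTs take SECTIONs
    student_id INTEGER REFERENCES student(student_id),
    section_id INTEGER REFERENCES section(section_id),
    PRIMARY KEY (student_id, section_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```

Many-to-many relationships (students taking sections, course prerequisites) get their own tables, while one-to-many relationships become foreign key columns.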
A Simple Database (I)
A Simple Database (II)
Main Characteristics of the DB Approach
Self-describing nature of a database system:
  A DBMS catalog stores the description of a particular database (e.g. data structures, types, and constraints)
  The description is called meta-data.
  This allows the DBMS software to work with different database applications.
Insulation between programs and data:
  Called program-data independence.
  Allows changing data structures and storage organization without having to change the DBMS access programs.
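The self-describing nature can be observed in SQLite, where the catalog is itself queryable. The sketch below uses Python's sqlite3 module; the course table and the describe helper are hypothetical, and other DBMSs expose their catalogs through different tables or views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course ("
    " course_no TEXT PRIMARY KEY, title TEXT NOT NULL, credits INTEGER)"
)

def describe(conn, table):
    """Read column names, declared types, and NOT NULL flags from SQLite's catalog."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    return [(name, ctype, bool(notnull))
            for _cid, name, ctype, notnull, _default, _pk
            in conn.execute(f"PRAGMA table_info({table})")]

# sqlite_master is SQLite's catalog table listing every schema object.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
columns = describe(conn, "course")
```

A generic tool built this way needs no hard-coded knowledge of the course table; it learns the structure from the meta-data at run time.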
A Simplified Database Catalog
Main Characteristics of DB Approach (I)
Data Abstraction:
  A data model is used to hide storage details and present the users with a conceptual view of the database.
  Programs refer to the data model constructs rather than data storage details
Support of multiple views of the data:
  Each user may see a different view of the database, which describes only the data of interest to that user.
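Multiple views over the same stored data can be sketched with SQL views in SQLite. The student table, the two view names, and the users they are meant for are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, major TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Anan", "IT", 3.2), (2, "Benja", "CS", 3.8), (3, "Chai", "IT", 2.9)],
)

# View for a registrar clerk: names and majors, but no grades.
conn.execute("CREATE VIEW student_directory AS SELECT id, name, major FROM student")
# View for an IT department adviser: only IT students, with grades.
conn.execute(
    "CREATE VIEW it_advisees AS SELECT id, name, gpa FROM student WHERE major = 'IT'")

directory = conn.execute("SELECT * FROM student_directory ORDER BY id").fetchall()
advisees = conn.execute("SELECT * FROM it_advisees ORDER BY id").fetchall()
```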
Main Characteristics of DB Approach (II)
Sharing of data and multi-user transaction processing:
  Allowing a set of concurrent users to retrieve from and to update the database.
  Concurrency control within the DBMS guarantees that each transaction is correctly executed or aborted
  Recovery subsystem ensures each completed transaction has its effect permanently recorded in the database
  OLTP (Online Transaction Processing) is a major part of database applications. This allows hundreds of concurrent transactions to execute per second.
DBMS Environment (Major components)
Hardware
  Can range from a PC to a network of computers.
Software
  DBMS, operating system, network software (if necessary) and also the application programs.
Data
  Used by the organization and a description of this data called the schema.
Procedures
  Instructions and rules that should be applied to the design and use of the database and DBMS.
People
Database Users
Users may be divided into
Actors on the Scene
  Those who actually use and control the database content, and those who design, develop and maintain database applications.
    Data Administrator (DA)
    Database Administrator (DBA)
    Database Designers (Logical and Physical)
    Application Programmers
    End Users (naive and sophisticated)
Workers Behind the Scene
  Those who design and develop the DBMS software and related tools, and the computer systems operators.
Users in DB System Environment
Data Administration vs. Database Administration
Data Administration
The management of the data resource, which includes database planning, development and maintenance of standards, policies and procedures, and conceptual and logical database design.
Database Administration
The management of the physical realization of a database application, which includes physical database design and implementation, setting security and integrity controls, monitoring system performance, and reorganizing the database, as necessary.
Data Administration Tasks
Database Administration Tasks
DA and DBA – Main Task Differences.
Database System Types
Single-user vs. Multi-user Database
  Desktop
  Workgroup
  Enterprise
Centralized vs. Distributed
Usage Purpose
  Production or transactional
  Decision support or data warehouse
Multi-user DBMS Architecture
  Teleprocessing
  File-server
  Client-server
Teleprocessing
Traditional architecture.
Single mainframe with a number of terminals
Trend is now towards downsizing.
File-server
File-server is connected to several workstations across a network.
Database resides on file-server.
DBMS and applications run on each workstation.
Disadvantages include:
  Significant network traffic.
  Copy of DBMS on each workstation.
  Concurrency, recovery and integrity control more complex.
File-server Architecture
Client-server
Server holds the database and the DBMS.
Client manages the user interface and runs applications.
Advantages include:
  Wider access to existing databases.
  Increased performance.
  Possible reduction in hardware costs.
  Reduction in communication costs.
  Increased consistency.
Client-server Architecture
Alternative Client-server Topologies
Summary of Client-server Functions
DBMS Server
Provides database query and transaction services to the clients
Relational DBMS servers are often called SQL servers, query servers, or transaction servers
Applications running on clients utilize an Application Program Interface (API) to access server databases via standard interface such as:
  ODBC: Open Database Connectivity standard
  JDBC: for Java programming access
Client and server must install appropriate client module and server module software for ODBC or JDBC
See Chapter 9
Two Tier Client-Server Architecture
A client program may connect to several DBMSs, sometimes called the data sources.
In general, data sources can be files or other non-DBMS software that manages data.
Other variations of clients are possible: e.g., in some object DBMSs, more functionality is transferred to clients including data dictionary functions, optimization and recovery across multiple servers, etc.
Three Tier Client-Server Architecture
Common for Web applications
Intermediate Layer called Application Server or Web Server:
  Stores the web connectivity software and the business logic part of the application used to access the corresponding data from the database server
  Acts like a conduit for sending partially processed data between the database server and the client.
Three-tier Architecture Can Enhance Security:
  Database server only accessible via middle tier
  Clients cannot directly access database server
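The three tiers can be imitated inside a single Python process to show how the middle tier mediates access. This is only a sketch: the class names, the grade table, and the business rule are hypothetical, and a real deployment would place the tiers on separate machines connected by a network.

```python
import sqlite3

class DatabaseServer:
    """Data tier: owns the database; only the middle tier holds a reference to it."""
    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE grade (student_id INTEGER, course TEXT, grade REAL)")
        self._conn.executemany(
            "INSERT INTO grade VALUES (?, ?, ?)",
            [(1, "ITS322", 3.5), (2, "ITS322", 4.0)])

    def query(self, sql, params=()):
        return self._conn.execute(sql, params).fetchall()

class ApplicationServer:
    """Middle tier: business logic and access checks between clients and the database."""
    def __init__(self, db):
        self._db = db

    def grades_for(self, requesting_student_id, student_id):
        # Business rule (assumed for illustration): students read only their own grades.
        if requesting_student_id != student_id:
            raise PermissionError("students may only view their own grades")
        return self._db.query(
            "SELECT course, grade FROM grade WHERE student_id = ?", (student_id,))

# Client tier: talks to the application server, never to the database directly.
app = ApplicationServer(DatabaseServer())
own_grades = app.grades_for(1, 1)
try:
    app.grades_for(1, 2)
    blocked = False
except PermissionError:
    blocked = True
```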
Three-tier client-server architecture
System Catalog
A repository of information (metadata) describing the data in the database.
Typically stores:
  Names of authorized users.
  Names of data items in the database.
  Constraints on each data item.
  Data items accessible by a user and the type of access.
It is used by modules such as:
  Authorization Control.
  Integrity Checker.
Information Resource Dictionary System (IRDS)
Response to an attempt to standardize data dictionary interfaces.
An IRDS is a software tool that can be used to control and document an organization's information resources.
It provides a definition for the tables that comprise the data dictionary and the operations that can be used to access these tables.
Objectives:
  Extensibility of data
  Integrity of data
  Controlled access to data
IRDS Services Interface
Three-Level Architecture of a DB system (Objective)
All users should be able to access same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
DBA should be able to change database storage structures without affecting the users' views.
Internal structure of database should be unaffected by changes to physical aspects of storage.
DBA should be able to change conceptual structure of database without affecting all users.
ANSI-SPARC Three-level Architecture
ANSI-SPARC Three-level Architecture
External Level
  Users' view of the database.
  Describes that part of database that is relevant to a particular user.
Conceptual Level
  Community view of the database.
  Describes what data is stored in database and relationships among the data.
Internal Level
  Physical representation of the database on the computer.
  Describes how the data is stored in the database.
Difference between Three levels
Data Independence and the ANSI-SPARC 3-level Architecture
Data Independence
Logical Data Independence
  The capacity to change the conceptual schema without having to change the external schemas and their associated application programs.
  Conceptual schema changes e.g. addition/removal of entities.
  Should not require changes to external schema or rewrites of application programs.
Physical Data Independence
  The capacity to change the internal schema without having to change the conceptual schema.
  Internal schema changes e.g. using different file organizations, storage structures/devices.
  Should not require change to conceptual or external schemas.
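Both kinds of data independence can be approximated in SQLite, treating a view as the external schema, the table definition as the conceptual schema, and an index as part of the internal schema. This mapping and all names below are assumptions for illustration, not a full ANSI-SPARC implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)", [(1, "Anan", "IT"), (2, "Benja", "CS")])
# External schema: the view an application is written against.
conn.execute("CREATE VIEW it_students AS SELECT id, name FROM student WHERE major = 'IT'")

def application_query(conn):
    """The application program; it is never edited in this example."""
    return conn.execute("SELECT id, name FROM it_students ORDER BY id").fetchall()

before = application_query(conn)

# Conceptual schema change: a new attribute is added (logical data independence).
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_logical_change = application_query(conn)

# Internal schema change: a new access path is added (physical data independence).
conn.execute("CREATE INDEX idx_student_major ON student(major)")
after_physical_change = application_query(conn)
```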
Historical Development of DB Tech. (I)
Early Database Applications:
  The Hierarchical and Network Models were introduced in mid 1960s and dominated during the seventies.
  A bulk of the worldwide database processing still occurs using these models, particularly, the hierarchical model.
Relational Model based Systems:
  Relational model was originally introduced in 1970, was heavily researched and experimented within IBM Research and several universities.
  Relational DBMS Products emerged in the early 1980s.
Historical Development of DB Tech. (II)
Object-oriented and emerging applications:
  Object-Oriented Database Management Systems (OODBMSs) were introduced in late 1980s and early 1990s to cater to the need of complex data processing in CAD and other applications.
    Their use has not taken off much.
  Many relational DBMSs have incorporated object database concepts, leading to a new category called object-relational DBMSs (ORDBMSs)
  Extended relational systems add further capabilities (e.g. for multimedia data, XML, and other data types)
Historical Development of DB Tech. (III)
Data on the Web and E-commerce Applications:
  Web contains data in HTML (Hypertext Markup Language) with links among pages.
  This has given rise to a new set of applications and E-commerce is using new standards like XML (eXtensible Markup Language).
  Script programming languages such as PHP and JavaScript allow generation of dynamic Web pages that are partially generated from a database.
  Also allow database updates through Web pages
Extending Database Capabilities
New functionality is being added to DBMSs in the following areas:
  Scientific Applications
  XML (eXtensible Markup Language)
  Image Storage and Management
  Audio and Video Data Management
  Data Warehousing and Data Mining
  Spatial Data Management
  Time Series and Historical Data Management
The above gives rise to new research and development in incorporating new data types, complex data structures, new operations and storage and indexing schemes in database systems.
Data Models (I)
Data Model:
  A set of concepts to describe the structure of a database, the operations for manipulating these structures, and certain constraints that the database should obey.
Data Model Structure and Constraints:
  Constructs are used to define the database structure
  Constructs typically include elements (and their data types) as well as groups of elements (e.g. entity, record, table), and relationships among such groups
  Constraints specify some restrictions on valid data; these constraints must be enforced at all times
Data Models (II)
Collection of concepts for describing data, relationships between data and constraints on the data in an organization.
Data Model comprises:
  A structural part
    Consisting of a set of rules according to which databases can be constructed.
  A manipulative part
    Defining the types of operations that are allowed on the data (update/retrieving data from the DB or changing the DB structure).
  Possibly a set of integrity rules
    Ensuring that the data is accurate.
Data Models (III)
Data Model Operations:
  These operations are used for specifying database retrievals and updates by referring to the constructs of the data model.
  Operations on the data model may include basic model operations (e.g. generic insert, delete, update) and user-defined operations (e.g. compute_student_gpa, update_inventory)
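The difference between basic and user-defined operations can be sketched as follows. The slide only names compute_student_gpa; the grade_report table, its columns, and the credit-weighted formula are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grade_report ("
    " student_id INTEGER, course TEXT, credits INTEGER, points REAL)"
)

# Basic model operations: generic insert, update, delete.
conn.executemany(
    "INSERT INTO grade_report VALUES (?, ?, ?, ?)",
    [(1, "ITS322", 3, 4.0), (1, "ITS100", 3, 3.0), (1, "GTS101", 2, 2.0)],
)
conn.execute(
    "UPDATE grade_report SET points = 3.5 WHERE student_id = 1 AND course = 'ITS100'")
conn.execute("DELETE FROM grade_report WHERE course = 'GTS101'")

def compute_student_gpa(conn, student_id):
    """User-defined operation: credit-weighted average of grade points."""
    total_points, total_credits = conn.execute(
        "SELECT SUM(points * credits), SUM(credits)"
        " FROM grade_report WHERE student_id = ?",
        (student_id,),
    ).fetchone()
    if not total_credits:
        return None  # the student has no graded courses
    return total_points / total_credits

gpa = compute_student_gpa(conn, 1)
gpa_unknown_student = compute_student_gpa(conn, 99)
```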
Data Models (IV) – Levels of Data Models
Conceptual (high-level, semantic) data models:
  Provide concepts that are close to the way users perceive data.
  (Also called entity-based or object-based data models.)
Physical (low-level, internal) data models:
  Provide concepts that describe details of how data is stored in the computer. These are usually specified in an ad-hoc manner through DBMS design and administration manuals
Implementation (representational, logical) data models:
  Provide concepts that fall between the above two, used by many commercial DBMS implementations (e.g. relational data models used in many commercial systems).
Data Models (V) – Types of Data Models
Types of Data Models:
Record-based Data Models
Object-based Data Models
Physical Data Models
The first two are used to describe data at the conceptual and logical levels; the last is for the internal level.
Data Models (VI)
Record-based Data Models: Hierarchical Data Model, Network Data Model, Relational Data Model
Object-based Data Models: Entity-Relationship, Object-Oriented, Semantic or Functional
Physical Data Models:
Physical data models describe how data is stored in the computer, representing information such as record structures, record orderings, and access paths.
There are not as many physical data models as logical data models, the most common ones being the unifying model and the frame memory.
Implementation Data Models
1st generation
2nd generation
3rd generation
Hierarchical Data Model (I)
Initially implemented in a joint effort by IBM and North American Rockwell around 1965. Resulted in the IMS family of systems.
IBM’s IMS product had (and still has) a very large customer base worldwide.
Hierarchical model was formalized based on the IMS system.
Other systems based on this model: System 2k (SAS Inc.)
Hierarchical Data Model (II)
Logically represented by an upside down tree
Each parent can have many children
Each child has only one parent
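The tree structure can be sketched in a few lines of Python. This is only an in-memory illustration of the parent/child rule, not how IMS or any real hierarchical DBMS stores records; the Record class and the sample names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Record:
    """One node of the upside-down tree (hypothetical record type)."""
    name: str
    parent: Optional["Record"] = None            # each child has only one parent
    children: List["Record"] = field(default_factory=list)  # many children

    def add_child(self, name: str) -> "Record":
        child = Record(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        """Walk up the single parent chain to the root."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


root = Record("Company")
sales = root.add_child("Sales")
it_dept = root.add_child("IT")
alice = sales.add_child("Alice")
```

Because every record has exactly one parent, there is exactly one path from the root to any record, which is what makes navigation simple and also what makes many-to-many relationships awkward in this model.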
Hierarchical Data Model (III)
Advantages:
Conceptual simplicity, simple to construct and operate
Corresponds to a number of natural hierarchically organized domains, e.g., organization (“org”) chart
Language is simple: uses constructs like GET, GET UNIQUE, GET NEXT, GET NEXT WITHIN PARENT, etc.
Database security and integrity
Data independence
Efficiency
Hierarchical Data Model (IV)
Disadvantages:
Navigational and procedural nature of processing
Database is visualized as a linear arrangement of records
Little scope for “query optimization”
Complex implementation, programming and use complexity
Difficult to manage and lack of standards
Lacks structural independence
Implementation limitations
Network Data Model (I)
The first network DBMS was implemented by Honeywell in 1964-65 (IDS System).
Adopted heavily due to the support by CODASYL (Conference on Data Systems Languages) (CODASYL - DBTG report of 1971).
Later implemented in a large variety of systems: IDMS (Cullinet - now Computer Associates), DMS 1100 (Unisys), IMAGE (H.P. (Hewlett-Packard)), VAX-DBMS (Digital Equipment Corp., next COMPAQ, now H.P.).
Network Data Model (II)
Each record can have multiple parents
Composed of sets
Each set has owner record and member record
Member may have several owners
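The owner/member idea can be sketched as follows. This is a toy in-memory illustration, not CODASYL DML; the NetworkDB class, the set-type names, and the record labels are all hypothetical. In this sketch a member acquires its several owners by participating in different set types.

```python
class NetworkDB:
    """Toy store of set occurrences: set type -> owner -> list of members."""

    def __init__(self):
        self.sets = {}

    def connect(self, set_type, owner, member):
        """Link a member record to an owner record within a set type."""
        self.sets.setdefault(set_type, {}).setdefault(owner, []).append(member)

    def members(self, set_type, owner):
        """Navigate from an owner down to its members."""
        return list(self.sets.get(set_type, {}).get(owner, []))

    def owners(self, member):
        """Navigate from a member up to every owner it is linked to."""
        found = []
        for set_type, occurrences in self.sets.items():
            for owner, linked in occurrences.items():
                if member in linked:
                    found.append((set_type, owner))
        return sorted(found)


db = NetworkDB()
db.connect("PLACED_BY", "customer:Ann", "order:1")
db.connect("PLACED_BY", "customer:Ann", "order:2")
db.connect("HANDLED_BY", "salesrep:Bob", "order:1")
```

Here order:1 has two owners (a customer and a sales representative), something the hierarchical model cannot express directly because each child is limited to one parent.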
Network Data Model (III) – Another Example
Network Data Model (IV)
Advantages:
Able to model complex relationships and represents semantics of add/delete on the relationships.
Can handle most situations for modeling using record types and relationship types.
Language is navigational; uses constructs like FIND, FIND member, FIND owner, FIND NEXT within set, GET, etc. Programmers can do optimal navigation through the database.
Conceptual simplicity
Data access flexibility
Promotes database integrity
Data independence
Conformance to standards
Network Data Model (V)
Disadvantages:
Lack of structural independence
Navigational and procedural nature of processing
System complexity
Database contains a complex array of pointers that thread through a set of records.
Little scope for automated “query optimization”
Relational Data Model (I)
Proposed in 1970 by E.F. Codd (IBM), first commercial system in 1981-82.
Now in several commercial products (e.g. DB2, ORACLE, MS SQL Server, SYBASE, INFORMIX).
Several free open source implementations, e.g. MySQL, PostgreSQL.
Currently most dominant for developing database applications.
SQL relational standards: SQL-89 (SQL1), SQL-92 (SQL2), SQL-99, SQL3, …
Relational Data Model (II)
Represented by a collection of tables (row/column)
Tables related by sharing common entity characteristic(s)
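How two tables are related through a shared characteristic can be sketched as follows, assuming SQLite via Python's sqlite3 module. The department and employee tables and their rows are hypothetical; the shared column dept_id is what links them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT
    );
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT,
        dept_id  INTEGER REFERENCES department(dept_id)  -- shared characteristic
    );
    INSERT INTO department VALUES (1, 'Sales'), (2, 'IT');
    INSERT INTO employee VALUES (10, 'Ann', 1), (11, 'Bob', 2), (12, 'Cho', 2);
    """
)

# The relationship is followed by matching values, not by stored pointers.
rows = conn.execute(
    """
    SELECT e.emp_name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
    """
).fetchall()
```

Unlike the hierarchical and network sketches, nothing here records a physical link between rows; the join is computed from the common dept_id values at query time.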
Relational Data Model (III)
Advantages
Structural independence
Improved conceptual simplicity
Easier database design, implementation, management, and use
Ad hoc query capability with SQL
Powerful database management system
Disadvantages
Substantial hardware and system software overhead
Poor design and implementation is made easy
May promote “islands of information” problems
Entity Relationship Data Model
Complements the relational data model concepts
Represented in an entity relationship diagram (ERD)
Based on entities, attributes, and relationships
Entity Relationship Data Model
Advantages
Exceptional conceptual simplicity
Visual representation
Effective communication tool
Integrated with the relational database model
Disadvantages
Limited constraint representation
Limited relationship representation
No data manipulation language
Loss of information content
Object-Oriented Data Model (I)
Several models have been proposed for implementing in a database system.
One set comprises models of persistent O-O Programming Languages such as C++ (e.g., in OBJECTSTORE or VERSANT), and Smalltalk (e.g., in GEMSTONE).
Additionally, systems like O2, ORION (at MCC - then ITASCA), IRIS (at H.P. - used in Open OODB).
Object Database Standard: ODMG-93, ODMG-version 2.0, ODMG-version 3.0.
Object-Oriented Data Model (II)
Objects or abstractions of real-world entities are stored
Attributes describe properties
Collection of similar objects is a class
Methods represent real world actions of classes
Classes are organized in a class hierarchy
Inheritance is ability of object to inherit attributes and methods of classes above it
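These terms map directly onto an object-oriented programming language. The sketch below uses plain Python classes to show attributes, methods, a class hierarchy, and inheritance; the Person and Student classes are hypothetical, and the objects live only in memory, whereas an object-oriented DBMS would also make them persistent.

```python
class Person:
    """A class: a collection of similar objects."""

    def __init__(self, name):
        self.name = name  # attribute describing a property

    def describe(self):  # method: an action available on the object
        return f"{self.name} ({type(self).__name__})"


class Student(Person):
    """A subclass, placed below Person in the class hierarchy."""

    def __init__(self, name, credits_earned):
        super().__init__(name)  # reuse the inherited attribute setup
        self.credits_earned = credits_earned

    def enroll(self, credits):
        self.credits_earned += credits
        return self.credits_earned


ann = Student("Ann", 30)
description = ann.describe()  # describe() is inherited from Person
total = ann.enroll(3)
```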
Object-Oriented Data Model (III)
Advantages
Adds semantic content
Visual presentation includes semantic content
Database integrity
Both structural and data independence
Disadvantages
Lack of OODM standards
Complex navigational data access
Steep learning curve
High system overhead slows transactions
OO Model vs. ER Model
Object-Relational Data Model
Most recent trend. Started with Informix Universal Server.
Relational systems incorporate concepts from object databases, leading to object-relational.
Exemplified in the latest versions of Oracle-10i, DB2, and SQL Server and other DBMSs.
Standards included in SQL-99 and expected to be enhanced in future SQL standards.
Database Languages
Data definition language (DDL)
Permits specification of data types, structures and any data constraints.
All specifications are stored in the database.
Allows users to describe and name entities, attributes and relationships required for the application.
Data manipulation language (DML)
General enquiry facility (query language) of the data.
Provides basic data manipulation operations on data held in the database.
Procedural DML - allows user to tell system exactly how to manipulate data.
Non-Procedural DML - allows user to state what data is needed rather than how it is to be retrieved.
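The split between DDL and DML can be seen in a short sketch, assuming SQLite via Python's sqlite3 module. The course table and its data are hypothetical (including the credit value); sqlite_master is SQLite's own catalog table, used here to show that the specification itself is stored in the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: name the structure and state its data types and constraints.
conn.execute(
    """CREATE TABLE course (
           code    TEXT PRIMARY KEY,
           title   TEXT NOT NULL,
           credits INTEGER CHECK (credits > 0)
       )"""
)

# The definition is kept in the database's catalog.
catalog = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# DML: insert, update, and query the data held in that structure.
conn.execute(
    "INSERT INTO course VALUES ('ITS322', 'Database Management Systems', 3)"
)
conn.execute("UPDATE course SET credits = 4 WHERE code = 'ITS322'")
result = conn.execute("SELECT code, credits FROM course").fetchall()
```

In SQL both roles are served by one language: CREATE TABLE is the DDL part, while INSERT, UPDATE, and SELECT are the DML part.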
Database Languages
Fourth Generation Language (4GL)
Query Languages
Forms Generators
Report Generators
Graphics Generators
Application Generators
There is no consensus about what constitutes a 4GL.
Compared with a 3GL, which is procedural, a 4GL is non-procedural.
The user defines what is to be done, not how.
DBMS Languages (DDL vs. DML)
Data Definition Language (DDL):
Used by the DBA and database designers to specify the conceptual schema of a database.
In many DBMSs, the DDL is also used to define internal and external schemas (views).
In some DBMSs, separate storage definition language (SDL) and view definition language (VDL) are used to define internal and external schemas.
SDL is typically realized via DBMS commands provided to the DBA and database designers
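Defining an external schema with the DDL can be sketched as a view, assuming SQLite via Python's sqlite3 module. The employee table and the employee_directory view are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Conceptual schema: the full employee table.
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary REAL NOT NULL
    );
    -- External schema: a view that hides the salary column.
    CREATE VIEW employee_directory AS
        SELECT emp_id, name FROM employee;
    INSERT INTO employee VALUES (1, 'Ann', 52000.0), (2, 'Bob', 48000.0);
    """
)

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
```

A user working only through employee_directory sees the two exposed columns and nothing else, which is the purpose of an external schema.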
DBMS Languages
Data Manipulation Language (DML):
Used to specify database retrievals and updates
DML commands (data sublanguage) can be embedded in a general-purpose programming language (host language), such as COBOL, C, C++, or Java.
A library of functions can also be provided to access the DBMS from a programming language
Alternatively, stand-alone DML commands can be applied directly (called a query language).
Types of DML
High Level or Non-procedural Language:
For example, the SQL relational language.
Are “set”-oriented and specify what data to retrieve rather than how to retrieve it.
Also called declarative languages.
Low Level or Procedural Language:
Retrieve data one record-at-a-time.
Constructs such as looping are needed to retrieve multiple records, along with positioning pointers.
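The two styles can be contrasted in a small sketch, assuming SQLite via Python's sqlite3 module with a hypothetical product table. The second half only imitates record-at-a-time processing by stepping a cursor in a host-language loop; it is not the DML of an actual hierarchical or network DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("pen", 1.5), ("book", 12.0), ("lamp", 30.0)],
)

# Non-procedural: state WHAT is wanted; the DBMS decides how to find it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM product WHERE price > 10 ORDER BY name"
    )
]

# Procedural style: position on one record at a time and test it in a loop.
procedural = []
cursor = conn.execute("SELECT name, price FROM product")
while True:
    record = cursor.fetchone()  # advance the position by one record
    if record is None:          # end of the record sequence
        break
    name, price = record
    if price > 10:
        procedural.append(name)
procedural.sort()
```

Both lists hold the same names, but in the first case the selection logic is left to the DBMS, while in the second the program spells out the traversal and the test itself.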
DBMS Interfaces
Stand-alone query language interfaces
Example: Entering SQL queries at the DBMS interactive SQL interface (e.g. SQL*Plus in ORACLE)
Programmer interfaces for embedding DML in programming languages
User-friendly interfaces
Menu-based, forms-based, graphics-based, etc.
DBMS Programming Language Interfaces
Programmer interfaces for embedding DML in a programming language:
Embedded Approach: e.g. embedded SQL (for C, C++, etc.), SQLJ (for Java)
Procedure Call Approach: e.g. JDBC for Java, ODBC for other programming languages
Database Programming Language Approach: e.g. ORACLE has PL/SQL, a programming language based on SQL; language incorporates SQL and its data types as integral components
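The procedure call approach can be sketched with Python's sqlite3 module, which plays a role comparable to JDBC or ODBC: the program passes SQL text and parameter values to library functions and reads results back through a cursor. The student table and the find_students helper are hypothetical.

```python
import sqlite3


def find_students(conn, min_gpa):
    """Send a DML statement to the DBMS through library calls."""
    cursor = conn.cursor()
    # The SQL travels as a string; the value is bound to the ? placeholder.
    cursor.execute(
        "SELECT name FROM student WHERE gpa >= ? ORDER BY name",
        (min_gpa,),
    )
    return [row[0] for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Ann", 3.8), ("Bob", 2.9), ("Cho", 3.4)],
)

honor_roll = find_students(conn, 3.0)
```

In contrast to the embedded approach, no precompiler is involved: the host language is unchanged and the database is reached only through ordinary function calls.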
Cost considerations for DBMSs
Cost Range: from free open-source systems to configurations costing millions of dollars
Examples of free relational DBMSs: MySQL, PostgreSQL, others
Commercial DBMS offer additional specialized modules, e.g. time-series module, spatial data module, document module, XML module
These offer additional specialized functionality when purchased separately
Sometimes called cartridges (e.g., in Oracle) or blades
Different licensing options: site license, maximum number of concurrent users (seat license), single user, etc.