Lecture 0 Schedule of ITS322 Database Management Systems

ITS322 - Database Management Systems

Semester I/2008

Lecture 0
Schedule of ITS322 Database Management Systems

Dr. Thanaruk Theeramunkong
Sirindhorn International Institute of Technology
Thammasat University

T. Connolly and C. Begg, Database Systems: A Practical Approach to Design, Implementation, and Management, 4th edition, Addison-Wesley, 2004, ISBN: 0-321-21025-5.
R. Elmasri and S. B. Navathe, Fundamentals of Database Systems, 5th edition, Pearson, 2007, ISBN: 0-321-41506-X.

ITS322 - DBMSs

Lecture 1: Introduction to DBs and DB Env.


Textbooks

Database Systems: A Practical Approach to Design, Implementation, and Management
By T. Connolly and C. Begg, 4th edition, Addison-Wesley, 2004, ISBN: 0-321-21025-5.

Fundamentals of Database Systems
By R. Elmasri and S. B. Navathe, 5th edition, Pearson (Addison-Wesley), 2007, ISBN: 0-321-41506-X.

Office Hours and Grading

- Lecture Time:
  - Section 1:
    - Thursday 9:00-10:20 (BKD2401)
    - Friday 13:00-14:20 (BKD2401)
  - Section 2:
    - Thursday 10:40-12:00 (BKD2602)
    - Friday 14:40-16:00 (BKD2602)
- Office hours: Thursday (13:00-16:00) & Friday (9:00-12:00)
- Grading:
  - Quiz/Attendance (10%)
  - Project (20%)
  - Midterm Examination (30%) (31 July 2008, 9:00-12:00)
  - Final Examination (40%) (9 October 2008, 9:00-12:00)

Course Outline

No. | Date | Topics | Chapters
1 | 12, 13 June | Introduction to DBMS & SQL Data Manipulation (I) | 1, 2, 13
2 | 19, 20 June | Introduction to DBMS & SQL Data Manipulation (II) | 1, 2, 14
3 | 26, 27 June | SQL Data Manipulation (III) | 14
4 | 3, 4 July | Data Definition & Relational Algebra | 3
5 | 10, 11 July | Entity-Relationship Modeling | 5
6 | 17 July | Enhanced Entity-Relationship Modeling | 5
7 | 24, 25 July | Normalization, Review and QA | 6
8 | 31 July | Midterm Examination | 1, 2, 3, 5, 6, 13, 14
9 | 7, 8 August | DB Planning, Design and Administration | 4
10 | 14, 15 August | SQL Programming: Stored Procedure, Trigger | Extra 24 (Elmasri)
11 | 21, 22 August | Project presentation | Group work
12 | 28, 29 August | Transaction Processing | 17
13 | 4, 5 September | Query Processing | 18
14 | 11, 12 September | Disk Size Estimation, Disk Storage and Indexing | 13, 14 (Elmasri)
15 | 18, 19 September | Project presentation | Group work
16 | 25, 26 September | Review and QA | 21
17 | 9 October | Final Examination | 4, 17, 18, 21

ITS322 - Database Management Systems

Semester I/2008

Lecture 1
Introduction to Database Systems

Dr. Thanaruk Theeramunkong
Sirindhorn International Institute of Technology
Thammasat University

R. Elmasri and S. B. Navathe, Fundamentals of Database Systems, 5th edition, Pearson, 2007, ISBN: 0-321-41506-X.
T. Connolly and C. Begg, Database Systems: A Practical Approach to Design, Implementation, and Management, 4th edition, Addison-Wesley, 2004, ISBN: 0-321-21025-5.

Objectives

- Some common uses of database systems
- The characteristics of file-based systems
- The problems with the file-based approach
- The benefits of the database approach
- The meaning of the terms database, database system, and database management system (DBMS)
- The typical functions of a DBMS
- The advantages and disadvantages of DBMSs

Objectives

- The major components of the DBMS environment
- The personnel involved in the DBMS environment
- Difference between data administration and database administration
- Types of database systems
- System Catalog and Information Resource Dictionary System (IRDS)
- Purposes and the origin of the 3-level database architecture
- Concepts and types of data models
- Functions and components of a DBMS

Data Versus Information

- Data constitute building blocks of information
- Information produced by processing data
- Information reveals meaning of data
- Good, timely, relevant information key to decision making
- Good decision making key to organizational survival

Where Is the Database?

- The database (DB) is now such an integral part of our day-to-day life that often we are not aware we are using one. Ex: supermarket, credit card, travel agent, library, insurance, security systems, university.
- First applications focused on clerical tasks
- Requests for information quickly followed
- File systems developed to address needs
  - Data organized according to expected use
  - Data Processing (DP) specialists computerized manual file systems

Types of Databases and DB Applications

- Traditional Applications:
  - Numeric and Textual Databases
- More Recent Applications:
  - Multimedia Databases
  - Geographic Information Systems (GIS)
  - Data Warehouses
  - Real-time and Active Databases
  - Many other applications

File-based Systems

- The file-based system is the predecessor of the database system. → Decentralized
- A collection of application programs that perform services for the end users (e.g. reports). Each program defines and manages its own data.
- File-based systems were an early attempt to computerize the manual filing system.
- The related topics: storage, security, indexing, cross-reference, processing

Simple File-based System

File-based Processing

File-based System Critique (I)

- File-based System Data Management
  - Requires extensive programming in third-generation language (3GL)
  - Time consuming
  - Makes ad hoc queries impossible
  - Leads to islands of information

Data: Raw facts
Field: Group of characters with specific meaning
Record: Logically connected fields that describe a person, place, or thing
File: Collection of related records

File-based System Critique (II)

- Data Dependence
  - File structure is defined in the program code.
  - Change in file's data characteristics requires modification of data access programs
  - Must tell program what to do and how
  - Makes file systems cumbersome from programming and data management views
- Structural Dependence
  - Change in file structure requires modification of related programs
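
Structural dependence can be illustrated with a minimal Python sketch. The fixed-width record layout, the staff table, and all function names below are hypothetical and chosen only for illustration; SQLite (via Python's sqlite3 module) stands in for a DBMS. A program that hard-codes field positions breaks when the file structure changes, while a program that names its columns keeps working after the table gains a column.

```python
import sqlite3

# File-based approach: the record layout is hard-coded in the program.
# Hypothetical fixed-width layout: name in columns 0-9, salary in columns 10-15.
def read_salary_from_record(record):
    return int(record[10:16])

old_record = "Smith".ljust(10) + "030000"
# The file structure changes: a 2-character branch code now precedes the salary.
new_record = "Smith".ljust(10) + "B5" + "030000"

salary_old_layout = read_salary_from_record(old_record)
try:
    read_salary_from_record(new_record)
    file_program_still_works = True
except ValueError:
    # The program must be modified because the layout it assumed has changed.
    file_program_still_works = False

# Database approach: the program names the column; the DBMS knows the layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, salary INTEGER)")
conn.execute("INSERT INTO staff VALUES ('Smith', 30000)")

def read_salary_from_db(name):
    row = conn.execute(
        "SELECT salary FROM staff WHERE name = ?", (name,)
    ).fetchone()
    return row[0]

salary_before_change = read_salary_from_db("Smith")
conn.execute("ALTER TABLE staff ADD COLUMN branch TEXT")  # structure changes
salary_after_change = read_salary_from_db("Smith")
```

The file-reading function has to be rewritten for the new layout, whereas the database query is untouched by the added column.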

File-based System Critique (III)

- Field Definitions and Naming Conventions
  - Flexible record definition anticipates reporting requirements
  - Selection of proper field names important
  - Attention to length of field names
  - Use of unique record identifiers
- Data Redundancy
  - Different and conflicting versions of same data
  - Results of uncontrolled data redundancy
    - Data anomalies
    - Data inconsistency

File-based System Critique (IV)

- Separation and isolation of data
  - Each program maintains its own set of data. Users of one program may be unaware of potentially useful data held by other programs.
- Incompatible file formats
  - Programs are written in different languages, and so cannot easily access each other's files.
- Fixed Queries/Proliferation of application programs
  - Programs are written to satisfy particular functions. Any new requirement needs a new program.

Database Approach

- Arose because:
  - Definition of data was embedded in application programs, rather than being stored separately and independently.
  - No control over access and manipulation of data beyond that imposed by application programs.
- Result: the database and Database Management System (DBMS).

Database Management

- Database is shared, integrated computer structure housing:
  - End user data
  - Metadata
- Database Management System (DBMS)
  - Manages database structure
  - Controls access to data
  - Contains query language

Database

- A shared collection of logically related data (and a description of this data), designed to meet the information needs of an organization.
- System catalog (data dictionary or metadata) provides the description of the data to enable program-data independence.
- Logically related data comprises entities, attributes, and relationships of an organization's information.

Database Systems & DBMS

- Database System
  - A system that uses a database as its basic storage
  - Provides the following advantages over file-based systems
    - Eliminates inconsistency, data anomalies, data dependency, and structural dependency problems
    - Stores data structures, relationships, and access paths
- Database Management System (DBMS)
  - A software system that enables users to define, create, and maintain the database and which provides controlled access to this database.

Simplified Database System Environment

Database vs. File Systems

DBMS Manages Interaction

Database Management System (DBMS)

Typical DBMS Functionality

- Define a particular database in terms of its data types, structures, and constraints
- Construct or Load the initial database contents on a secondary storage medium
- Manipulating the database:
  - Retrieval: Querying, generating reports
  - Modification: Insertions, deletions and updates to its content
  - Accessing the database through Web applications
- Processing and Sharing by a set of concurrent users and application programs, yet keeping all data valid and consistent
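
The define, construct, and manipulate steps listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the student table, its columns, and the sample rows are assumptions, and an in-memory SQLite database stands in for a full DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define: data types, structure, and constraints.
conn.execute("""
    CREATE TABLE student (
        student_no INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        year       INTEGER CHECK (year BETWEEN 1 AND 4)
    )
""")

# Construct: load the initial contents.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(17, "Smith", 1), (8, "Brown", 2)],
)

# Manipulate: modification (update, delete) followed by retrieval (query).
conn.execute("UPDATE student SET year = 2 WHERE student_no = 17")
conn.execute("DELETE FROM student WHERE student_no = 8")
rows = conn.execute("SELECT student_no, name, year FROM student").fetchall()
```

After these statements, rows holds the single remaining record for student 17 with the updated year.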

Typical DBMS Functionality

- Other features:
  - Protection or Security measures to prevent unauthorized access
  - "Active" processing to take internal actions on data
  - Presentation and Visualization of data
  - Maintaining the database and associated programs over the lifetime of the database application
    - Called database, software, and system maintenance

Functions of a DBMS (I)

- Data Storage, Retrieval and Update
  - Must furnish users with the ability to store, retrieve, and update data in the database.
- A User-Accessible Catalog
  - Must furnish a catalog in which descriptions of data items are stored and which is accessible to users.
- Transaction Support
  - Must furnish a mechanism to ensure that either all the updates corresponding to a given transaction are made or that none of them are made.
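
The all-or-nothing behaviour described under Transaction Support can be sketched as follows. The account table, the transfer function, and the amounts are hypothetical; the sketch relies on sqlite3's documented behaviour that using a connection as a context manager commits on success and rolls back when an exception is raised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move amount from src to dst; both updates happen or neither does."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit is undone too.
        return False

first_ok = transfer(1, 2, 30)    # succeeds: balances become 70 and 80
second_ok = transfer(1, 2, 500)  # fails: account 1 cannot go negative
balances = conn.execute(
    "SELECT id, balance FROM account ORDER BY id"
).fetchall()
```

The failed transfer leaves both balances exactly as they were after the first one, even though its credit step had already run.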

Functions of a DBMS (II)

- Concurrency Control Services
  - Must furnish a mechanism to ensure that database is updated correctly when multiple users are updating the database concurrently.
- Recovery Services
  - Must furnish a mechanism for recovering the database in the event that the database is damaged in any way.
- Authorization Services & Security management
  - Must furnish a mechanism to ensure that only authorized users can access the database.

Functions of a DBMS (III)

- Support for Data Communication
  - Must be capable of integrating with communication software.
- Integrity Services & Security management
  - Must furnish a means to ensure that both the data in the database and changes to the data follow certain rules.
- Services to Promote Data Independence
  - Must include facilities to support the independence of programs from the actual structure of the database.
- Utility Services
  - Should provide a set of utility services.
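
As a rough illustration of integrity services, the sketch below declares a few rules and lets the DBMS reject data that breaks them. The tables, the 1 to 4 credit range, and the try_insert helper are assumptions made for this example; note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (code TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE course (
        course_no TEXT PRIMARY KEY,
        credits   INTEGER NOT NULL CHECK (credits BETWEEN 1 AND 4),
        dept_code TEXT NOT NULL REFERENCES department(code)
    )
""")
conn.execute("INSERT INTO department VALUES ('ITS')")
conn.execute("INSERT INTO course VALUES ('ITS322', 3, 'ITS')")

def try_insert(row):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO course VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

results = [
    try_insert(("ITS323", 9, "ITS")),  # credits outside the allowed range
    try_insert(("ITS322", 3, "ITS")),  # duplicate primary key
    try_insert(("XYZ101", 3, "XYZ")),  # department does not exist
]
course_count = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
```

All three inserts are rejected by the DBMS itself, so no application program has to repeat these checks.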

Functions of a DBMS (IV)

- Data transformation and presentation
- Backup and recovery management
- Database language and application programming interfaces
- A view mechanism
  - Provides users with only the data they want or need to use.
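
A view mechanism can be sketched with a SQL view. The staff table, its sample rows, and the view name staff_directory are hypothetical; the point is only that a user of the view sees the columns they need and nothing else.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (staff_no TEXT, name TEXT, branch TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?, ?)",
    [("S1", "Smith", "B5", 30000), ("S2", "Brown", "B3", 24000)],
)

# The view exposes only the non-sensitive columns; salary stays hidden.
conn.execute(
    "CREATE VIEW staff_directory AS SELECT staff_no, name, branch FROM staff"
)

cursor = conn.execute("SELECT * FROM staff_directory ORDER BY staff_no")
view_columns = [description[0] for description in cursor.description]
view_rows = cursor.fetchall()
```

Queries against staff_directory return staff numbers, names, and branches, while the salary column is not visible through the view.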

Components of a DBMS

1. Query processor
2. Database manager (DM)
3. File manager
4. DML preprocessor
5. DDL compiler
6. Catalog manager

Components of Database Manager (DM)

1. Authorization control
2. Command processor
3. Integrity checker
4. Query optimizer
5. Transaction manager
6. Scheduler
7. Recovery manager
8. Buffer manager

Advantages of Using DB Approach (I)

- Controlling redundancy in data storage and in development and maintenance efforts.
  - Sharing of data among multiple users.
- Restricting unauthorized access to data.
- Providing persistent storage for program objects
  - Object-oriented DBMSs
- Providing storage structures (e.g. indexes) for efficient query processing
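
The last advantage, storage structures for efficient query processing, can be observed in SQLite through EXPLAIN QUERY PLAN. The student table, the generated rows, and the index name idx_student_name are assumptions for this sketch, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_no INTEGER, name TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(i, f"name{i}", "ITS") for i in range(1000)],
)

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable plan step.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM student WHERE name = 'name500'"

plan_without_index = plan(query)  # the table is scanned row by row
conn.execute("CREATE INDEX idx_student_name ON student(name)")
plan_with_index = plan(query)     # the plan now mentions idx_student_name

result = conn.execute(query).fetchall()
```

The query text is identical before and after; only the access path chosen by the DBMS changes.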

Advantages of Using DB Approach (II)

- Providing backup and recovery services.
- Providing multiple interfaces to different classes of users.
- Representing complex relationships among data.
- Enforcing integrity constraints on the database.
- Drawing inferences and actions from the stored data using deductive and active rules

Additional Implications of DB Approach (I)

- Potential for enforcing standards:
  - This is very crucial for the success of database applications in large organizations.
  - Standards refer to data item names, display formats, screens, report structures, meta-data (description of data), Web page layouts, etc.
- Reduced application development time:
  - Incremental time to add each new application is reduced.

Additional Implications of DB Approach (II)

- Flexibility to change data structures:
  - Database structure may evolve as new requirements are defined.
- Availability of current information:
  - Extremely important for on-line transaction systems such as airline, hotel, car reservations.
- Economies of scale:
  - Wasteful overlap of resources and personnel can be avoided by consolidating data and applications across departments.

Disadvantages of DBMS

- Complexity
- Size
- Cost of DBMS
- Additional hardware costs
- Cost of conversion
- Performance
- Higher impact of a failure

When not to use a DBMS

- Main inhibitors (costs) of using a DBMS:
  - High initial investment and possible need for additional hardware.
  - Overhead for providing generality, security, concurrency control, recovery, and integrity functions.
- When a DBMS may be unnecessary:
  - If the database and applications are simple, well defined, and not expected to change.
  - If there are stringent real-time requirements that may not be met because of DBMS overhead.
  - If access to data by multiple users is not required.
  - If the database system is not able to handle the complexity of data because of modeling limitations
  - If the DB users need special operations not supported by the DBMS.

Example of a DB

- Some mini-world relationships:
  - SECTIONs are of specific COURSEs
  - STUDENTs take SECTIONs
  - COURSEs have prerequisite COURSEs
  - INSTRUCTORs teach SECTIONs
  - COURSEs are offered by DEPARTMENTs
  - STUDENTs major in DEPARTMENTs
- Note: The above entities and relationships are typically expressed in a conceptual data model, such as the ENTITY-RELATIONSHIP data model (learn more later)
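
One possible way to realise the mini-world relationships above as relational tables is sketched below. Every table name, column name, and sample row is an assumption made for illustration (the slides give only the entity names), and this is one of several reasonable designs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_code TEXT PRIMARY KEY);
    CREATE TABLE course (
        course_no TEXT PRIMARY KEY,
        dept_code TEXT REFERENCES department          -- COURSEs are offered by DEPARTMENTs
    );
    CREATE TABLE prerequisite (
        course_no TEXT REFERENCES course,
        prereq_no TEXT REFERENCES course              -- COURSEs have prerequisite COURSEs
    );
    CREATE TABLE instructor (instructor_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE section (
        section_id    INTEGER PRIMARY KEY,
        course_no     TEXT REFERENCES course,         -- SECTIONs are of specific COURSEs
        instructor_id INTEGER REFERENCES instructor   -- INSTRUCTORs teach SECTIONs
    );
    CREATE TABLE student (
        student_no INTEGER PRIMARY KEY,
        name       TEXT,
        major      TEXT REFERENCES department         -- STUDENTs major in DEPARTMENTs
    );
    CREATE TABLE takes (
        student_no INTEGER REFERENCES student,
        section_id INTEGER REFERENCES section         -- STUDENTs take SECTIONs
    );

    INSERT INTO department VALUES ('ITS');
    INSERT INTO course VALUES ('ITS221', 'ITS'), ('ITS322', 'ITS');
    INSERT INTO prerequisite VALUES ('ITS322', 'ITS221');
    INSERT INTO instructor VALUES (1, 'Lee');
    INSERT INTO section VALUES (1, 'ITS322', 1);
    INSERT INTO student VALUES (17, 'Smith', 'ITS');
    INSERT INTO takes VALUES (17, 1);
""")

# Who takes which course, and who teaches that section?
enrolments = conn.execute("""
    SELECT student.name, course.course_no, instructor.name
    FROM takes
    JOIN student    ON student.student_no = takes.student_no
    JOIN section    ON section.section_id = takes.section_id
    JOIN course     ON course.course_no = section.course_no
    JOIN instructor ON instructor.instructor_id = section.instructor_id
""").fetchall()
```

Each relationship in the list becomes either a foreign-key column or a separate linking table, and the final query follows those links to answer a question that spans four entities.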

A Simple Database (I)

A Simple Database (II)

Main Characteristics of the DB Approach

- Self-describing nature of a database system:
  - A DBMS catalog stores the description of a particular database (e.g. data structures, types, and constraints)
  - The description is called meta-data.
  - This allows the DBMS software to work with different database applications.
- Insulation between programs and data:
  - Called program-data independence.
  - Allows changing data structures and storage organization without having to change the DBMS access programs.

A Simplified Database Catalog

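
The self-describing nature of a database can be seen by asking a DBMS for its own catalog. The sketch below uses SQLite's sqlite_master table and PRAGMA table_info as stand-ins for a system catalog; the course table is a hypothetical example, and other DBMSs expose their catalogs through different tables or views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE course (
        course_no TEXT PRIMARY KEY,
        title     TEXT NOT NULL,
        credits   INTEGER
    )
""")

# sqlite_master is SQLite's catalog table: one row per table, index, or view.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default_value, pk)
columns = [
    (name, col_type, bool(notnull), bool(pk))
    for _cid, name, col_type, notnull, _default, pk
    in conn.execute("PRAGMA table_info(course)")
]
column_names = [column[0] for column in columns]
```

The program learns the table's structure, types, and constraints from meta-data at run time instead of having them written into its code.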

Main Characteristics of DB Approach (I)

- Data Abstraction:
  - A data model is used to hide storage details and present the users with a conceptual view of the database.
  - Programs refer to the data model constructs rather than data storage details
- Support of multiple views of the data:
  - Each user may see a different view of the database, which describes only the data of interest to that user.

Main Characteristics of DB Approach (II)

- Sharing of data and multi-user transaction processing:
  - Allowing a set of concurrent users to retrieve from and to update the database.
  - Concurrency control within the DBMS guarantees that each transaction is correctly executed or aborted
  - Recovery subsystem ensures each completed transaction has its effect permanently recorded in the database
  - OLTP (Online Transaction Processing) is a major part of database applications. This allows hundreds of concurrent transactions to execute per second.

DBMS Environment (Major components)

- Hardware
  - Can range from a PC to a network of computers.
- Software
  - DBMS, operating system, network software (if necessary) and also the application programs.
- Data
  - Used by the organization and a description of this data called the schema.
- Procedures
  - Instructions and rules that should be applied to the design and use of the database and DBMS.
- People

Database Users

- Users may be divided into
  - Actors on the Scene
    - Those who actually use and control the database content, and those who design, develop and maintain database applications.
      - Data Administrator (DA)
      - Database Administrator (DBA)
      - Database Designers (Logical and Physical)
      - Application Programmers
      - End Users (naive and sophisticated)
  - Workers Behind the Scene
    - Those who design and develop the DBMS software and related tools, and the computer systems operators.

Users in DB System Environment

Data Administration vs. Database Administration

- Data Administration
  - The management of the data resource, which includes database planning, development and maintenance of standards, policies and procedures, and conceptual and logical database design.
- Database Administration
  - The management of the physical realization of a database application, which includes physical database design and implementation, setting security and integrity controls, monitoring system performance, and reorganizing the database, as necessary.

Data Administration Tasks

Database Administration Tasks

DA and DBA: Main Task Differences

Database System Types

- Single-user vs. Multi-user Database
  - Desktop
  - Workgroup
  - Enterprise
- Centralized vs. Distributed
- Usage Purpose
  - Production or transactional
  - Decision support or data warehouse
- Multi-user DBMS Architecture
  - Teleprocessing
  - File-server
  - Client-server

Teleprocessing

- Traditional architecture.
- Single mainframe with a number of terminals
- Trend is now towards downsizing.

File-server

- File-server is connected to several workstations across a network.
- Database resides on file-server.
- DBMS and applications run on each workstation.
- Disadvantages include:
  - Significant network traffic.
  - Copy of DBMS on each workstation.
  - Concurrency, recovery and integrity control more complex.

File-server Architecture

Client-server

- Server holds the database and the DBMS.
- Client manages the user interface and runs applications.
- Advantages include:
  - Wider access to existing databases.
  - Increased performance.
  - Possible reduction in hardware costs.
  - Reduction in communication costs.
  - Increased consistency.

Client-server Architecture

Alternative Client-server Topologies

Summary of Client-server Functions

DBMS Server

- Provides database query and transaction services to the clients
- Relational DBMS servers are often called SQL servers, query servers, or transaction servers
- Applications running on clients utilize an Application Program Interface (API) to access server databases via standard interface such as:
  - ODBC: Open Database Connectivity standard
  - JDBC: for Java programming access
- Client and server must install appropriate client module and server module software for ODBC or JDBC
- See Chapter 9

Two Tier Client-Server Architecture

- A client program may connect to several DBMSs, sometimes called the data sources.
- In general, data sources can be files or other non-DBMS software that manages data.
- Other variations of clients are possible: e.g., in some object DBMSs, more functionality is transferred to clients including data dictionary functions, optimization and recovery across multiple servers, etc.

Three Tier Client-Server Architecture

- Common for Web applications
- Intermediate Layer called Application Server or Web Server:
  - Stores the web connectivity software and the business logic part of the application used to access the corresponding data from the database server
  - Acts like a conduit for sending partially processed data between the database server and the client.
- Three-tier Architecture Can Enhance Security:
  - Database server only accessible via middle tier
  - Clients cannot directly access database server

Three-tier client-server architecture

System Catalog

- A repository of information (metadata) describing the data in the database.
- Typically stores:
  - Names of authorized users.
  - Names of data items in the database.
  - Constraints on each data item.
  - Data items accessible by a user and the type of access.
- It is used by modules such as:
  - Authorization Control.
  - Integrity Checker.

Information Resource Dictionary System (IRDS)

- Response to an attempt to standardize data dictionary interfaces.
- An IRDS is a software tool that can be used to control and document an organization's information resources.
- It provides a definition for the tables that comprise the data dictionary and the operations that can be used to access these tables.
- Objectives:
  - Extensibility of data
  - Integrity of data
  - Controlled access to data

IRDS Services Interface

Three-Level Architecture of a DB system (Objective)

- All users should be able to access same data.
- A user's view is immune to changes made in other views.
- Users should not need to know physical database storage details.
- DBA should be able to change database storage structures without affecting the users' views.
- Internal structure of database should be unaffected by changes to physical aspects of storage.
- DBA should be able to change conceptual structure of database without affecting all users.

ANSI-SPARC Three-level Architecture

ANSI-SPARC Three-level Architecture

- External Level
  - Users' view of the database. Describes that part of database that is relevant to a particular user.
- Conceptual Level
  - Community view of the database. Describes what data is stored in database and relationships among the data.
- Internal Level
  - Physical representation of the database on the computer. Describes how the data is stored in the database.

Difference between Three levels

Data Independence and the ANSI-SPARC 3-level Architecture

Data Independence

- Logical Data Independence
  - The capacity to change the conceptual schema without having to change the external schemas and their associated application programs.
  - Conceptual schema changes, e.g. addition/removal of entities.
  - Should not require changes to external schema or rewrites of application programs.
- Physical Data Independence
  - The capacity to change the internal schema without having to change the conceptual schema.
  - Internal schema changes, e.g. using different file organizations, storage structures/devices.
  - Should not require change to conceptual or external schemas.
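
Both kinds of data independence can be approximated in a small sketch, assuming a SQL view plays the role of an external schema and an index plays the role of an internal-schema change. The staff and branch tables, the staff_names view, and the application_query function are hypothetical names introduced for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("S1", "Smith", 30000), ("S2", "Brown", 24000)],
)

# External schema: the only thing the application program knows about.
conn.execute("CREATE VIEW staff_names AS SELECT staff_no, name FROM staff")

def application_query():
    return conn.execute(
        "SELECT staff_no, name FROM staff_names ORDER BY staff_no"
    ).fetchall()

baseline = application_query()

# Conceptual schema change: a new entity and a new attribute are added.
conn.execute("CREATE TABLE branch (branch_no TEXT PRIMARY KEY, city TEXT)")
conn.execute("ALTER TABLE staff ADD COLUMN branch_no TEXT")
after_conceptual_change = application_query()

# Internal schema change: a new access path (index) is added.
conn.execute("CREATE INDEX idx_staff_name ON staff(name)")
after_internal_change = application_query()
```

The application program and its view are never edited, yet they return the same answer after each change, which is the behaviour the two definitions above call for.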

Historical Development of DB Tech. (I)

- Early Database Applications:
  - The Hierarchical and Network Models were introduced in the mid 1960s and dominated during the seventies.
  - A bulk of the worldwide database processing still occurs using these models, particularly the hierarchical model.
- Relational Model based Systems:
  - Relational model was originally introduced in 1970, and was heavily researched and experimented with at IBM Research and several universities.
  - Relational DBMS products emerged in the early 1980s.

Historical Development of DB Tech. (II)

- Object-oriented and emerging applications:
  - Object-Oriented Database Management Systems (OODBMSs) were introduced in late 1980s and early 1990s to cater to the need of complex data processing in CAD and other applications.
    - Their use has not taken off much.
  - Many relational DBMSs have incorporated object database concepts, leading to a new category called object-relational DBMSs (ORDBMSs)
  - Extended relational systems add further capabilities (e.g. for multimedia data, XML, and other data types)

Historical Development of DB Tech. (III)

- Data on the Web and E-commerce Applications:
  - Web contains data in HTML (Hypertext Markup Language) with links among pages.
  - This has given rise to a new set of applications, and E-commerce is using new standards like XML (eXtensible Markup Language).
  - Script programming languages such as PHP and JavaScript allow generation of dynamic Web pages that are partially generated from a database.
  - Also allow database updates through Web pages

Extending Database Capabilities

- New functionality is being added to DBMSs in the following areas:
  - Scientific Applications
  - XML (eXtensible Markup Language)
  - Image Storage and Management
  - Audio and Video Data Management
  - Data Warehousing and Data Mining
  - Spatial Data Management
  - Time Series and Historical Data Management
- The above gives rise to new research and development in incorporating new data types, complex data structures, new operations and storage and indexing schemes in database systems.

Data Models (I)

- Data Model:
  - A set of concepts to describe the structure of a database, the operations for manipulating these structures, and certain constraints that the database should obey.
- Data Model Structure and Constraints:
  - Constructs are used to define the database structure
  - Constructs typically include elements (and their data types) as well as groups of elements (e.g. entity, record, table), and relationships among such groups
  - Constraints specify some restrictions on valid data; these constraints must be enforced at all times

Data Models (II)

- Collection of concepts for describing data, relationships between data, and constraints on the data in an organization.
- Data Model comprises:
  - A structural part
    - Consisting of a set of rules according to which databases can be constructed.
  - A manipulative part
    - Defining the types of operations that are allowed on the data (updating/retrieving data from the DB or changing the DB structure).
  - Possibly a set of integrity rules
    - Ensuring that the data is accurate.


Data Models (III)

- Data Model Operations:
  - These operations are used for specifying database retrievals and updates by referring to the constructs of the data model.
  - Operations on the data model may include basic model operations (e.g. generic insert, delete, update) and user-defined operations (e.g. compute_student_gpa, update_inventory).
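The difference between basic and user-defined operations can be sketched with Python's built-in `sqlite3` module. The `enrollment` table, its columns, the course codes other than ITS322, and the credit-weighted GPA formula are all assumptions made for this illustration; the slide names `compute_student_gpa` but does not define it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollment ("
    "student_id INTEGER, course TEXT, credits INTEGER, grade_points REAL)"
)

# Basic model operations: generic insert, update, delete.
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?, ?)",
    [(1, "ITS322", 3, 4.0), (1, "XX100", 3, 3.0), (1, "XX101", 2, 2.0)],
)
conn.execute(
    "UPDATE enrollment SET grade_points = 3.5 "
    "WHERE student_id = 1 AND course = 'XX100'"
)
conn.execute("DELETE FROM enrollment WHERE course = 'XX101'")


def compute_student_gpa(conn, student_id):
    """User-defined operation: credit-weighted GPA (assumed formula)."""
    row = conn.execute(
        "SELECT SUM(credits * grade_points) / SUM(credits) "
        "FROM enrollment WHERE student_id = ?",
        (student_id,),
    ).fetchone()
    return row[0]


gpa = compute_student_gpa(conn, 1)  # (3*4.0 + 3*3.5) / 6 = 3.75
```

The basic operations work on any table, while `compute_student_gpa` packages application-specific logic on top of them.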


Data Models (IV) – Levels of Data Models

- Conceptual (high-level, semantic) data models:
  - Provide concepts that are close to the way users perceive data.
  - (Also called entity-based or object-based data models.)
- Physical (low-level, internal) data models:
  - Provide concepts that describe details of how data is stored in the computer.
  - These are usually specified in an ad-hoc manner through DBMS design and administration manuals.
- Implementation (representational, logical) data models:
  - Provide concepts that fall between the above two, used by many commercial DBMS implementations (e.g. relational data models used in many commercial systems).


Data Models (V) – Types of Data Models

- Types of Data Models
  - Record-based Data Models
  - Object-based Data Models
  - Physical Data Models
- The first two are used to describe data at the conceptual and logical levels; the last is for the internal level.


Data Models (VI)

- Record-based Data Models
  - Hierarchical Data Model
  - Network Data Model
  - Relational Data Model
- Object-based Data Models
  - Entity-Relationship
  - Object-Oriented
  - Semantic or Functional
- Physical Data Models
  - Physical data models describe how data is stored in the computer, representing information such as record structures, record orderings, and access paths.
  - There are not as many physical data models as logical data models, the most common ones being the unifying model and the frame memory.


Implementation Data Models

[Figure: implementation data models grouped into 1st generation, 2nd generation, and 3rd generation]


Hierarchical Data Model (I)

- Initially implemented in a joint effort by IBM and North American Rockwell around 1965. Resulted in the IMS family of systems.
- IBM's IMS product had (and still has) a very large customer base worldwide.
- Hierarchical model was formalized based on the IMS system.
- Other systems based on this model: System 2k (SAS Inc.)


Hierarchical Data Model (II)

- Logically represented by an upside-down tree
  - Each parent can have many children
  - Each child has only one parent
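The two tree rules can be sketched in a few lines of Python. This is only an in-memory illustration, not how IMS or any real hierarchical DBMS stores data; the `Segment` class and the sample record names are hypothetical.

```python
class Segment:
    """One record in a hierarchy: many children, at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # a single parent reference (None for the root)
        self.children = []        # any number of children

    def add_child(self, name):
        child = Segment(name, parent=self)
        self.children.append(child)
        return child

    def path(self):
        """Walk up the single-parent chain to the root."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


root = Segment("Company")
sales = root.add_child("Sales")
it_dept = root.add_child("IT")
employee = sales.add_child("Employee-001")

employee_path = employee.path()
```

Because every record has exactly one parent, there is exactly one path from the root to any record, which is why access in this model is navigational.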


Hierarchical Data Model (III)

- Advantages:
  - Conceptual simplicity, simple to construct and operate
  - Corresponds to a number of natural hierarchically organized domains, e.g., organization ("org") chart
  - Language is simple:
    - Uses constructs like GET, GET UNIQUE, GET NEXT, GET NEXT WITHIN PARENT, etc.
  - Database security and integrity, data independence, efficiency


Hierarchical Data Model (IV)

- Disadvantages:
  - Navigational and procedural nature of processing
  - Database is visualized as a linear arrangement of records
  - Little scope for "query optimization"
  - Complex implementation, programming and use complexity
  - Difficult to manage and lack of standards
  - Lacks structural independence
  - Implementation limitations


Network Data Model (I)

- The first network DBMS was implemented by Honeywell in 1964-65 (IDS System).
- Adopted heavily due to the support by CODASYL (Conference on Data Systems Languages) (CODASYL - DBTG report of 1971).
- Later implemented in a large variety of systems: IDMS (Cullinet - now Computer Associates), DMS 1100 (Unisys), IMAGE (H.P. (Hewlett-Packard)), VAX-DBMS (Digital Equipment Corp., next COMPAQ, now H.P.).


Network Data Model (II)

- Each record can have multiple parents
  - Composed of sets
  - Each set has owner record and member record
  - Member may have several owners
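A rough Python sketch of the owner/member idea follows. It is an in-memory illustration only, not the CODASYL DDL or any vendor's API; the `NetworkDB` class, the set names `PLACED_BY` and `HANDLED_BY`, and the record labels are hypothetical. The sketch assumes that a member gains several owners by participating in several different sets.

```python
from collections import defaultdict


class NetworkDB:
    """Toy store of set occurrences: (set name, owner) -> member records."""

    def __init__(self):
        self._members = defaultdict(list)   # (set_name, owner) -> [members]
        self._owners = defaultdict(list)    # member -> [(set_name, owner)]

    def connect(self, set_name, owner, member):
        """Link a member record to an owner record within a named set."""
        self._members[(set_name, owner)].append(member)
        self._owners[member].append((set_name, owner))

    def members_of(self, set_name, owner):
        return list(self._members[(set_name, owner)])

    def owners_of(self, member):
        return list(self._owners[member])


db = NetworkDB()
db.connect("PLACED_BY", "Customer:C1", "Order:O1")
db.connect("HANDLED_BY", "SalesRep:S7", "Order:O1")
db.connect("PLACED_BY", "Customer:C1", "Order:O2")

order_owners = db.owners_of("Order:O1")
customer_orders = db.members_of("PLACED_BY", "Customer:C1")
```

Here `Order:O1` has two owners, a customer and a sales representative, which a strict tree could not express without duplicating the order record.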


Network Data Model (III) – Another Example


Network Data Model (IV)

- Advantages:
  - Able to model complex relationships and represents semantics of add/delete on the relationships.
  - Can handle most situations for modeling using record types and relationship types.
  - Language is navigational; uses constructs like FIND, FIND member, FIND owner, FIND NEXT within set, GET, etc.
    - Programmers can do optimal navigation through the database.
  - Conceptual simplicity
  - Data access flexibility
  - Promotes database integrity
  - Data independence
  - Conformance to standards


Network Data Model (V)

- Disadvantages:
  - Lack of structural independence
  - Navigational and procedural nature of processing
  - System complexity: the database contains a complex array of pointers that thread through a set of records.
    - Little scope for automated "query optimization"


Relational Data Model (I)

- Proposed in 1970 by E.F. Codd (IBM), first commercial system in 1981-82.
- Now in several commercial products (e.g. DB2, ORACLE, MS SQL Server, SYBASE, INFORMIX).
- Several free open source implementations, e.g. MySQL, PostgreSQL
- Currently most dominant for developing database applications.
- SQL relational standards: SQL-89 (SQL1), SQL-92 (SQL2), SQL-99 (SQL3), …


Relational Data Model (II)

- Represented by a collection of tables (row/column)
  - Tables related by sharing common entity characteristic(s)
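As a small illustration of tables related through a shared characteristic, the sketch below uses Python's built-in `sqlite3` module. The `department` and `employee` tables and their contents are hypothetical; the shared column `dept_id` is what relates the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT
    );
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT,
        dept_id  INTEGER REFERENCES department(dept_id)  -- shared characteristic
    );
    INSERT INTO department VALUES (10, 'Sales'), (20, 'IT');
    INSERT INTO employee VALUES (1, 'Ann', 10), (2, 'Bob', 20), (3, 'Cho', 10);
    """
)

# The relationship is followed by matching values, not by stored pointers.
rows = conn.execute(
    """
    SELECT e.emp_name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
    """
).fetchall()
```

Unlike the hierarchical and network models, nothing in the stored rows points at another row; the join condition alone connects them.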


Relational Data Model (III)

- Advantages
  - Structural independence
  - Improved conceptual simplicity
  - Easier database design, implementation, management, and use
  - Ad hoc query capability with SQL
  - Powerful database management system
- Disadvantages
  - Substantial hardware and system software overhead
  - Poor design and implementation is made easy
  - May promote "islands of information" problems


Entity Relationship Data Model

- Complements the relational data model concepts
- Represented in an entity relationship diagram (ERD)
- Based on entities, attributes, and relationships


Entity Relationship Data Model

- Advantages
  - Exceptional conceptual simplicity
  - Visual representation
  - Effective communication tool
  - Integrated with the relational database model
- Disadvantages
  - Limited constraint representation
  - Limited relationship representation
  - No data manipulation language
  - Loss of information content


Object-Oriented Data Model (I)

- Several models have been proposed for implementing in a database system.
- One set comprises models of persistent O-O programming languages such as C++ (e.g., in OBJECTSTORE or VERSANT), and Smalltalk (e.g., in GEMSTONE).
- Additionally, systems like O2, ORION (at MCC - then ITASCA), IRIS (at H.P. - used in Open OODB).
- Object Database Standard: ODMG-93, ODMG-version 2.0, ODMG-version 3.0.


Object-Oriented Data Model (II)

- Objects or abstractions of real-world entities are stored
  - Attributes describe properties
  - Collection of similar objects is a class
    - Methods represent real-world actions of classes
    - Classes are organized in a class hierarchy
  - Inheritance is the ability of an object to inherit attributes and methods of classes above it
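The terms above map directly onto an object-oriented programming language. The sketch below uses plain Python classes; it shows only the modeling concepts (attributes, methods, class hierarchy, inheritance) and says nothing about how an object DBMS would make such objects persistent. The `Person` and `Student` classes and the honors threshold are hypothetical.

```python
class Person:
    """A class: a collection of similar objects."""

    def __init__(self, name):
        self.name = name                      # attribute: describes a property

    def describe(self):                       # method: an action of the class
        return f"{self.name} ({type(self).__name__})"


class Student(Person):
    """A subclass, one level below Person in the class hierarchy."""

    def __init__(self, name, gpa):
        super().__init__(name)                # inherits the name attribute
        self.gpa = gpa

    def is_honors(self):
        return self.gpa >= 3.5                # threshold is an assumed example value


s = Student("Mali", 3.8)
description = s.describe()                    # describe() is inherited from Person
```

`Student` defines no `describe` method of its own, yet the call works because the method is inherited from the class above it.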


Object-Oriented Data Model (III)

- Advantages
  - Adds semantic content
  - Visual presentation includes semantic content
  - Database integrity
  - Both structural and data independence
- Disadvantages
  - Lack of OODM
  - Complex navigational data access
  - Steep learning curve
  - High system overhead slows transactions


OO Model vs. ER Model


Object-Relational Data Model

- Most recent trend. Started with Informix Universal Server.
- Relational systems incorporate concepts from object databases, leading to object-relational.
- Exemplified in the latest versions of Oracle-10i, DB2, and SQL Server and other DBMSs.
- Standards included in SQL-99 and expected to be enhanced in future SQL standards.


Database Languages

- Data definition language (DDL)
  - Permits specification of data types, structures and any data constraints. All specifications are stored in the database.
  - Allows users to describe and name entities, attributes and relationships required for the application.
- Data manipulation language (DML)
  - General enquiry facility (query language) of the data.
  - Provides basic data manipulation operations on data held in the database.
  - Procedural DML - allows user to tell system exactly how to manipulate data.
  - Non-Procedural DML - allows user to state what data is needed rather than how it is to be retrieved.
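The split between DDL and DML can be seen in a short session with Python's built-in `sqlite3` module. The `course` table and the credit values are hypothetical. SQLite keeps table definitions in its catalog table `sqlite_master`, which serves here as one concrete example of specifications being stored in the database itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define structure, data types, and constraints.
conn.execute(
    "CREATE TABLE course ("
    "code TEXT PRIMARY KEY, title TEXT NOT NULL, credits INTEGER)"
)

# The definition is itself stored in the database (SQLite's catalog table).
schema_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'course'"
).fetchone()[0]

# DML: manipulate and query the data held in that structure.
conn.execute(
    "INSERT INTO course VALUES ('ITS322', 'Database Management Systems', 3)"
)
conn.execute("UPDATE course SET credits = 4 WHERE code = 'ITS322'")
credits = conn.execute(
    "SELECT credits FROM course WHERE code = 'ITS322'"
).fetchone()[0]
```

All of the DML statements here are non-procedural: each one states which rows are wanted and leaves the access method to the DBMS.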


Database Languages

- Fourth Generation Language (4GL)
  - Query Languages
  - Forms Generators
  - Report Generators
  - Graphics Generators
  - Application Generators
- There is no consensus about what constitutes a 4GL.
- Compared with a 3GL, which is procedural, a 4GL is non-procedural.
- The user defines what is to be done, not how.


DBMS Languages (DDL vs. DML)

- Data Definition Language (DDL):
  - Used by the DBA and database designers to specify the conceptual schema of a database.
  - In many DBMSs, the DDL is also used to define internal and external schemas (views).
  - In some DBMSs, separate storage definition language (SDL) and view definition language (VDL) are used to define internal and external schemas.
    - SDL is typically realized via DBMS commands provided to the DBA and database designers


DBMS Languages

- Data Manipulation Language (DML):
  - Used to specify database retrievals and updates
  - DML commands (data sublanguage) can be embedded in a general-purpose programming language (host language), such as COBOL, C, C++, or Java.
    - A library of functions can also be provided to access the DBMS from a programming language
  - Alternatively, stand-alone DML commands can be applied directly (called a query language).


Types of DML

- High Level or Non-procedural Language:
  - For example, the SQL relational language
  - Are "set"-oriented and specify what data to retrieve rather than how to retrieve it.
  - Also called declarative languages.
- Low Level or Procedural Language:
  - Retrieve data one record-at-a-time;
  - Constructs such as looping are needed to retrieve multiple records, along with positioning pointers.
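The contrast can be imitated in Python with the built-in `sqlite3` module. The `product` table and its rows are hypothetical. The first query is set-oriented: one statement describes the whole result. The second part only emulates the record-at-a-time style by fetching one row per loop iteration and filtering in program code; a real procedural DML (for example, GET NEXT in a hierarchical system) would work on the stored records directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("pen", 10.0), ("book", 120.0), ("bag", 450.0)],
)

# Non-procedural: state WHAT is wanted; the DBMS decides how to find it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM product WHERE price > 100 ORDER BY name"
    )
]

# Procedural style (emulated): position on one record at a time and loop.
cursor = conn.execute("SELECT name, price FROM product")
procedural = []
while True:
    record = cursor.fetchone()      # advance the position by one record
    if record is None:              # no more records
        break
    if record[1] > 100:             # the program applies the condition itself
        procedural.append(record[0])
procedural.sort()
```

Both lists hold the same names, but only in the first case does the DBMS see the condition and get a chance to optimize the retrieval.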


DBMS Interfaces

- Stand-alone query language interfaces
  - Example: Entering SQL queries at the DBMS interactive SQL interface (e.g. SQL*Plus in ORACLE)
- Programmer interfaces for embedding DML in programming languages
- User-friendly interfaces
  - Menu-based, forms-based, graphics-based, etc.


DBMS Programming Language Interfaces

- Programmer interfaces for embedding DML in programming languages:
  - Embedded Approach: e.g. embedded SQL (for C, C++, etc.), SQLJ (for Java)
  - Procedure Call Approach: e.g. JDBC for Java, ODBC for other programming languages
  - Database Programming Language Approach: e.g. ORACLE has PL/SQL, a programming language based on SQL; language incorporates SQL and its data types as integral components
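The procedure call approach can be illustrated with Python's built-in `sqlite3` module, which plays a role comparable to JDBC or ODBC: the SQL text is passed as a string to library functions, and values are supplied through parameters. This is an analogy in a different language, not JDBC or ODBC code; the `student` table, the sample rows, and the helper `find_students_by_min_gpa` are hypothetical.

```python
import sqlite3


def find_students_by_min_gpa(conn, min_gpa):
    """Send a parameterized query through library calls and collect the result."""
    cursor = conn.cursor()
    # The '?' placeholder is filled in by the library, not by string pasting.
    cursor.execute(
        "SELECT name FROM student WHERE gpa >= ? ORDER BY name",
        (min_gpa,),
    )
    return [row[0] for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Ann", 3.9), ("Bob", 2.8), ("Cho", 3.5)],
)
conn.commit()

honors = find_students_by_min_gpa(conn, 3.5)
```

In the embedded approach the SQL would instead appear directly in the host-language source and be handled by a precompiler; here it is ordinary data handed to a function at run time.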


Cost considerations for DBMSs

- Cost Range: from free open-source systems to configurations costing millions of dollars
- Examples of free relational DBMSs: MySQL, PostgreSQL, others
- Commercial DBMSs offer additional specialized modules, e.g. time-series module, spatial data module, document module, XML module
  - These offer additional specialized functionality when purchased separately
  - Sometimes called cartridges (e.g., in Oracle) or blades
- Different licensing options: site license, maximum number of concurrent users (seat license), single user, etc.
