Generating Synthetic Database Schemas for Sim ...

Motivation
• PDMS simulators need to associate a local database schema to each peer in the simulated network.





Pires, Vieira, Saraiva, Barbosa @ DEXA 2011


[Figure: Schema Generator. Inputs: Parameters, Base Schema (S), Domain Ontology. Components: Synthetic Schema Extractor, Synthetic Schema Modifier (n). Outputs: Synthetic Schema (S1), Synthetic Schema (S2), ..., Synthetic Schema (Sj).]
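The components named in the figure can be read as a small pipeline: extract a schema from the base schema, then pass it through each modifier in turn. The following is a minimal sketch under that reading, not the authors' implementation. The function `generate_schemas`, the `Schema` type, the toy extractor and modifier, and the property names and data types in `base` are all hypothetical.

```python
import random
from typing import Callable, Dict, List

# Assumed representation: element name -> {property name: data type}.
Schema = Dict[str, Dict[str, str]]


def generate_schemas(
    base: Schema,
    extract: Callable[[Schema, int, random.Random], Schema],
    modifiers: List[Callable[[Schema, random.Random], Schema]],
    count: int,
    size: int,
    seed: int = 0,
) -> List[Schema]:
    """Produce `count` synthetic schemas from one base schema."""
    rng = random.Random(seed)
    schemas = []
    for _ in range(count):
        schema = extract(base, size, rng)   # extractor step
        for modify in modifiers:            # modifier 1..n, applied in order
            schema = modify(schema, rng)
        schemas.append(schema)
    return schemas


# Toy stand-ins (hypothetical) so the pipeline can be run end to end.
def take_first(base: Schema, size: int, rng: random.Random) -> Schema:
    return {name: dict(props) for name, props in list(base.items())[:size]}


def uppercase_labels(schema: Schema, rng: random.Random) -> Schema:
    return {name.upper(): props for name, props in schema.items()}


base = {
    "movies": {"title": "TEXT"},      # property names and types are placeholders
    "country": {"name": "TEXT"},
    "location": {"place": "TEXT"},
}
result = generate_schemas(base, take_first, [uppercase_labels], count=2, size=2)
```

Passing the extractor and modifiers as callables keeps the loop independent of how each step is realised.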


Base Schema: IMDb

Build-up of the synthetic schema across the slides:
• Empty Synthetic Schema: Candidate Elements = E
• 1-size Synthetic Schema: Candidate Elements = {movies, country}
• 2-size Synthetic Schema: Candidate Elements = {distributor2country, country2sfx, certificate, prodcompany2country, releasein, located, shotin, location, movies, country}
• 3-size Synthetic Schema: Candidate Elements = {distributor2country, country2sfx, certificate, prodcompany2country, releasein, located, shotin, location, movies, country}
• 4-size Synthetic Schema
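The build-up above suggests that the candidate set starts as all elements E and then narrows to elements related to those already chosen. A minimal sketch of such an extraction follows, assuming that "candidate" means "directly linked to a selected element" and that selected elements leave the candidate set. Both are assumptions, as is the `LINKS` adjacency, which only reuses element names from the slides and does not reproduce the real IMDb relationships.

```python
import random

# Hypothetical links between base-schema elements (illustration only).
LINKS = {
    "releasein": {"movies", "country"},
    "movies": {"releasein", "shotin", "certificate"},
    "country": {"releasein", "located", "distributor2country"},
    "shotin": {"movies", "location"},
    "location": {"shotin", "located"},
    "located": {"location", "country"},
    "certificate": {"movies"},
    "distributor2country": {"country"},
}


def extract(links, size, seed=0):
    """Grow a schema of `size` elements, one linked element at a time."""
    rng = random.Random(seed)
    selected = []
    candidates = set(links)  # empty schema: Candidate Elements = E
    while len(selected) < size and candidates:
        element = rng.choice(sorted(candidates))
        selected.append(element)
        # New candidates: neighbours of anything selected, minus the selection.
        neighbours = set().union(*(links[e] for e in selected))
        candidates = neighbours - set(selected)
    return selected


schema = extract(LINKS, 4)
```

Restricting candidates to neighbours keeps every extracted schema connected, which is one plausible reason for tracking a candidate set at all.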

• Modifying operations
  – Removal of properties
  – Insertion of new properties
  – Replacement of element label
  – Replacement of property data type
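Of the four operations, removal of properties needs no outside knowledge, so it is the easiest to sketch. The snippet below is an illustration under assumptions: the function name `remove_properties`, the dictionary representation, and the `TEXT` data types are hypothetical, and random selection of the properties to drop is a guess at how a modifier might choose. The property names are the ones shown for MOVIES in the slides.

```python
import random


def remove_properties(schema, element, how_many, seed=0):
    """Return a copy of `schema` with up to `how_many` properties dropped."""
    rng = random.Random(seed)
    props = schema[element]
    dropped = set(rng.sample(sorted(props), k=min(how_many, len(props))))
    modified = {name: dict(p) for name, p in schema.items()}  # leave input intact
    modified[element] = {k: v for k, v in props.items() if k not in dropped}
    return modified


# Data types here are placeholders, not taken from the source.
schema = {
    "MOVIES": {"Castcoverage": "TEXT", "Crewcoverage": "TEXT", "Runtimes": "TEXT"}
}
modified = remove_properties(schema, "MOVIES", 1)
```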

4-size Synthetic Schema: MOVIES (Castcoverage, Crewcoverage, Runtimes)

Movie Ontology
MOVIES ≡ FILMS, MOVIES ≡ MOVINGPICTURE
Properties: Title, Year, Stars, ColorType, Synopsis, Position, Runtimes, Opinion, Duration, Trailer, MovieURL, Episode, Credits...

4-size Synthetic Schema: MOVIES (Stars, Synopsis)
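The ontology's property list can serve as a pool from which new properties are drawn for an element. Below is a minimal sketch of property insertion under that assumption. The function name `insert_properties` and the random choice are hypothetical, while the property names are those listed in the slides (the list there is truncated, so this pool is partial).

```python
import random

# Properties listed for the movie concept in the slides (truncated there).
ONTOLOGY_PROPERTIES = [
    "Title", "Year", "Stars", "ColorType", "Synopsis", "Position", "Runtimes",
    "Opinion", "Duration", "Trailer", "MovieURL", "Episode", "Credits",
]


def insert_properties(existing, ontology_properties, how_many, seed=0):
    """Append up to `how_many` ontology properties not already present."""
    rng = random.Random(seed)
    present = {p.lower() for p in existing}
    unused = [p for p in ontology_properties if p.lower() not in present]
    return list(existing) + rng.sample(unused, k=min(how_many, len(unused)))


movies = ["Castcoverage", "Crewcoverage", "Runtimes"]
extended = insert_properties(movies, ONTOLOGY_PROPERTIES, 2)
```

Filtering out properties that are already present avoids inserting a duplicate such as Runtimes.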

Movie Ontology
MOVIES ≡ FILMS, MOVIES ≡ MOVINGPICTURE
TELEFILMS ⊆ MOVIES, MOVIES ⊆ PRODUCTION

4-size Synthetic Schema: MOVIES; FILMS
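Replacing an element label can be sketched as a lookup in the ontology's equivalence axioms. The snippet below uses only the two equivalences shown for MOVIES; whether the subsumption axioms (TELEFILMS ⊆ MOVIES, MOVIES ⊆ PRODUCTION) are also used for relabelling is not clear from the slides, so they are left out. The function name `replace_label`, the `COUNTRY` element and its property are hypothetical.

```python
import random

# Equivalences taken from the slides: MOVIES ≡ FILMS, MOVIES ≡ MOVINGPICTURE.
EQUIVALENT = {"MOVIES": ["FILMS", "MOVINGPICTURE"]}


def replace_label(schema, element, equivalents, seed=0):
    """Rename `element` to one of its equivalent labels, if any exist."""
    rng = random.Random(seed)
    options = equivalents.get(element, [])
    if not options:
        return dict(schema)  # nothing to replace with
    new_label = rng.choice(options)
    return {
        (new_label if name == element else name): props
        for name, props in schema.items()
    }


original = {"MOVIES": ["Stars", "Synopsis"], "COUNTRY": ["Name"]}  # COUNTRY is illustrative
renamed = replace_label(original, "MOVIES", EQUIVALENT)
```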


4-size Synthetic Schema: MOVIES (VOTES NUMBER); INTEGER

NUMBER ⊆ NUMERIC, DECIMAL ⊆ NUMERIC, INTEGER ⊆ NUMERIC
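The slide appears to show the VOTES property moving from NUMBER to INTEGER, two types that share the parent NUMERIC. One way to sketch data type replacement, assuming the replacement is drawn from the sibling types under the same parent, is shown below. The function name `replace_data_type` and the sibling rule are assumptions, and only the three subsumptions on the slide are encoded.

```python
import random

# Subsumptions from the slide: each type maps to its parent type.
PARENT = {"NUMBER": "NUMERIC", "DECIMAL": "NUMERIC", "INTEGER": "NUMERIC"}


def replace_data_type(data_type, parent_of, seed=0):
    """Swap `data_type` for another type with the same parent, if one exists."""
    rng = random.Random(seed)
    parent = parent_of.get(data_type)
    siblings = sorted(
        t for t, p in parent_of.items() if p == parent and t != data_type
    )
    if parent is None or not siblings:
        return data_type  # unknown type: leave unchanged
    return rng.choice(siblings)


new_type = replace_data_type("NUMBER", PARENT)
```

Staying within one parent keeps the replacement compatible in kind: a numeric property remains numeric.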

[Figure: (a) Generated Schema, (b) Modified Schema]

[Figure: Duration (4 Elements), Duration (6 Elements), Duration (8 Elements). Each panel plots Elapsed Time (in sec., axis 0 to 50) against #Schema Modifiers (1 to 4), with one series each for 10 Schemas, 20 Schemas and 30 Schemas.]

Global Schema Similarity (axis 0.0 to 1.0)

#Schema Modifiers    1       2       3       4
4-size Schemas       0.85    0.83    0.83    0.78
6-size Schemas       0.89    0.86    0.85    0.81
8-size Schemas       0.88    0.88    0.86    0.83
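The reported values can be summarised with simple arithmetic. The snippet below computes how much the similarity falls between one and four schema modifiers for each schema size. It only restates the figures above; how Global Schema Similarity itself is computed is not given in the slides.

```python
# Reported Global Schema Similarity for 1, 2, 3 and 4 schema modifiers.
SIMILARITY = {
    "4-size": [0.85, 0.83, 0.83, 0.78],
    "6-size": [0.89, 0.86, 0.85, 0.81],
    "8-size": [0.88, 0.88, 0.86, 0.83],
}

# Drop in similarity from one modifier to four, rounded to two decimals.
drop = {
    size: round(values[0] - values[-1], 2)
    for size, values in SIMILARITY.items()
}
```

In these figures the drop is 0.07 for 4-size schemas, 0.08 for 6-size schemas and 0.05 for 8-size schemas.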

• No similar work on generating synthetic database schemas was found in the literature
• Found only tools that produce random data to populate empty schemas for database performance testing
• These tools can be used to populate the synthetic schemas

• Automatic process to generate multiple synthetic database schemas
• Used in a semantic peer clustering process
• Ongoing research issues
  – Produce synthetic queries to be executed against the synthetic schemas
  – Generate synthetic schemas considering multiple base schemas
  – Develop new schema modifiers, e.g. to introduce noise into the synthetic schemas