A Join Operator for Property Graphs
G. Bergami (1), M. Magnani (2), D. Montesi (1)
6th International Workshop on Querying Graph Structured Data, 2017
(1) Department of Computer Science and Engineering, University of Bologna
(2) Department of Information Technology, University of Uppsala
Key Ideas
Motivation
1. Despite the term “join” appearing in the graph database literature, the existing operator cannot be used to combine two distinct graphs, as table joins do in the relational model. Query languages currently require combining two operations:
   - path joins (currently simply called “joins”), for graph traversals;
   - a construct to create a graph from the matched paths:
     SPARQL: CONSTRUCT + FROM
     Cypher: CREATE + UNION ALL
   The resulting query evaluation is quite inefficient. A new operator is required.
2. Graph Joins are not Relational Joins (more like “Graph Products”).
Goals
1. The data model must enhance the serialization of both the operands and the resulting graph.
2. The join definition must be flexible enough to support further extensions (modularity, compositionality and property preservation).
3. The physical model must allow quick access to the data structures.
Graph Join: Example 1 (a)
This is an example of a possible graph join query.
A Graph Join Query: Consider an on-line service such as ResearchGate, where researchers can follow each other's work, and a citation graph. Return the paper graph where a paper cites another one iff the first author of the first paper follows the first author of the second.
- Binary operator returning just one graph.
- Vertex conditions are (θ-)join conditions.
- A way to combine the edges is determined.
Graph Join: Example 1 (b)
[Figure: the two operand graphs. ResearchGate graph: {User} vertices Name=Alice, Name=Bob, Name=Carl and Name=Dan, linked by {Follows} edges. Citation graph: {Paper} vertices (Title=Graphs, 1Author=Alice), (Title=Join, 1Author=Alice), (Title=OWL, 1Author=Bob), (Title=Projection, 1Author=Carl) and (Title=µ-calc, 1Author=Dan), linked by {Cites} edges.]
Graph Join: Example 1 (Result)
[Figure: the result is built step by step on top of the two operand graphs.
1. Vertex Join: each {User} vertex is merged with the {Paper} vertices having that user as first author, giving the {User,Paper} vertices (Title=Graphs, 1Author=Alice, Name=Alice), (Title=Join, 1Author=Alice, Name=Alice), (Title=OWL, 1Author=Bob, Name=Bob), (Title=Projection, 1Author=Carl, Name=Carl) and (Title=µ-calc, 1Author=Dan, Name=Dan).
2. Combining Edges: the joined vertices are linked by {Follows,Cites} edges.]
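The two steps of Example 1 can be sketched in Python as follows. This is only an illustration, not the authors' implementation: the dictionaries and the helper `theta` are hypothetical names, and the edge endpoints are taken from the relational tables shown in the later slides.

```python
# Running example; ids and edge endpoints follow the relational tables in the slides.
users = {6: {"Name": "Alice"}, 7: {"Name": "Bob"}, 8: {"Name": "Carl"}, 9: {"Name": "Dan"}}
follows = {(6, 7), (6, 8), (7, 9), (9, 8)}

papers = {
    1: {"Title": "Graphs", "1Author": "Alice"},
    2: {"Title": "Join", "1Author": "Alice"},
    3: {"Title": "OWL", "1Author": "Bob"},
    4: {"Title": "Projection", "1Author": "Carl"},
    5: {"Title": "µ-calc", "1Author": "Dan"},
}
cites = {(1, 3), (2, 4), (3, 4), (4, 5)}


def theta(user, paper):
    """Hypothetical join condition: the user is the first author of the paper."""
    return user["Name"] == paper["1Author"]


# 1. Vertex join: keep every (user id, paper id) pair satisfying theta.
joined_vertices = [(u, p) for u in users for p in papers if theta(users[u], papers[p])]

# 2. Combining edges: keep an edge only if both operands link the two pairs.
joined_edges = [
    ((u1, p1), (u2, p2))
    for (u1, p1) in joined_vertices
    for (u2, p2) in joined_vertices
    if (u1, u2) in follows and (p1, p2) in cites
]

for (u1, p1), (u2, p2) in joined_edges:
    print(papers[p1]["Title"], "->", papers[p2]["Title"])
```

The nested loops are quadratic in the number of joined vertices; they are meant to mirror the definition, not to be efficient.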
Graph Join: Example 2 (a)
The previous example showed only one possible way to combine the operands' edges, but we can also return edges belonging to either operand, as in the following query: for each paper, reveal both the direct and the indirect dependencies (either there is a direct paper citation, or one of the authors follows the other one on ResearchGate). Since this is a disjunction (an edge has to appear in at least one of the operands), the previous result is a subgraph of the current solution (disjunctive semantics).
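A minimal Python sketch of this disjunctive combination is given below. It assumes the vertex join has already produced the (first author, paper title) pairs of the running example; the variable names are illustrative, and names and titles are used as identifiers only because they happen to be unique here.

```python
from collections import defaultdict

# (user name, paper title) pairs assumed to come from the vertex join.
vertices = [("Alice", "Graphs"), ("Alice", "Join"), ("Bob", "OWL"),
            ("Carl", "Projection"), ("Dan", "µ-calc")]
follows = {("Alice", "Bob"), ("Alice", "Carl"), ("Bob", "Dan"), ("Dan", "Carl")}
cites = {("Graphs", "OWL"), ("Join", "Projection"),
         ("OWL", "Projection"), ("Projection", "µ-calc")}

# Disjunctive semantics: an edge is kept if it appears in at least one operand;
# its label set records which operand(s) contributed it.
labels = defaultdict(set)
for src in vertices:
    for dst in vertices:
        if (src[0], dst[0]) in follows:
            labels[(src, dst)].add("Follows")
        if (src[1], dst[1]) in cites:
            labels[(src, dst)].add("Cites")

disjunctive = dict(labels)

# Edges carrying both labels are those also present under conjunctive semantics.
conjunctive = {edge for edge, ls in disjunctive.items() if ls == {"Follows", "Cites"}}
```

Filtering on the label set shows why the conjunctive result is contained in the disjunctive one: it is the subset of edges contributed by both operands.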
Graph Join: Example 2 (Result)
[Figure: built step by step on top of the two operand graphs. The joined {User,Paper} vertices of Example 1 are now linked by every edge found in at least one operand; the resulting edges are labelled {Follows}, {Cites} or {Follows,Cites}.]
Graph Joins vs. Relational Joins
The vertices undergo a relational (θ-)join. Both the conjunctive and the disjunctive semantics can be expressed as joins, too. ⇒ The graph join is therefore not a single relational join, but a combination of several relational joins.
Graph Joins vs. Relational Joins: Example (1)
Representing a graph as a relational table (similarly to SQLGraph for Gremlin).

Property Graph
[Figure: {User} vertices Name=Alice, Name=Bob, Name=Carl and Name=Dan, linked by four {Follows} edges.]

Relational Representation

V_ResearchGate
id  Name   ℓv
6   Alice  {User}
7   Bob    {User}
8   Carl   {User}
9   Dan    {User}

E_ResearchGate
id  src  dst  ℓe
5   6    7    {Follows}
6   6    8    {Follows}
7   7    9    {Follows}
8   9    8    {Follows}
Graph Joins vs. Relational Joins: Example (2)
Representing a graph as a relational table (similarly to SQLGraph for Gremlin).

Property Graph
[Figure: {Paper} vertices (Title=Graphs, 1Author=Alice), (Title=Join, 1Author=Alice), (Title=OWL, 1Author=Bob), (Title=Projection, 1Author=Carl) and (Title=µ-calc, 1Author=Dan), linked by four {Cites} edges.]

Relational Representation

V_Reference
id  Title       Name   ℓv
1   Graphs      Alice  {Paper}
2   Join        Alice  {Paper}
3   OWL         Bob    {Paper}
4   Projection  Carl   {Paper}
5   µ-calc      Dan    {Paper}

E_Reference
id  src  dst  ℓe
1   1    3    {Cites}
2   2    4    {Cites}
3   3    4    {Cites}
4   4    5    {Cites}
Graph Joins vs. Relational Joins: Example (3)
1. Vertex Join
[Figure: each row of V_ResearchGate (ids 6 to 9) is matched through θ against the rows of V_Reference (ids 1 to 5); the edge tables E_ResearchGate and E_Reference are shown unchanged.]
Graph Joins vs. Relational Joins: Example (4)
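2. Combining Edges (Query Plan)
[Figure: query plan tree that combines, through relational joins (⋈), the vertex join V_ResearchGate ⋈ V_Reference, the edge tables E_ResearchGate and E_Reference, and a second occurrence of V_ResearchGate ⋈ V_Reference.]
Please note that incorporating the vertices with the edges in the join computation could improve the join computation.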
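One way to read this plan is as a chain of ordinary relational joins. The sketch below runs it on the running example with Python's built-in `sqlite3` module; SQLite is used only for illustration and is not the system evaluated in the talk. The `Author` column name, the `VJ` view name and the omission of the label columns are assumptions made here for brevity.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE V_ResearchGate (id INTEGER, Name TEXT);
CREATE TABLE E_ResearchGate (id INTEGER, src INTEGER, dst INTEGER);
CREATE TABLE V_Reference (id INTEGER, Title TEXT, Author TEXT);
CREATE TABLE E_Reference (id INTEGER, src INTEGER, dst INTEGER);
INSERT INTO V_ResearchGate VALUES (6,'Alice'),(7,'Bob'),(8,'Carl'),(9,'Dan');
INSERT INTO E_ResearchGate VALUES (5,6,7),(6,6,8),(7,7,9),(8,9,8);
INSERT INTO V_Reference VALUES (1,'Graphs','Alice'),(2,'Join','Alice'),
    (3,'OWL','Bob'),(4,'Projection','Carl'),(5,'µ-calc','Dan');
INSERT INTO E_Reference VALUES (1,1,3),(2,2,4),(3,3,4),(4,4,5);
""")

rows = con.execute("""
WITH VJ AS (  -- vertex theta-join: the user is the first author of the paper
    SELECT u.id AS uid, p.id AS pid, p.Title AS Title
    FROM V_ResearchGate u JOIN V_Reference p ON u.Name = p.Author
)
SELECT s.Title, d.Title
FROM VJ s                                   -- joined source vertex
JOIN E_ResearchGate f ON f.src = s.uid      -- Follows edge leaving it
JOIN E_Reference    c ON c.src = s.pid      -- Cites edge leaving it
JOIN VJ d ON d.uid = f.dst AND d.pid = c.dst  -- both edges reach the same joined vertex
ORDER BY s.pid
""").fetchall()

print(rows)
```

The query joins the vertex-join result twice, once for the source and once for the destination of each combined edge, which corresponds to the conjunctive semantics.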
GCEA Algorithm
Logical Design (1)
The algorithm relies on a specific graph data model, under the following assumptions:
- The graph join result is a non-indexed “view” (as in SQL's SELECT).
- The information in the resulting graph (attributes, values, labels) can be completely reconstructed from the essential output information (ids).
Logical Design (2)
The following choices make it easy to reconstruct the information of the graph join result:
- A graph database is a collection of graphs, just as a relational database is a collection of tables.
- Such a graph database is represented as a graph with multiple, distinct connected components.
- Two labelling functions (ℓv, ℓe) and a unique edge-to-vertices association (λ) are defined for the whole graph defining the database.
⇒ Vertices and edges are indexed and stored in multisets.
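As a rough illustration of these design choices, the Python sketch below keeps the labelling functions and λ global to the database and rebuilds a join-result vertex from a pair of ids. The class `GraphDatabase`, its fields and the method `joined_vertex` are hypothetical names, and plain dictionaries and sets stand in for the indexed multisets mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class GraphDatabase:
    """One big graph; each stored graph is a pair (vertex ids, edge ids)."""
    label_v: dict = field(default_factory=dict)  # vertex id -> set of labels
    label_e: dict = field(default_factory=dict)  # edge id -> set of labels
    lam: dict = field(default_factory=dict)      # edge id -> (src id, dst id)
    attrs: dict = field(default_factory=dict)    # vertex id -> attribute dict
    graphs: dict = field(default_factory=dict)   # graph name -> (vertex ids, edge ids)

    def joined_vertex(self, left_id, right_id):
        """Rebuild a join-result vertex from the ids of the two vertices it merges."""
        labels = self.label_v[left_id] | self.label_v[right_id]
        values = {**self.attrs[left_id], **self.attrs[right_id]}
        return labels, values


db = GraphDatabase()
db.label_v.update({6: {"User"}, 1: {"Paper"}})
db.attrs.update({6: {"Name": "Alice"}, 1: {"Title": "Graphs", "1Author": "Alice"}})
db.graphs["ResearchGate"] = ({6}, set())
db.graphs["Reference"] = ({1}, set())

# The join result only needs to store the id pair (6, 1);
# labels and attribute values are derived on demand.
labels, values = db.joined_vertex(6, 1)
```

Storing only id pairs keeps the join output small, at the cost of a lookup whenever labels or values are actually needed.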
Logical Design: Definitions (1)
Graph θ-Join: Given two graphs $G_a = (V, E, A_v, A_e)$ and $G_b = (V', E', A'_v, A'_e)$, a graph θ-join is defined as follows:
$$G_a \bowtie_\theta^{es} G_b = (V \bowtie_\theta V',\; E_{es},\; A_v \cup A'_v,\; A_e \cup A'_e)$$
where θ is a binary predicate over the vertices, $\bowtie_\theta$ is the θ-join among the vertices, and $E_{es}$ is a subset of all the possible edges linking the vertices in $V \bowtie_\theta V'$, expressed with the es semantics.
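As a worked instance, the definition can be applied to the running example, taking $G_a$ as the ResearchGate graph and $G_b$ as the citation graph. Vertices are abbreviated here by user name and paper title, the edge endpoints are read from the relational tables in the slides, and the arrow notation for edges is an informal shorthand introduced only for this illustration.

```latex
\begin{align*}
\theta(u, p) \;&\equiv\; u.\mathit{Name} = p.\mathit{1Author} \\
V \bowtie_\theta V' \;&=\; \{(\text{Alice}, \text{Graphs}),\ (\text{Alice}, \text{Join}),\ (\text{Bob}, \text{OWL}),\\
 &\qquad (\text{Carl}, \text{Projection}),\ (\text{Dan}, \mu\text{-calc})\} \\
E_{es} \;&=\; \{(\text{Alice}, \text{Graphs}) \to (\text{Bob}, \text{OWL}),\
 (\text{Alice}, \text{Join}) \to (\text{Carl}, \text{Projection})\}
 \quad \text{for } es = \text{conjunctive}
\end{align*}
```

Each edge in $E_{es}$ exists because the corresponding {Follows} edge is in $E$ and the corresponding {Cites} edge is in $E'$.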
Logical Design: Definitions (2)
“es” semantics: both the conjunctive and the disjunctive semantics can be represented as relational joins. In particular:
Conjunctive: $E \bowtie_{\Theta_\wedge} E'$
Disjunctive: E …
… >4H >4H
11.29 22.82 22.92 183.90 7 150.74 99 683.91
0.53 0.93 4.35 40.42 411.78 3 966.72
Experimental Evaluation: Join Execution Time (2c)

Join Time (Java) (ms)
Operands Size L=R (|V|)   Neo4J        GCEA (Java)
10                        211.45       24.97
10^2                      222.87       32.70
10^3                      448.97       117.58
10^4                      3 149.90     1 150.37
10^5                      241 026.79   17 178.49
10^6                      >4H          178 066.80
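To make the gap between the two systems easier to read, the ratio between the reported times can be computed directly. The short Python sketch below does so, assuming that the first row of measurements refers to Neo4J and the second to GCEA (Java); the largest operand size is left out because Neo4J did not finish within the 4-hour limit.

```python
# Join times in milliseconds, copied from the table (operand sizes 10 to 10^5).
sizes = [10, 10**2, 10**3, 10**4, 10**5]
neo4j_ms = [211.45, 222.87, 448.97, 3149.90, 241026.79]
gcea_ms = [24.97, 32.70, 117.58, 1150.37, 17178.49]

# Ratio Neo4J / GCEA for each operand size.
speedup = [n / g for n, g in zip(neo4j_ms, gcea_ms)]

for size, ratio in zip(sizes, speedup):
    print(f"|V| = {size:>6}: Neo4J / GCEA = {ratio:.1f}x")
```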
Summary
- We introduce, for the first time, a graph join operator for combining two graphs.
- The inefficiency of graph join evaluation in (graph) query languages justifies its definition.
- Both physical and logical layouts for this model are proposed.
Outlook
- Partition sorted hash join allows the implementation of ≤ predicates.
- Graph join properties have to be studied alongside other graph “algebra” operators.
Graph Join Web App
A demo showing the result of a Graph Join for both the conjunctive and disjunctive semantics is available at this website: http://smartdata.cs.unibo.it/graph-join_webapp