T he U niv ersity o f Texas at A ustin. IS. EC. 2. 0. 1. 0. 2. RPC ? Web. Service. SQL ..... batch ( Service directory : service ) { for ( File f .... Item a = aws.
ISEC 2010 The University of Texas at Austin
Unifying Remote Data, Procedures and Services William R. Cook University of Texas at Austin with Eli Tilevich, Yang Jiao Virginia Tech Ali Ibrahim, Ben Wiedermann UT Austin
The University of Texas at Austin
ISEC 2010
C P R SQL
? Web Service 2
ISEC 2010 The University of Texas at Austin
Efficient, General Domain-Specific Language
C P R Ease of programming
SQL
?
Web Service Orientation: Service Stateless Cross-platform
3
ISEC 2010 The University of Texas at Austin
Standard Practice... 1. Design a programming language 2. Implement distribution as a library RPC library, stub generator SQL library Web Service library, wrapper generator
Deep assumption: libraries are enough! 4
The University of Texas at Austin
ISEC 2010
A B
B'
A'
design is chaotic 5
The University of Texas at Austin
ISEC 2010
Remote Procedure proc(data) Call
6
s
20 yea r
The University of Texas at Austin
ISEC 2010
proc(data)
A B R O C M O DC I RM
Distributed Objects 7
ISEC 2010
Benefits Use existing languages Elegant OO model
The University of Texas at Austin
Problems Latency (many round trips) Stateful communication Platform-specific
Solutions? Remote Façade Data Transfer Object Web services? (more on that later) 8
ISEC 2010 The University of Texas at Austin
print( r.Name ); print( r.Size );
Another Start...
?
9
ISEC 2010
Another start... Motivating example:
The University of Texas at Austin
print( r.Name ); print( r.Size );
10
ISEC 2010
Another start... Starting point
The University of Texas at Austin
print( r.Name ); print( r.Size );
Notes: print is local r is remote
11
ISEC 2010
Language Design Problem Starting point
The University of Texas at Austin
print( r.Name ); print( r.Size );
Notes: print is local r is remote
Goals Fast: one round trip Sateless communication Platform independent Clean programming model (service oriented + objects)
12
ISEC 2010 The University of Texas at Austin
A Novel Solution: Batches batch ( Item r : service ) { print( r.Name ); print( r.Size ); }
13
ISEC 2010 The University of Texas at Austin
A Novel Solution: Batches batch ( Item r : service ) { print( r.Name ); print( r.Size ); }
Execution model: Batch Execution Pattern 1. Client sends script to the server (Creates Remote Façade on the fly)
2. Server executes two calls 3. Server returns results in bulk (name, size) (Creates Data Transfer Objects on the fly)
4. Client runs the local code (print statements) 14
ISEC 2010 The University of Texas at Austin
Compiled Client Code (generated) // create remote script script = ; // execute on the server Forest x = service.execute( script ); // Client uses the results print( x.get(“A”) ); print( x.getInt(“B”) );
15
ISEC 2010 The University of Texas at Austin
A larger example int limit = ...; Service serviceConnection =...; batch ( Mailer mail : serviceConnection ) { for ( Message msg : mail.Messages ) if ( msg.Size > limit ) { print( msg.Subject & “ Deleted” ); msg.delete(); } else print( msg.Subject & “:” & msg.Date ); } 16
ISEC 2010 The University of Texas at Austin
A larger example int limit = ...; Service serviceConnection =...; batch ( Mailer mail : serviceConnection ) { for ( Message msg : mail.Messages ) if ( msg.Size > limit ) { Local to remote print( msg.Subject & “ Deleted” ); msg.delete(); } Method call else print( msg.Subject & “:” & msg.Date ); } Remote to local 17
ISEC 2010 The University of Texas at Austin
Remote part as Batch Script script = inX ) ) { msg.delete(); } else { outC( msg.Date ); } ]>
18
ISEC 2010 The University of Texas at Austin
Remote part as Batch Script Root object of service
script = inX ) ) { msg.delete(); } else { Output named “C” outC( msg.Date ); } Not Java: its “batch script” ]>
19
ISEC 2010 The University of Texas at Austin
Forest: Trees of Basic Values Objects “Product Update” 10/22/09 100k ... “Status Report” 12/2/09 8k ... “Happy Valentine” 2/10/10 112k ... “Happy Hour” 2/26/10 3k ... “Promotion” 3/1/10 1k ...
Input Forest
A 10k
Output Forest msg A
B
“Product Update” “Status Report” “Happy Valentine” “Happy Hour” “Promotion”
true false true false false
C 12/2/09 2/26/10 3/1/10 20
ISEC 2010 The University of Texas at Austin
Compiled code (auto generated) Service serviceConnection =...; in = new Forest(); in.put(“X”, limit); Forest result = serviceConnection.execute(script, in); for ( r : result.getIteration(“msg”) ) if ( r.getBoolean(“B”) ) print( r.get(“A”) & “ Deleted” ); else print( r.get(“A”) & “:” & r.get(“C”) );
21
ISEC 2010 The University of Texas at Austin
Compiled code (auto generated) Service serviceConnection =...; in = new Forest(); Send in.put(“X”, limit); inputs Forest result = serviceConnection.execute(script, in); for ( r : result.getIteration(“msg”) ) if ( r.getBoolean(“A”) ) print( r.get(“B”) & “ Deleted” ); else print( r.get(“C”) & “:” & r.get(“D”) );
Use results 22
ISEC 2010 The University of Texas at Austin
Forest Structure == Control flow for (x : r.Items) { print( x.Name ); for (y : x.Parts) print( y.Name ); } items Name “Engine” “Hood” “Wheel”
Parts
Name “Cylinder” Name Name “Tire” “Rim” 23
ISEC 2010 The University of Texas at Austin
Exceptions Server Exceptions Terminate the batch Return exception in forest Exception is raised in client at same point as on server
Client Exceptions Be careful! Batch has already been fully processed on server Client may terminate without handling all results locally
24
ISEC 2010
Batches Client
The University of Texas at Austin
Batch statement: compiles to Local/Remote/Local Works in any language (e.g. Java, Python, JavaScript) Platform independent
Server Small engine to execute scripts Call only public methods/fields (safe as RPC) Stateless, No remote pointers (aka proxies)
Communication Forests (trees) of primitive values (no serialization) Efficient! 25
ISEC 2010 The University of Texas at Austin
Batch Script Language e ::= x | c variables, constants | if e then e else e conditionals | for x : e do e loops | let x = e in e binding | x = e | e.x = e assignment | e.x fields | e.m(e, ..., e) method call | e…e primitive operators | inX | outX e parameters and results | fun(x) e functions = + * / % ; < > & | not Agree on script format, not on object representation 26
s
30 yea r
The University of Texas at Austin
ISEC 2010
SQL
C B OD EJB / O J D NQ LI
Objects... Persisted 27
ISEC 2010 The University of Texas at Austin
List name/size of large files JDBC // create a remote script/query String q = “select name, size from files where size > 90”; // execute on server Statement st = conn.createStatement(); ResultSet rs = st.executeQuery(q); // use the results while ( rs.next() ) { print( rs.getString(“name”) ); print( rs.getInteger(“size”) ); } 28
ISEC 2010 The University of Texas at Austin
List name/size of large files JDBC
Uses the batch execution pattern!
// create a remote script/query String q = “select name, size from files where size > 90”; // execute on server Statement st = conn.createStatement(); ResultSet rs = st.executeQuery(q); // use the results while ( rs.next() ) { print( rs.getString(“name”) ); print( rs.getInteger(“size”) ); }
29
ISEC 2010 The University of Texas at Austin
List name/size of large files LINQ // create the remote script/query var results = from f in files where size > 90 select { f.name, f.size }; // execute and use the results for (var rs in results ) { print( rs.name ); print( rs.size ); }
Programmer creates the forest
30
ISEC 2010 The University of Texas at Austin
List name/size of large files Local objects for ( File f : directory.Files ) { if ( f.Size > 90 ) { print( f.Name ); print( f.Size ); } }
31
ISEC 2010 The University of Texas at Austin
List name/size of large files Batches batch ( Service directory : service ) { for ( File f : directory.Files ) { if ( f.Size > 90 ) { print( f.Name ); print( f.Size ); } } }
That's it! SQL generation is automatic Why are queries “more high level”? 32
ISEC 2010
Batches for SQL Batch compiler creates SQL automatically
The University of Texas at Austin
Efficient handling of nested of loops Always a constant number of queries for a batch
Supports all aspect of SQL queries, updates, sorting, grouping, aggregations SQL is “assembly language of data”
Summary Clean object-oriented programming model Efficient SQL execution
33
ISEC 2010 The University of Texas at Austin
“Whatever the database programming model, it must allow complex, dataintensive operations to be picked out of programs for execution by the storage manager...” David Maier, 1987
34
ISEC 2010 The University of Texas at Austin
RD BM SQL S
one two call calls
C P R BA R O C MI R
Batch (SQL)
Where do Services fit?
35
10
s
r yea
The University of Texas at Austin
ISEC 2010
XML Web Serv ices
Web Services...
HTML 36
ISEC 2010 The University of Texas at Austin
RD BM SQL S
one two call calls
Batch (SQL)
C P R BA R O C -RPC L M X MI R
HTML XML
37
ISEC 2010 The University of Texas at Austin
RD BM SQL S
one two call calls
Batch (SQL)
C P R BA R O C -RPC L M X MI R
Web Service HTML XML
38
ISEC 2010 The University of Texas at Austin
Amazon Web Service XYZ 1 2 ASIN SalesRank Images
ISEC 2010 The University of Texas at Austin
Amazon Web Service XYZ Custom-defined language for each service operation 1 2 ASIN SalesRank Images
ISEC 2010 The University of Texas at Austin
Amazon Web Service XYZ for (item : Items) { ... 1 2 ASIN SalesRank Images outA( item.SalesRank )
outB( item.Images )
ISEC 2010 The University of Texas at Austin
Standard Web Services // create request ItemLookupRequest request = new ItemLookupRequest() ; request.setIdType("ASIN"); Method names request.getItemId().add(1); in strings request.getItemId().add(2); request.getResponseGroup().add("SalesRank") ; request.getResponseGroup().add("Images") ; // execute request amazonService.itemLookup(null, awsAccessKey, null, associateTag, null, null, request, null, operationRequest, items) ; Batch execution // use results pattern (again!) for (item : item.values) display( item.SalesRank, item.SmallImage );
ISEC 2010 The University of Texas at Austin
Batch Version of Web Service // calls specified in document batch (Amazon aws : awsConnection) { aws.login("XYZ"); Item a = aws.getItem("1"); display( a.SalesRank, a.SmallImage ); Item b = aws.getItem("2"); display( b.SalesRank, b.SmallImage ); } Fine-grained operation model Coarse-grained execution model
ISEC 2010
one two call calls
Batch (SQL)
C P R BA R O C -RPC L M X MI R
=
The University of Texas at Austin
RD BM SQL S
Web Service HTML XML
44
ISEC 2010
Web Service calls are Batches “Document” = collection of operations
The University of Texas at Austin
Custom-defined language for each service operation Custom-written interpreter on server
Batches factor out all the boilerplate Multiple calls, binding, conditionals, loops, primitives...
45
ISEC 2010
What about Object Serialization? Batch only transfers primitive values, not objects
The University of Texas at Austin
But they work with any object, not just remotable ones
Send a local set to the server? java.util.Set local = … ; batch ( mail : server ) { service.Set recipients = local; // compiler error mail.sendMessage( recipients, subject, body); }
46
ISEC 2010 The University of Texas at Austin
Serialization by Public Interfaces java.util.Set local = … ; batch ( mail : server ) { service.Set recipients = mail.makeSet(); for (String addr : local ) recipients.add( addr ); mail.sendMessage( recipients, subject, body); }
Sends list of addresses with the batch Constructors set on server and populates it Works between different languages
47
ISEC 2010 The University of Texas at Austin
Interprocedural Batches Reusable serialization function @Batch service.Set send(Mail server, local.Set local) { service.Set remote = server.makeSet(); for (String addr : local ) remote.add( addr ); return remote; }
Main program batch ( mail : server ) { remote.Set recipients = send( localNames ); 48
ISEC 2010
Batch = One Round Trip Clean, simple performance model
The University of Texas at Austin
Some batches would require more round trips batch (..) { if (AskUser(“Delete ” + msg.Subject + “?”) msg.delete(); }
Pattern of execution OK: Local → Remote → Local Error: Remote → Local → Remote
Can't just mark everything as a batch! 49
ISEC 2010
Transactions and Partial Failure Batches are not necessarily transactional
The University of Texas at Austin
But they do facilitate transactions Server can execute transactionally
Batches reduce the chances for partial failure Fewer round trips Server operations are sent in groups
50
ISEC 2010 The University of Texas at Austin
Order of Execution Preserved All local and remote code runs in correct order batch ( remote : service ) { print( remote.updateA( local.getA() )); // getA, print print( remote.updateB( local.getB() )); // getB, print } Partitions to: input.put(“X”, local.getA() ); // getA input.put(“Y”, local.getB() ); // getB .... execute updates on server print( result.get(“A) ); // print print( result.get(“B”) ); // print Compiler Error!
51
ISEC 2010
Available (Almost) Now... Jaba: Batch Java
The University of Texas at Austin
100% compatible with Java 1.5 Transport: XML, JSON, easy to add more
Batch statement as “for” for (RootInterface r : serviceConenction) { ... }
Full SQL generation and ORM Select/Insert/Delete/Update, aggregate, group, sorting
Future work Security models, JavaScript/Python clients
Edit and debug in Eclipse or other tools Available now!
52
ISEC 2010
Opportunities Add batch statement to your favorite language Easy with reusable partitioning library
The University of Texas at Austin
C#, Python, JavaScript, COBOL, Ruby, etc...
What about multiple servers in batch? Client → Server* Client ↔ Server
Client → Server → Server
Monadic interpretation? MPI, GPU programming Asynchrony and Streaming 53
ISEC 2010
Microsoft LINQ
Related work
Batches are different and more general than LINQ
Mobile code / Remote evaluation
The University of Texas at Austin
Does not manage returning values to client
Implicit batching Performance model is not transparent
Asynchronous remote invocations Asynchrony is orthogonal to batching
Automatic program partitioning binding time analysis, program slicing
Deforestation Introduce inverse: reforestation for bulk data transfer
54
ISEC 2010 The University of Texas at Austin
Universal Domain-Specific Language
C P R Ease of programming
SQL
batch
Web Service Orientation: Service Stateless Cross-platform
55
ISEC 2010
Conclusion Batch Statement General mechanism for partitioned computation
The University of Texas at Austin
Unifies Distributed objects (RPC) Relational (SQL) database access Service invocation (Web services)
Benefits: Efficient distributed execution Clean programming model No explicit queries, stateless, no proxies Language/transport neutral
Requires change to language!
56