May 4, 2012 ... Beyond the WAM: Topics in Contemporary Logic. Programming ... 3 The SLG-
WAM: Data Structures and Stack Changes;. 4 The SLG-WAM: ...
Beyond the WAM: Topics in Contemporary Logic Programming Implementation Terrance Swift
May 4, 2012
Terrance Swift
Topics in Logic Programming Implementation
Beyond the WAM: Topics in Contemporary Logic Programming Implementation
Topic 1: Overview and Miscellaneous Topics
Terrance Swift
Topics in Logic Programming Implementation
After the WAM
There are a lot of Prologs around: Sicstus, Quintus, YAP, Chao, SWI, GNU Prolog, B-Prolog, Mercury, HAL, ALS, XSB and many others
Nearly every Prolog implements the WAM The WAM has been known for over 20 years ... so why are there a lot of Prologs?
Terrance Swift
Topics in Logic Programming Implementation
After the WAM There is a lot more to Prolog than just the WAM Builtins for I/O, term manipulation, DCGs etc. Module systems Debuggers Dynamic code Libraries, interfaces, and packages: e.g. Interprolog [Cal04], lpdoc (Ciao [HtCG]), pl-doc (SWI) jasper, database interfaces, and many libraries, interfaces, etc.
APIs for embeddability Memory management (garbage collectors) Arbitrary arithmetic – rational numbers Unicode and other extensions to readers Still, the main ISO standard has been around for 10 years [ISO95] Terrance Swift
Topics in Logic Programming Implementation
After the WAM Much of what keeps many Prologs around is research: Speed optimizations faster WAM execution (e.g. YAP [SC99, dSC06]) native-code compilation [Tay01, van92] (e.g. GNU Prolog [DP01b]) global analysis (e.g Ciao [MCH04], XSB [DRSS96]), indexing (e.g. YAP [], XSB [RRS+ 99]) Annotations and analysis for correctness Constraint libraries: CLP(R), CLP(Q), CLP(FD), CLP(BN) [CC03], Various libraries based on Constraint Handling Rules. Parallelism, multi-threading, and “cluster” Prologs Changes to evaluation methods (e.g. tabling) Syntactic extensions: e.g. HiLog [CKW93, SW95] Semantic extensions: well-founded negation, Generalized Annotated Programs, Object-logics such as FLORA-2, preference logics, etc. Integration with other LP paradigms such as ASP (e.g. Terrance Swift Topics XASP [CSW, ARD08]), abduction, etc.in Logic Programming Implementation
Focus of This Topic We’ll discuss implementation issues arising from Making the WAM into a modern Prolog system Extending Prolog into new directions Focus is on the XSB system (xsb.sourceforge.net) It is a widely used Prolog with approximately 10,000 downloads from site a year since 2000, plus inclusions as a package in other systems – e.g. Debian Linux It has most of the advanced features that other Prologs have It has some advanced features of its own It is partly a UNL product I know it. Terrance Swift
Topics in Logic Programming Implementation
Tentative Overview 1
2
3 4
5
6 7
8 9
Overview: Exception Handling and a Taste of Speeding up the WAM Tabling for Definite Programs: Variance and Subsumption; Scheduling Strategies The SLG-WAM: Data Structures and Stack Changes; The SLG-WAM: Completion, Registers and Instructions; Linear tabling Tabling with Negation: Stratification Theories, SLG Resolution The SLG-WAM: Implementation of Tabled Negation Multi-threaded Prolog: ISO-multi-threading, the MT-SLG-WAM; Concurrent Completion and Shared Completed Tables, YAP Tabling. Dynamic Rules and Facts: Incremental Tabling Constraints and CHR; Tabled constraints; Answer Subsumption Terrance Swift
Topics in Logic Programming Implementation
XSB Background
As a poor measure of system complexity XSB has: ˜ 84,000 lines of C code in engine and support for builtins ˜ 14,000 Lines of Prolog code in compiler ˜ 37,000 Lines of Prolog code for builtins and standard libraries ˜ 70,000 Lines of Prolog and C for packages (including constraint libraries) This is roughly the same as other leading open-source Prologs
Terrance Swift
Topics in Logic Programming Implementation
XSB Background
XSB is currently developed at Universidade Nova de Lisboa and State University of New York at Stony Brook. WAM has Code Space, Heap, Environment Stack, Trail and PDL Prolog systems typically contain other areas: Atom Tables, predicate/structure tables, findall buffers, and much else. XSB (and some other Prologs) splits Environment Stack into Environment Stack and Choice Point Stack The reason for the split is to make it easier to use the choice point stack for scheduling in tabled evaluations
Terrance Swift
Topics in Logic Programming Implementation
XSB Background The code and algorithms discussed in these lectures were developed by (in alphabetical order) Luis de Castro, Baoqiu Cui, Steve Dawson, Ernie Johnson, Juliana Freire, Michael Kifer, Rui F. Marques, C.R. Ramakrishnan, I.V. Ramakrishnan, Prasad Rao, Konstantinos Sagonas, Diptikalyan Saha, Terrance Swift, David S. Warren and others. Terrance Swift
Topics in Logic Programming Implementation
Exceptions How do you handle exceptions in C? 1
Declare a C variable of type jmp buf
2
use setjmp(jmp buf env) to assign a forward continuation (i.e. environment) referred to by env For example, if you wanted to longjump to the WAM instruction dispatch loop: if (setjmp(xsb abort fallback environment)) Restore the default signal handling; set P register to saved value else P = first instruction of boot module goto *instr addr table[*P]; at beginning of instruction dispatch loop.
3
use longjmp(xsb abort fallback environment,val) to pop C stack to proper environment — the instruction goes back in stack to look for an environment with address matching that maintained in the jmp buf env. Terrance Swift
Topics in Logic Programming Implementation
Exceptions ISO-style exceptions are an elegant part of Prolog and are often under-appreciated1 To throw an exception, call throw(Ball): Ball is simply a Prolog term To handle an exception, call a goal through catch(Goal,Ball,Handler) This calls Goal and if Goal throws an exception (via throw(Ball)) that unifies with Ball, call Handler if Goal throws an exception that does not unify with Ball nothing happens, the exception is handled by an ancestor catch
Thus, there are three continuations in Prolog: forward (success), failure, and exception. Note that continuations may be removed via !/0. catch/3 important when a Prolog is part of a system: e.g. called from Interprolog, from within a Ruby, Delphi, or C process. 1
Terrance Swift
Topics in Logic Programming Implementation
Exceptions How can exceptions be implemented? catch/3 is a little like a setjmp(), but relies on unification of Prolog terms, rather than matches of stack addresses. Go up Prolog’s environment stack until you find an environment Env that looks like a catch. Recall that WAM environments contain: A pointer to their parent environment A pointer to the byte-code for their continuation Various permanent variables (+ perhaps another pointer for Tabling)
See if the thrown Ball unifies with something in Env if so reset stacks and call Handler if not, keep going up stack
Terrance Swift
Topics in Logic Programming Implementation
Exceptions
Here’s how we find the environment of a catch catch marker indicates that the current environment is that of catch/3 We aren’t yet addressing the unification of the ball with the second argument of catch/3
Terrance Swift
Topics in Logic Programming Implementation
Exceptions find next catch() temp E = E find an environment that matches catch while (E and cp(E) != catch marker) temp E = parentEnv(E) if ( ! temp E ) xsb exit(”Throw failed because no catcher for throw”) search through choice points until you find the choice point whose cp E value is older than temp E set B to this value search through choice points until you find the trail cell equal to cp TR(B) set TR to this value return(FALSE); Terrance Swift
Topics in Logic Programming Implementation
Exceptions After interning the error ball, throw/1 finds the next catch/3 environment (if any) Stacks are set to this new environment so that when we fail (which acts analogously to longjmp()), we’ll take the failure continuation for catch/3 Recall from a few slides ago that xsb abort fallback environment puts you back to the beginning of the emulator. throw(error term) Intern error term as exception ball(thread,error term) if (find next catch()) xsb exit(”find next catch() failed in xsb throw internal!”) /* Resume main emulator instruction loop */ longjmp(xsb abort fallback environment, fail inst) Terrance Swift
Topics in Logic Programming Implementation
Exceptions
catch(Goal, Catcher, Handler) :set variable catch marker to point to clean up block call(Goal), clean up block. catch( Goal,Catcher,Handler) :xsb thread self(T), ’$$exception ball’(T,Ball), % one fact per thread here so no CP ( Ball = Catcher -> retractall(’exception ball’( )), call(Handler) ; find next catch() ).
A call to catch/3 first makes sure that catch marker points to clean up block (this actually needs to be done at initialization). If find next catch() traverses the environment for a call to catch/3 it will set the environment and choice point stacks, the heap and trail to that environment, and succeeds. If the thrown Ball unifies with Catcher, Handler will be called; othewise find next catch() will be called again. When there are no choice points created by Goal, clean up block removes the choice point set up by catch/3.
Terrance Swift
Topics in Logic Programming Implementation
Catching vs. Cutting
catch/3 finds a previous enviroment and resets stack to that enviroment. After the environment is reset, it calls a handler and succeeds (if the handler succeeds); !/0 finds a previous choice point C , and resets stacks to the enviroment in C by failing. Once the enviroment is reset, a failure continuation is taken.
Terrance Swift
Topics in Logic Programming Implementation
Cleaning Up after a Goal How do you guarentee that a given handler is always called after a goal – whether the goal succeeds (with no more choice points), fails, or throws an exception E.g. to close a db cursor, mutex, etc. As described, catch/3 handles exceptions This is usually done via call cleanup(#Goal,#Handler) Failure of Goal: ?- call_cleanup(fail,writeln(failed(Goal))). In this case, Goal has no solutions, and the handler is invoked when the engine backtracks out of Goal. Deterministic success of Goal. Assume that p(1) and p(2) have been asserted. Then
Terrance Swift
Topics in Logic Programming Implementation
Cleaning Up after a Goal
?- call_cleanup((p(X),writeln(got(p(X)))), writeln(handled(p(X)))). got(p(1)) X = 1; got(p(2)) handled(p(2)) X = 2; no
Terrance Swift
Topics in Logic Programming Implementation
Cleaning Up after a Goal
The hard part is when choice points for Goal are removed via a cut. Again, assume p(1),p(2) are in the database. call_cleanup(p(X),writeln(handled_1)),!. handled_1 X = 1 yes If a cut cuts over more than goal to be cleaned, more than one handler will be executed: ?-call_cleanup(p(X),writeln(handled_4_1)), call_cleanup(p(Y),writeln(handled_4_2)), call_cleanup(p(Z),writeln(handled_4_3)), !. handled_4_3 handled_4_2 handled_4_1 X = 1 Y = 1 Z = 1
Terrance Swift
Topics in Logic Programming Implementation
Cleaning Up after a Goal How do you implement this? The following code handles deterministic success + failure call_cleanup(Goal,Cleanup):get_breg(BregBefore), catch(Goal,E,cleanup_catcher(Cleanup,E)), get_breg(BregAfter), (BregBefore == BregAfter -> % if no CPs left for Goal removes call_cleanup’s ’$$clean_up_block’(BregBefore), call(Cleanup) ; true). call_cleanup(_Goal,Cleanup):call(Cleanup), fail. Note that B register hacking done to recognize deterministic success; also a choice point is explicitly removed (as with catch/3) When cutting over a choice point for call cleanup the choice point must be recognized and the handler called. Terrance Swift
Topics in Logic Programming Implementation
Cleaning Up after a Goal How does the cut recognize that a handler must be called? The following pseudo-code is called by putpbreg or putpbreg (XSB terminology) cut_code(cut_breg) /* restore EB reg and HB reg except for check_complete */ cut_restore_trail_condition_registers(cut_breg); while (cp_prevbreg(breg) => cut_breg) /* add interrupt if breg points to 2nd clause of call_cleanup */ CHECK_CALL_CLEANUP( breg); inst_cut_over = *cp_pcreg(breg); /* throw exception if cutting over a table */ CHECK_TABLE_CUT(inst_cut_over) ; breg = cp_prevbreg(breg); unwind_trail(breg,xtemp1,xtemp2); breg = cut_breg; /* Check for call_cleanup or attributed var. interrupts */ if (pending_interrupts) /* Save env. as continuation then call call_cleanup or constraint handler */ alloc_env_and_call_check_ints(reserved_regs,arity);
Issues involving attributed variables and cutting over tables will be discussed later.
Terrance Swift
Topics in Logic Programming Implementation
Speeding up the WAM How do you handle instruction dispatch in the WAM? inst_dispatch: switch(*P) { case getpvar: : goto instruction_dispatch; case getpval: : goto instruction_dispatch; } Tests have indicated that XSB spends 25-30% of its time in instruction dispatch when this model is used. Terrance Swift
Topics in Logic Programming Implementation
Speeding up the WAM To speed up instruction dispatch, you can create macro instructions E.g. define getlist tvar tvar R0 , R1 , R2 to execute the sequence: getlist R0 unitvar R1 unitvar R2 Most Prologs have a few instructions like this. Analysis of instruction sequence frequency indicates that there aren’t that many “common” cases”
Terrance Swift
Topics in Logic Programming Implementation
WAM Review What does p/3 do?
0x300d44
0x300d50
test_heap, 3, 2000 switchonterm, r1, 0x300d44, 0x300d50 try, 3, 0x300d44 trust, 3, 0x300d50 getnil, r1 gettval, r2, r3 proceed getlist_tvar_tvar, r1, r4, r12 getlist, r3 unitval, r4 unitvar, r3 xsb_execute, 0x300c40, p/3
Terrance Swift
Topics in Logic Programming Implementation
Speeding up the WAM
Native code generation is an extreme form of this: compile e.g. append/3 into constituent instructions. Native code generation can give good speedup, but it also expands space needed for a program (10x in Aquarius or Parma?)) Native code generation means different things for different Prologs: various Prologs implement it: Aquarius and Parma (which are no longer maintained); GNU Prolog; YAP and Ciao also have experimented with native code.
Terrance Swift
Topics in Logic Programming Implementation
Speeding up the WAM Jumptable Interpterter (or threaded interpreter) getpvar: : goto *instr_addr_table[*P]; getpval: : goto *instr_addr_table[*P]; To do this, you need to either program in assembler or have a C compiler that supports label variables. In GCC (cf: Section 5.3 of the GCC manual) the table is set up as follows: instr addr table[byte value of getpvar] = && getpvar; Not available for non-GCC compilers (e.g. Microsoft’s). Terrance Swift
Topics in Logic Programming Implementation
Speeding up the WAM Specialized WAM instructions. Take into account mode, type information Based on global analysis (e.g. Aquarius [van92]) Based on mode declaration (e.g. Mercury) Consider how to specialize getpval Recall getpval is used when a register value is unified with a permanent variable. getpval Vi , Rj register unsigned long * op1 = *(E-i) register unsigned long * op2 = *regs[j] P + sizeof(word) unify(op1,op2) goto *instr addr table[*P] Early increment of P may help instruction speed in pipelined architectures Terrance Swift
Topics in Logic Programming Implementation
Speeding up the WAM If you know the type of e.g. Vi you can replace getpval with an instruction like getpcon Vi , Rj register unsigned long * op1 = *(E-i) register unsigned long * op2 = *regs[j] P = P + sizeof(word) unify constant(op1,op2) goto *instr addr table[*P] This is essentially getcon but unifying a register with a number obtained from a permanent variable rather than code. Specialized unification has only one reference, and can avoid the switch() statement that is usual in full unification (e.g. unify()). Terrance Swift
Topics in Logic Programming Implementation
Speeding up the WAM: Mode Information
unicon Vi register unsigned long * op1 = *(P+sizeof(Cell)) P = P+2*sizeof(word) if (flag == WRITE) H++ = op1 else op1 = *(S++) unify constant goto *instr addr table[*P]
If Xi is known to be free (and not aliased) this can be specialized to a set con If Xi is known not to be free this can be specialized to a non-WAM instruction that we might call bind con bind con Vi register unsigned long * op1 = *(P+sizeof(Cell)) P = P+2*sizeof(word) op1 = *(S++) unify constant goto *instr addr table[*P]
Terrance Swift
Topics in Logic Programming Implementation
Dereference Chains Dereferencing can be expensive, and unification requires one (e.g. getcon), or two dereferences (e.g. getpval). Dereference chains can be shortened by a simple hack: deref(op) while (is_variable(op)) { if (op == *(op)) break; op = (CPtr) *(op); } This resets the chain as we go down it, the next time it will be shorter. Global and local analysis (based on declarations) can be used to avoid unnecessary dereferencing
Terrance Swift
Topics in Logic Programming Implementation
Calling and Executing Optimization may work agains system properties. Consider XSB’s call call PredRef P = P+2*sizeof(word) CP = P if no interrupts P = get entryPoint(PredRef) else check for attributed variable interrupts check for keyboard interrupts check for profiling interrupts check for signal or spy interrupts check for thread cancellation goto *instr addr table[*P]
This instruction is deoptimized Indirect reference to entry point via PredRef to allow dynamic loading of undefined predicates Check of interrupt vector
System properties often weigh against optimization Terrance Swift
Topics in Logic Programming Implementation
Benchmarking Prologs On my Mac I tested the following open-source Prologs YAP version 5.1.3 Ciao version 1.13.0 XSB version 3.2 GNU version 1.3.1 SWI version 5.4.7 (not newest) In addition, on a Linux server I tested Quintus and Sicstus against XSB (both are recent versions).
Terrance Swift
Topics in Logic Programming Implementation
Benchmarking Prologs: Benchmarks I tested 5 benchmarks of D.H.D Warren deriv which constructs the derivative of a polynomial, and benchmarks trailing and untrailing nrev which performs naive reverse and benchmarks tail recursive list traversal that requires creation of environment frames, but not choice points. serialise which assigns an ordinal number to integers in a list, and benchmarks backtracking and creation of large structures on the heap. qsort which quick-sorts a list of numbers and benchmarks backtracking and environment frame allocation query which performs queries against a database of countries, and benchmarks shallow backtracking and indexing. In addition, I tested 3 other benchmarks ackerman which computes the Ackerman function and tests determinacy detection in Prolog conditionals call which tests call/1 assert which tests repeated assertions and deletions of facts Terrance Swift
Topics in Logic Programming Implementation
Benchmarking Prologs: Caveats Relative times are given since the benchmarks were performed on two different machines. Each system was made and tested without any particular tuning. Compilation of Ciao was not fully optimized (OS X is not a preferred platform for Ciao) so Ciao might be faster on, say, Linux. The SWI version was not the newest. No native code generation was performed for GNU prolog, no analysis for Ciao. Thus these times reflect “default” behavior.
Terrance Swift
Topics in Logic Programming Implementation
Benchmarking Prologs YAP GNU-native Ciao XSB GNU SWI Quintus Sicstus
deriv 0.38 0.6 0.63 1.0 1.38 4.85 0.37 0.32
nrev 0.38 0.86 1.02 1.0 2.36 11.6 0.47 0.42
qsort 0.36 0,76 0.77 1.0 2.08 7.51 0.63 0.37
serialize 0.41 0.74 0.85 1.0 1.47 6.45 not tested not tested
query 0.68 0.93 1.01 1.0 2.45 10.4 0.68 0.54
ackerman segfault alloc error alloc error 1.0 alloc error alloc error timeout 0.84
Quintus, Sicstus and YAP have the fastest emulators, Followed by Ciao, XSB, GNU, and SWI Attributed variables are supported in YAP, Ciao, XSB, SWI and Sicstus 2 The ordering changes for the builtin tests (call/1 and assert/1). A Prolog can have a slow emulator, but have fast builtins written in C. SWI has one of the fastest compilers of these Prologs GNU Prolog native codes offer less speedup than Acquarius or Parma, but does not explode in terms of space. However, GNU native code does not allow debugging. The YAP and Sicstus emulators are the fastest (for these tests); SWI and Sicstus are probably the most robust. Usage of these Prologs is not correlated to their speed. 2
GNU supports finite-domain constraints not based on attributed variables. Terrance Swift
Topics in Logic Programming Implementation
call 0.17 2.55 68.2 1.0 2.9 3.1 9.6 1.61
How Important is Speed? Speed is obviously important, but In 1980’s C was the language of choice for many projects In early 1990’s C++ was the language of choice for many projects In early 2000’s Java and C# was the language of choice for many projects Now, Python and Ruby are the language of choice for many projects How does speed compare to system aspects (memory management, interrupt handling)? How does speed compare to a memory footprint? Tabled Prolog (XSB, YAP) is usually much faster than rule-based systems such as Jess, Drools, etc [LFWK09] Prologs are often much slower than at least some functional languages (Lisp, ML, Haskell) particularly for deterministic programs like ackerman Terrance Swift
Topics in Logic Programming Implementation
What is Speed? Is naive reverse (or other Warren benchmarks) a good benchmark of speed? If so, than a fast WAM implementation is important (or native code generation). Is 8-Queens a good benchmark? If so than a fast CLP implementation is important. Is LR parsing a good benchmark? If so, then tabling is important. Is indexing important? What about taking advantage of multi-core processors? What is important is the speed of practical applications, which demand a mixture of all these.
Terrance Swift
Topics in Logic Programming Implementation
Beyond the WAM: Topics in Contemporary Logic Programming Implementation
Topic 2: Tabling for Definite Programs
Terrance Swift
Topics in Logic Programming Implementation
Outline
Motivation for tabling: Why study it? Why add it to an engine? Overview of variance vs. subsumption Scheduling Strategies
Terrance Swift
Topics in Logic Programming Implementation
Motivation
Tabling addresses inadequacies of Prolog (SLDNF) Termination for e.g. Datalog programs ancestor(X,Y):- parent(X,Y). ancestor(X,Y):- ancestor(X,Z),parent(Z,Y). Poor complexity, even when it terminates. sg(X,X) sg(X,Y):- par(X,Z),sg(Z,Z1),par(Y,Z1). over a directed acyclic graph.
Terrance Swift
Topics in Logic Programming Implementation
Motivation
While SLD is complete for Horn clauses, it does not terminate for bounded term-depth programs. e.g. finite number of subgoals and answer substitutions, as with Datalog programs. The following program does not have finite term depth (nor a finite model). p(s(X)):- p(X).
p(0).
The following program does not have finite term depth (but does have a finite model). p(X):- p(s(X)).
Terrance Swift
Topics in Logic Programming Implementation
An Example of SLD Evaluation 0. p(a,X) 1. e(a,X) X=a 3.
X=b 4.
5. e(a,Y),p(Y,X) Y=a Y=b p(b,X)
6. p(a,X)
...
...
e(b,X) X=c
e(b,Y1),p(Y1,X) Y1 = c p(c,X) e(c,X)
e(c,Y2),p(Y2,X)
?- p(a,X) p(X,Y) :- e(X,Y) p(X,Y) :- e(X,Z),p(Z,Y) e(a,a). e(a,b). e(b,c).
Terrance Swift
Topics in Logic Programming Implementation
An Example of SLD Evaluation
The SLD tree contains all solutions (completeness) but is infinite (non-termination) If you simply check for loops, you can achieve termination, but you don’t address complexity
Terrance Swift
Topics in Logic Programming Implementation
An Example of SLG (Tabled) Evaluation
0. p(a,Z)
1. p(a,Z) :- p(a,Y),p(Y,Z)
:- table p/2. p(X,Z) :- p(X,Y),p(Y,Z). p(X,Z) :- e(X,Z),q(Z).
2. p(a,Z) :- e(a,Z),q(Z)
e(a,b). e(a,d). e(b,c). 6. p(a,Z) :- p(b,Z)
3. p(a,b) :- q(b)
5. p(a,d) :- q(d)
q(a). q(b). q(c).
Subgoal Answers
4. p(a,b) :-
11. p(a,c) :-
6a. p(b,Z)
p(a,b) p(a,c)
Incomplete
p(b,Z)
p(b,c)
Incomplete
p(c,Z) 7. p(b,Z) :- p(b,Y),p(Y,Z)
State
p(a,Z)
Incomplete
8. p(b,Z) :- e(b,Z),q(Z) 12a. p(c,Z)
12. p(b,Z) :- p(c,Z)
9. p(b,c) :- q(Z) 13. p(c,Z) :- p(c,Y),p(Y,Z)
14. p(c,Z):- e(c,Z),q(Z)
10. p(b,c) :-
Terrance Swift
Topics in Logic Programming Implementation
An Example of SLG (Tabled) Evaluation
This example uses double recursion (2 occurrences of a/2 in the body). Tabling is more efficient with left recursion. Examples in these slides use Prolog’s top to bottom clause selection strategy and left-to-right literal selection strategy. These orderings are among many that are possible.
Terrance Swift
Topics in Logic Programming Implementation
Definite Programs: SLD vs. Tabling Tabling factors out redundant subcomputations to compute the minimal model of a definite program (expressible as the least fixed point of a monotonic operator). Several formulations for tabling: for example OLDT [TS86], SLD-AL [Vie89], SLG [CW96], Magic [BMSU86] Nodes have the form fail or AnswerTemplate :- GoalList (this form becomes slightly more complex for the well-founded semantics) If a tree T has root A :- A, then A is termed the root subgoal of T . In this formulation, answers are (non-failure) leaf nodes whose goal list is empty. Terrance Swift
Topics in Logic Programming Implementation
Tabling Applications: Grammars
Consider the grammar expr --> expr + term expr --> term term --> term * factor term --> factor factor --> ( expr ) factor --> integer(Int)
Terrance Swift
Topics in Logic Programming Implementation
Tabling Applications: Grammars A translation into Prolog-style DCGs. expr --> term, addterm. addterm --> []. addterm --> [+], expr. term --> factor, multfactor. multfactor --> []. multfactor --> [*], term. factor --> [I], {integer(I)}. factor --> [’(’], expr, [’)’].
The programmer has executed left-recursion elimination and left-factoring. Grammar now has right-associative operators rather than the left-associative operators of the original grammar. i.e. x − y − z now is x − (y − z) rather than (x − y ) − z Terrance Swift
Topics in Logic Programming Implementation
Tabling Applications: Grammars The same grammar using tabling. :- table expr/2, term/2. expr --> expr, [+], term. expr --> term. term --> term, [*], factor. term --> factor. factor --> [’(’], expr, [’)’]. factor --> [Int], {integer(Int)}. Syntactic variant of original grammar. Has no associativity problem
Terrance Swift
Topics in Logic Programming Implementation
Tabling Applications: Grammars Applying tabling to a DCG grammar can effectively give Earley Parsing [Ear70] Supplementary Tabling (Supplementary Magic) can convert the grammar to Chomsky Normal Form3 . i.e., convert a rule a(A, D) ← b(A, B), c(B, C ), d(C , D) which is worst-case O(N 4 ) in the number of grammar symbols, to a(A, D) ← b(A, B), a0 (B, D) a0 (B, D) ← c(B, C ), d(C , D)
Earley Parsing of grammars in Chomsky Normal Form takes at most O(N 3 ) for ambiguous grammars; at most O(N 2 ) for unambiguous grammars; and is linear for a large class of grammars. 3
Implementing Earley’s Dotted Rules. Terrance Swift
Topics in Logic Programming Implementation
Tabling Applications: Dynamic Programming
The knap-sack problem: Given n items, each of integer size ki (1 ≤ i ≤ n), and a knap-sack size K. determine whether there is a subset of the items that sums to K. Find such a subset.
Terrance Swift
Topics in Logic Programming Implementation
Tabling Applications: Dynamic Programming A Prolog solution to the knapsack problem. ks(0,0). ks(I,K) :- I>0, I1 is I-1, ks(I1,K). ks(I,K) :- I>0, item_size(I,Ki), K1 is K-Ki, K1 >= 0, I1 is I-1, ks(I1,K1). item_size(1,2). item_size(3,5).
item_size(2,3). item_size(4,6).
Worst-case complexity is 2I .
Terrance Swift
Topics in Logic Programming Implementation
Tabling Applications: Dynamic Programming A tabling solution to the knapsack problem. :- table ks/2. ks(0,0). ks(I,K) :- I>0, I1 is I-1, ks(I1,K). ks(I,K) :- I>0, item_size(I,Ki), K1 is K-Ki, K1 >= 0, I1 is I-1, ks(I1,K1). item_size(1,2). item_size(3,5).
item_size(2,3). item_size(4,6).
Worst-case complexity is I 2 . Terrance Swift
Topics in Logic Programming Implementation
Tabling Applications: Dynamic Programming But how do you find the subset(s)? ksp(0,0,[]). ksp(I,K,P) :- I>0, I1 is I-1, ks(I1,K), ksp(I1,K,P). ksp(I,K,[I|P]) :- I>0, item_size(I,Ki), K1 is K-Ki, K1 >= 0, I1 is I-1, ks(I1,K1), ksp(I1,K1,P). Because of tabling, ks/2 does not repeat computations. cf. [Man89] pg. 110 for an equivalent imperative solution. Terrance Swift
Topics in Logic Programming Implementation
Tabling Applications: Verification :- table trans/3. % Prefix: Act.P Act-> P trans(pref(Act, P), Act, P). % Choice: P = P1 + P2 trans(choice(P, _Q), Act_a, P1) :- trans(P, Act_a, P1). trans(choice(_P, Q), Act_a, Q1) :- trans(Q, Act_a, Q1). % Parallel: P = Q | R trans(par(P, Q), Act_a, par(P1, Q)) :- trans(P, Act_a, P1). trans(par(P, Q), Act_a, par(P, Q1)) :- trans(Q, Act_a, Q1). % Represent Coactions trans(par(P, Q), tau, par(P1, Q1)) :trans(P, Act_a, P1), trans(Q, Act_b, Q1),comp(Act_a, Act_b). trans(par(P, Q), tau, par(P1, Q1)) :trans(P, Act_a, P1), trans(Q, Act_b, Q1),comp(Act_b, Act_a). % Restriction: P\L Act-> P1\L trans(rest(P,L), Act_a, rest(P1,L)) :trans(P, Act_a, P1),legitimate_action(Act_a, L). % Relabelling: P = Q [f] trans(relab(P, Hom_f), Act_b, relab(P1, Hom_f)) :trans(P, Act_a, P1),map(Hom_f, Act_a, Act_b). % Transitive Redefinition trans(P, Act_a, Q) :- def(P, R), trans(R, Act_a, Q).
Terrance Swift
Topics in Logic Programming Implementation
Tabling Applications: Deductive Databases Redundant subcomputations join(X,Y):- supplemental(X,X2),rel_3(X2,Y). supplemental(X,Y):- rel_1(X,X1),rel_2(X1,Y). rel1 (a,b) (a,c) (a,d)
rel2 (b,e) (c,e) (d,f)
rel3 (e,g) (e,h) (f,i)
In SLD resolution, 8 join operations are performed on the above example, while if supplemental is tabled, there will be only 6 join operations. In extreme cases, redundant subcomputations can lead to exponential data complexity for Prolog. (See the knapsack problem below). Tabling has polynomial data complexity for datalog programs with negation. Under a given search strategy tabling is equivalent to magic evaluation under a semi-naive search strategy [FSW97]. Terrance Swift
Topics in Logic Programming Implementation
Implementation of Tabling join(X,Y):- supplemental(X,X2),rel_3(X2,Y). supplemental(X,X2):- rel_1(X,X1),rel_2(X1,X2). Where join/2 and supplemental/2 are tabled, may be more efficient than join(X,Y):rel_1(X,X1),rel_2(X1,X2),rel_3(X2,Y). A simple optimization consists of folding EDB predicates into new tabled predicates. This is called Supplemental Magic Sets [BR91] or Supplemental Tabling. Both rediscover Earley’s observation that the complexity of grammar processing is proportional to the number of non-terminals on the RHS of a production [Ear70]. Terrance Swift
Topics in Logic Programming Implementation
Applications of XSB Tabling (and Flora [YKZ03]): A Partial List Verification: [RRR+ 97, CDD+ 98, DKRR98, LRS98, DRS00, MRRV00, RRS+ 00, KT02, PRR02, BKPR02, PR04, SS05] Ontology Management and the Semantic Web: [PAE98, DYKR00, LPB02, ZFD+ 03, TDK03, CFJ03, SW03, Swi04, ZFC04, BG05, DHM07] and various products of XSB, Inc., Ontology Works, Inc., and OpportuniTV Inc. Program Analysis: [DRW96, Bou97, CDS98, JS98, SR05] Natural Language Analysis and Data Standardization: [LWFS95, RRS97, RL98, CS02, DJP+ 02] and various products of XSB, Inc Diagnosis: [CP04, AAB+ 04, BRO07], and various products of BBN Inc. Medical Informatics: [GST+ 00, Mun04, MGR04] and various products of MDLogix Inc. and Medicine Rules Inc. Robotics and Collaborative Agents [ALPQ00, Terrance LCK01, LGTH05, LTLH05, SP06] SwiftKF04, Topics in Logic Programming Implementation
Implementation of Tabling: Copy Avoidance Structural recursion is acyclic for Prolog-style terms append([],L,L). append([H|T],L,[H|T1]):- append(T,L,T1). which can be seen to have a right recursive form: append([],L,L). append(Term,L,[H|T1]):- cons(Term,H,T),append(T,L,T1). Consider the query append([a,b,X],[c],Y). The following queries are made append([a,b,X],[c],Y). append([b,X],[c],Y). append([X],[c],Y). append([],[c],Y). Still quadratic in the size of the first argument if you must copy from execution area to table. Other possible solutions: Intern Ground Structures in Table Can, in principle, use structure-sharing techniques for non-ground terms. [dlC93, SR93, CR09]. Terrance Swift
Topics in Logic Programming Implementation
Definite Programs: Tabling Operations From [Swi99a], based on [CW96]. 1
New Subgoal: Let Fn be a forest that contains a non-root node N = Ans :- G , Goal List where G is the selected literal S or not S. Assume Fn contain no tree with root subgoal S. Then add the tree S :- |S to Fn .
2
Program Clause Resolution: Let Fn contain a root node N = S :- |S and C be a program clause Head :- Body such that Head unifies with S with mgu θ. Assume that in Fn , N does not have a child Nchild = (S :- |Body )θ. Then add Nchild as a child of N.
3
Positive Return: Let Fn contain a non-root node N whose selected literal S is positive. Let Ans be an answer node for S in Fn and Nchild be the SLG resolvent of N and Ans on S. Assume that in Fn , N does not have a child Nchild . Then add Nchild as a child of N 4.
4
Completion: Given a completely evaluated set S of subgoals, mark the trees for all subgoals in S as completed.
4
For definite programs, simple resolution is all that is needed. Terrance Swift
Topics in Logic Programming Implementation
Definite Programs: Completion What does Completely Evaluated mean? A subgoal is completely evaluated iff it has all of its possible answers. A subgoal S is completely evaluated when all possible operations have been done on its nodes, and the nodes of trees upon which S depends. A ground subgoal is completely evaluated when an answer is derived for it.
Incremental Completion is necessary for efficient evaluation of programs. We also haven’t yet defined | or SLG answer resolution – these will be defined when we introduce non-stratified negation For now, ignore | and think of SLG answer resolution as the resolution of a fact against the leftmost argument of a goal.
Terrance Swift
Topics in Logic Programming Implementation
Definite Programs: Completion p(a,Z)
p(b,Z)
p(c,Z)
Definition (Subgoal Dependency Graph) Let F be a forest in a SLG evaluation. We say that a tabled subgoal S1 directly depends on a tabled subgoal S2 in F iff neither the tree for S1 nor that for S2 is marked as complete and S2 is the selected literal of some node in the tree for S1 . The Subgoal Dependency Graph of F, SDG(F), is a directed graph V,E in which V is the set of root goals for non-completed trees in F and (Si , Sj ) ∈ E iff Si directly depends on Sj . Terrance Swift
Topics in Logic Programming Implementation
Definite Programs: Completion
There is a function from SLG forests to SDGs. Since SDGs are directed graphs, Strongly Connected Components (SCCs) can be defined for them. Incremental Completion can be performed an SCC at a time, or a set of SCCs at a time.
Terrance Swift
Topics in Logic Programming Implementation
Definite Programs: Variance vs. Subsumption p(f(Y),X,1) and p(f(Z),U,1) are variants as one can be made to look like the other by a renaming of the variables. The term t3 : p(f(Y),X,1) subsumes the term t4 : p(f(Z),Z,1). However, they are not variants. Hence t3 properly subsumes t4 . From a more general perspective, Term T1 subsumes Term T2 if T1 ≥ T2 on the lattice of terms. In this lattice most general unifier is the meet, and the minimum anti-unifier is the join. Other lattices could be defined. For instance, the ordering of terms used by ≥@. Depending on the application, a partial order could be used instead of a lattice – for instance the partial order defined over the isa-relation of an ontology. Exercise: define variance and subsumption in terms of most general unifiers. Terrance Swift
Topics in Logic Programming Implementation
Definite Programs: Variance vs. Subsumption Variance vs. Subsumption in the New Subgoal operation Either variance or subsumption can be used when checking whether a subgoal is the root of a tree in the New Subgoal operation. This is termed call variance (or call subsumption). Call variance is useful for goal-directed queries and meta-interpreters. Call subsumption is especially useful when you want to find the inferential fixed point of a program -i.e. a bottom-up fixed-point. For instance, all pairs with property p(X,Y) in a graph. Application: RDF reasoner (Carlos Dam´asio) Application: Program analysis, where you want to find all properties of a program element (e.g. of a clause variable at the time of a call). Application: evaluating an OWL-style ontology where reasoning by cases is not needed. Terrance Swift
Topics in Logic Programming Implementation
Definite Programs: Variance vs. Subsumption Variance vs. Subsumption in the Positive Return operation Either variance or subsumption can be used when checking whether an answer has been resolved against a subgoal in Positive Return. This is termed answer variance (or answer subsumption). Answer subsumption is useful iff an application has a good semantics for a lattice of answers (i.e Generalized Annotated Programs [KS92]); for Probabilistic and Possibilistic Logics [RS10, RS11]; or for preferring some answers over others (e.g. Preference Logic [CS02, JGM98], or Defeasible logics [WGK+ 09]) (negation can also be used for these).
Terrance Swift
Topics in Logic Programming Implementation
Definite Programs: Local Evaluation We have 4 operations: New Answer, Program Clause Resolution, Positive Return, and Completion – how should we schedule them? We’ve already argued that Completion should be performed as early as possible (incrementally), to save space. Program Clause Resolution should be scheduled as in Prolog, as far as possible. New Subgoal should be performed as soon as a goal is encountered, to accord with Prolog’s search strategy This leaves the question of how to schedule Positive Return w.r.t Program Clause Resolution Definition (Locality property) Let F be an SLG forest. Resolution of an answer A against a consuming node N occurs in an independent SCC of F if the root subgoal for N is in an independent SCC in SDG (F). An SLG evaluation has the locality property if any answer resolution operation applied to a state Fn occurs in an independent SCC of Fn . Terrance Swift
Topics in Logic Programming Implementation
Definite Programs: Local Evaluation 1. p(1,Y)