
International Conference on Semantic Computing

A Framework to build an object oriented mathematical tool with computer algebra system (CAS) capability

Dil Muhammad Akbar Hussain, Ole Thomsen Buus, Fsehazion Kiros Nicholas Wichmann, Balatharan Selvarajah, *Zaki Ahmad
Information and Security Analysis Research Center, Department of Computer Science and Media Technology, Aalborg University, Denmark
[email protected]
* School of Electronic, Communication and Electrical Engineering, University of Hertfordshire
Email: [email protected]

Abstract

Computer Algebra System (CAS) applications are mathematical applications developed to solve mathematical problems that are too difficult, or even impossible, to solve by hand. Modern CAS applications are known for their large feature sets, such as graphical representation of results, symbolic manipulation, big-integer calculations, and complex-number arithmetic. The proposed framework for a mathematical tool with CAS capability is motivated by object oriented design, in which each application process is integrated as an independent object module. A further motivation is scalability and interfacing: more functionality can be integrated without changing any existing core module, simply by creating a new object and plugging it in.

1. Introduction

A computer algebra system is a software program that facilitates symbolic mathematics [1][2][3][4][5][6]. The core functionality of a CAS is the manipulation of mathematical expressions in symbolic form. The expressions manipulated by a CAS typically include polynomials in multiple variables, standard trigonometric and exponential functions, various special functions (for example gamma, zeta, Bessel, etc.), and also arbitrary functions like derivatives, integrals, sums, and products of expressions. Symbolic manipulation can be performed in many ways. References [7][8][9][10] provide many details about various aspects of mathematical tools and computer algebra systems.

Symbolic mathematics relates to the use of computers to manipulate mathematical equations and expressions in symbolic form, as opposed to manipulating approximations of the specific numerical quantities represented by those symbols. Such a system might be used for symbolic integration or differentiation, substitution of one expression into another, simplification of an expression, etc. One typical use of symbolic mathematics is in software testing (symbolic execution), to analyze if and when errors in the code may occur. It can be used to predict which code statements are executed for specified inputs and outputs. Obviously, when statements are not purely mathematical, this could result in unpredictable behavior.

Typical symbolic manipulation activities are:
• Substitution of symbolic or numeric values for expressions.
• Change of form of expressions: expanding products and powers, rewriting as partial fractions, rewriting trigonometric functions as exponentials, etc.
• Differentiation and integration.
• Partial and full factorization, infinite series expansion.
• Matrix operations, including product and inverse.
• Solution of linear and some non-linear equations over various domains.

0-7695-2997-6/07 $25.00 © 2007 IEEE DOI 10.1109/ICSC.2007.21

2. Mathematical Tool

The design procedure for our mathematical tool draws not only on the authors' own experience but also on partial feedback from colleagues. The final application is a mathematical tool based on the object oriented design concept, supporting a subset of the capabilities seen in a real CAS application. A traditional professional CAS has a long list of features, and the intention is not to incorporate all of them but rather to develop a more structured application design. However, the implementation is carried out in such a way that it can be extended to integrate all these components. Some key features of a professional CAS application design:

• User defined functions
• Infinite sets
• Stepwise view of set calculations
• Big numbers
• Lazy evaluation

User defined functions and infinite sets are not implemented as in a full featured CAS application. However, we have incorporated support for big numbers. CAS applications like Mathematica can display big numbers in full instead of in the less readable scientific notation, e.g. 10e20. By using a special technique, integer or double data types can be represented as strings. This allows calculation on numbers that are too large to fit in integral types.

Lazy evaluation is a concept used in computer programming that attempts to minimize the work a computer has to perform. The idea is to follow an evaluation strategy in which an expression is only evaluated up to the point where its final value is known. This means that in some cases it is not necessary to evaluate all the parts of an expression. From a performance point of view, lazy evaluation would be useful in relation to set or list calculations. The following expression is interesting when implementing lazy evaluation (∩ is the intersection symbol):

{2,1} ∩ {3,4} ∩ {6,3} ∩ {2,5} ∩ {1,6} ∩ ...

Since the first part of the expression, {2,1} ∩ {3,4}, gives the empty set { }, there is no need to evaluate the rest of the expression. The final result is guaranteed to be the empty set whatever the rest of the expression contains, which means the computer can stop evaluating immediately after the first part is calculated. The concept is also beneficial when evaluating logic operations like AND or OR.

3. Implementation

Figure 1 shows an overview of the general process of interpretation, which has been implemented in C# [11][12][13]. Three classes are used in the illustration: AueCASApplication, Parser, and Kernel. The AueCASApplication class contains the main(…) method, which corresponds to the application entry point. It also contains the main GUI and is thus the interface class that receives the input-string entered by the user in a TextBox GUI component.

Figure 1: Model of the Interpretation Process

The process of interpretation thus begins in AueCASApplication, where the input-string is delivered to the Parser class. There the Calitha C# Gold Parser Engine will use the CGT file created by the parser to create a parse-tree [14]. The created parse-tree is sent to the Kernel instance, where the first pre-order traversal is performed on it. In Figure 1 this is denoted as phase 1. This traversal produces a tree of objects of type Expression, which is used by phase 2, where that object-tree is traversed with the intent of finally producing a result. This result is a string, which is delivered back to the AueCASApplication instance and shown in the result field of the GUI.

The use of the Calitha Engine is implemented in the Parser class. Among many others, two important classes exist in the Calitha Engine, known as NonterminalToken and TerminalToken (both inheriting from Token). When input needs to be parsed, Parser.InvokeParser(…) is called with the input-string as the only argument. This method returns either an instance of NonterminalToken if parsing was successful or the value null if it was not. The NonterminalToken object corresponds to the root-node of the parse-tree and contains a property Tokens (of type Token[]) which is used to access the branches of that node/token. The k child-nodes of the Token object token are thus accessed from left to right using token.Tokens[0], token.Tokens[1], …, token.Tokens[k-1]. This is important since the traversal of the parse-tree uses this mechanism.

The non-terminal nodes of the parse-tree are the result of a reduction done on the basis of a production rule defined in the implemented context-free grammar. The rule by which a NonterminalToken object token was reduced is accessed using the Rule property. The Rule property (of type Rule) contains the Id property (of type int), which states the id of the rule used to reduce to the current node, e.g. token.Rule.Id. Given knowledge of the right-hand side of each production rule, the rule-id contains the information needed to traverse the tree correctly.

The idea of the Kernel class (or just kernel) is to contain the source code, i.e. the mechanism, which does the actual calculation. Having a single kernel instance or several kernel instances responsible for the calculation is not uncommon in CAS applications. It also makes sense from an object-oriented point of view, where every part of an application should be described by a single object or a group of objects. The interpretation process is regarded as one large but single part of our CAS application and is therefore represented by one single object, the kernel.
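As a rough illustration of the parse-tree structure described above, the following Python sketch (not the authors' C# code and not the Calitha API) models tokens with a list of children and a rule id, and visits them in pre-order from left to right. The class names mirror those in the text, but the fields `tokens`, `rule_id`, `symbol_id`, and `text`, the helper `preorder`, and the two example rules are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Token:
    """Base class for parse-tree nodes (hypothetical stand-in)."""


@dataclass
class TerminalToken(Token):
    symbol_id: int  # id of the terminal symbol (assumed field)
    text: str       # matched input text (assumed field)


@dataclass
class NonterminalToken(Token):
    rule_id: int  # plays the role of token.Rule.Id
    tokens: List[Token] = field(default_factory=list)  # children, left to right


def preorder(token, visit):
    """Visit a node first, then its children from left to right."""
    visit(token)
    if isinstance(token, NonterminalToken):
        for child in token.tokens:
            preorder(child, visit)


# Parse-tree for "1+2", assuming rule 0 is `E -> E + E` and rule 1 is `E -> number`.
tree = NonterminalToken(0, [
    NonterminalToken(1, [TerminalToken(1, "1")]),
    TerminalToken(2, "+"),
    NonterminalToken(1, [TerminalToken(1, "2")]),
])

order = []
preorder(tree, lambda t: order.append(
    f"rule {t.rule_id}" if isinstance(t, NonterminalToken) else t.text))
print(order)  # → ['rule 0', 'rule 1', '1', '+', 'rule 1', '2']
```

The rule id alone tells the traversal how many children a node has and which of them carry information, which is why the text stresses knowing the right-hand side of each production rule.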

The kernel is nevertheless quite complex and, as already mentioned, contains two distinct procedures: phase 1, known as object-tree creation, and phase 2, known as object-tree evaluation, which are explained in what follows. As explained earlier, parsing and interpretation are two distinct procedures, both initiated from the class AueCASApplication. A single instance of each of the classes Parser and Kernel is created in the constructor of AueCASApplication. The parsing is done by calling Parser.InvokeParser(…) with the input-string as argument. After the parsing is done, the Invoke(…) method in the kernel is called. This is known as the invocation of the kernel, and it sets in motion the entire procedure necessary to produce a result. The most important code lines in the Invoke(…) method are shown in Listing 1, together with another method known as CreateObject(…). It can be seen that the single argument token is a NonterminalToken object, which is the one received from the Calitha Engine when the parsing is done. Line 5 shows the call to the method CreateObject(…), taking token as argument. This call begins the creation of the object-tree, which is returned as an Expression object. In line 6 the call to the method ComputeResult(…) corresponds to the initiation of phase 2, where the object-tree is traversed. This method is explained in the next section. The method CreateObject(…) checks the received Token object to see whether it is a TerminalToken or not (in which case it is a NonterminalToken), and calls another method based on this distinction (Listing 1, lines 13 and 17). Those two methods will be explained shortly.

Listing 1

Figure 2 shows how the creation of objects is performed by traversing a parse-tree. The structure of the parse-tree can correspond to several forms of input using a binary left-associative operator (e.g. 1+2-4, 1+1+1, etc.), but this is not important since the object creation can be explained without such detail. A more specific explanation of how algebraic calculations are done is given in the next section. Each white node is a non-terminal and each gray node is a terminal. Each node visited in the traversal contains a number stating the order in which that node is visited. Importantly, trimmed reductions are used in this tree and also in the implementation. It will become evident later that the use of trimmed reductions minimizes the copying of object-references when traversing the tree.

Listing 2 shows three additional methods used for creating the objects in the object-tree. When the CreateObject(…) method is called from the Invoke(…) method, it will eventually call the CreateObjectFromNonTerminal(…) method and enter the switch beginning at line 3. Inspecting the parse-tree of Figure 2 shows that the root-node was reduced by rule 0. Thus, case 0 in the switch will be the one used (line 5). At line 6 the first instantiation of an object of type Expression_Object is made, using a constructor that takes two arguments. In fact, these two arguments initiate the object creation for the nodes further down the parse-tree. Importantly, Expression_Object is a broad generalization: in the implementation, the instance is of one of the non-abstract classes inheriting from ExpressionCompound. Lines 7 and 8 show that the method CreateExpression(…) is used. That method is defined in lines 26-29 and has the simple task of making a recursive call to CreateObject(…) and then properly type-casting the returned object as an Expression. The arguments to CreateExpression(…) in lines 7 and 8 have the result that only the two outer child-nodes of the root-node are visited in the traversal (branches 0 and 2). It is not necessary to visit the middle child-node since it is the operator-terminal and thus contains no information. The type of operator is already known on the basis of the current reduction rule. The first object created is shown at node 1 in Figure 2 as a black box attached to the node. This object is stated as being an Expression object, which it certainly is, though the actual object is an instance of one of the classes inheriting from the abstract class ExpressionOperand.

Figure 2: Object Creation from the Parse Tree

In Visual C# .NET the first argument to a method is always evaluated first. Thus the left-most child-node of node 1 is the next one visited. This node is also a non-terminal and is also reduced by rule 0. Again an Expression_Object object is created, this time at node 2 of Figure 2. The third node visited is also a non-terminal, but reduced using rule 1. Thus, case 1 at line 10 of Listing 2 will be used this time. In this general example, rule 1 is known as a single-branch rule, which means that node 3 has only one single child-node. In such a case it is not necessary to explicitly instantiate a new object. Instead, yet another recursive call using CreateExpression(…) is made on branch number 0 (which is the only one). Any kind of object instantiated further down the tree will eventually be returned to the current recursive scope, which will do the same, i.e. return it to a prior recursive scope, until the last return delivers the object, which is used as an argument to a constructor, e.g. the one used at node 2. The fourth node visited is a terminal with symbol id 1 (in this general example). The last recursive call of CreateObject(…) called the CreateObjectFromTerminal(…) method because the single child-node was this terminal. CreateObjectFromTerminal(…) is defined in lines 16 to 24 (Listing 2), and case 0 (the only one) of the switch (lines 18-23) will be used. In line 21 a new instantiation of Expression_Object is made and returned to the prior recursive scope. Importantly, the object instantiated on the basis of the symbol id always inherits from ExpressionOperand in the implementation, since that is exactly what such an object is: an operand to an operator in an expression compound.

Listing 2

No more recursive calls are made, since terminal nodes cannot have child-nodes, and thus the recursive calls begin to close and ascend the tree, returning whatever object they contain to the prior recursive scope. In Figure 2 this is shown by dotted arrows pointing from node 4 to node 3, and from node 3 to node 2. Note also that these arrows are numbered according to the order in which the objects are returned. At node 2 a call to the constructor of Expression_Object was made, but this call is not yet finished. Only the first argument to the constructor has now been evaluated, and the next argument is still waiting to be evaluated. This means that recursive calls are now made on the right child-node of node 2, and the objects created there will also be returned in the same way, according to the dotted arrows of Figure 2. When all objects have been created and all the recursive calls have returned as well, the first object instantiated at node 1 is returned to the scope of Invoke(…) in the kernel. This object will contain an object-tree of the form seen in Figure 3, which shows the tree structure achieved by traversing the parse-tree in the way explained. The next section will focus on the next procedure in the kernel, phase 2.
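The object creation described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions and not a reproduction of Listing 1 or Listing 2: the classes `Terminal`, `Nonterminal`, `Number`, and `Addition` are hypothetical stand-ins, rule 0 is assumed to be a binary addition rule, rule 1 a single-branch rule, and symbol id 1 a number. A small `evaluate()` method is included only so that the resulting object-tree can be checked.

```python
class Terminal:
    def __init__(self, symbol_id, text):
        self.symbol_id, self.text = symbol_id, text


class Nonterminal:
    def __init__(self, rule_id, tokens):
        self.rule_id, self.tokens = rule_id, tokens


class Number:
    """Operand object created from a terminal (hypothetical class)."""
    def __init__(self, value):
        self.value = value

    def evaluate(self):
        return self.value


class Addition:
    """Compound object; both constructor arguments are sub-expressions."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self):
        return self.left.evaluate() + self.right.evaluate()


def create_object(token):
    # Dispatch on the kind of token, as the text describes for CreateObject(...).
    if isinstance(token, Terminal):
        return create_object_from_terminal(token)
    return create_object_from_nonterminal(token)


def create_object_from_nonterminal(token):
    if token.rule_id == 0:
        # Binary rule: only branches 0 and 2 are visited. Branch 1 is the
        # operator terminal, whose kind is already implied by the rule id.
        # Python, like C#, evaluates the arguments from left to right.
        return Addition(create_object(token.tokens[0]),
                        create_object(token.tokens[2]))
    if token.rule_id == 1:
        # Single-branch rule: no new object, the child's object is passed up.
        return create_object(token.tokens[0])
    raise ValueError(f"unknown rule id {token.rule_id}")


def create_object_from_terminal(token):
    if token.symbol_id == 1:
        return Number(int(token.text))
    raise ValueError(f"unknown symbol id {token.symbol_id}")


def num(text):
    """Helper building the sub-tree `rule 1 -> number` (illustration only)."""
    return Nonterminal(1, [Terminal(1, text)])


# Left-associative parse-tree for "1+2+4".
plus = Terminal(2, "+")
tree = Nonterminal(0, [Nonterminal(0, [num("1"), plus, num("2")]), plus, num("4")])

root = create_object(tree)
print(root.evaluate())  # → 7
```

Because the single-branch rule creates no object of its own, the object-tree is smaller than the parse-tree: here it contains two `Addition` objects and three `Number` objects.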

4. Evaluation Object

Just after the object-tree, or syntax-tree, has been created, a call to the method ComputeResult(…) is made with the root-object of the object-tree as argument. A selection of the ComputeResult(…) method is shown in Listing 3.

Figure 3: Resulting Object Tree after Traversal

Listing 3

To calculate a result using the object-tree, the objects in the tree are evaluated using the Evaluate() method. In line 4 of Listing 3 a call to exactly that method is made, and the result is returned to the variable resultObject of type object. A look into the abstract Expression class (Listing 4, line 4) shows that the Evaluate() method is defined as abstract. This means that all classes inheriting from Expression will need to either define a body for Evaluate() or override it as abstract. The call to the Evaluate() method in Listing 3 at line 4 initiates a pre-order traversal of the newly created object-tree. This traversal comes about because the initiating call to Evaluate() is made on the root-node object of the object-tree and will eventually call the Evaluate() method on the left child-node object. This progression of Evaluate() calls will traverse the tree in pre-order, i.e. exactly as when the objects were created from the parse-tree, and this is the process known as the object-evaluation process. Note that the Expression class also defines an abstract property ExpressionResult with get and set access.

Listing 4

Figure 4: Object Tree Evaluation Process Model

Figure 4 illustrates what happens during the evaluation. As mentioned, the call to Evaluate() from ComputeResult(…) will begin a series of calls to the different Evaluate() methods defined in the objects created in the object-tree. The figure also shows the order of the calls to Evaluate() (white circles). Another important detail of the evaluation process is what happens when an Evaluate() call returns: the entire object present at the specific node is returned to the scope of the prior Evaluate() call. In Figure 4 this is illustrated by dotted arrows leaving each object, each representing an object return. These are also ordered, denoted with a gray circle. The first object return takes place when the third Evaluate() call has been made at the left-most leaf in the object-tree. That Evaluate() method will thus not make yet another internal call to an Evaluate() method, but will instead return the entire object to the scope of the second Evaluate() call.

5. Set and List Implementation

This section offers a broad overview of the implementation of the two non-atomic operands: set and list. Listings from the implementation are not included, with the intent of presenting only a broad overview and at the same time limiting the length of the section. The inheritance hierarchy of the classes involved in the implementation is shown in Figure 5. The two classes NonAtomicSet and NonAtomicList correspond to a set and a list respectively, and are instantiated at certain non-terminal nodes in the parse-tree. This section will introduce the parse-tree resulting from entering the set {1,2,3} and explain the object-creation process in broad detail. Entering a list of the form [1,2,3] does not add much difference to the resulting parse-tree, and an example is thus omitted. The one subtle difference between the two parse-trees is due to different production rules and their use of different terminals. Sets are delimited by braces ({ and }) and lists are delimited by square brackets ([ and ]). A set has no information about the order of its elements, in contrast to lists (i.e. ordered sets, n-tuples), where every element keeps the position it was given when the list was originally defined. When a set is entered as an expression in the CAS application, the resulting set will not contain duplicates, and atomic elements (numbers) will be stored in ascending order. In lists, duplicates always remain. Importantly, sets and lists are able to contain sets and lists as elements, with no explicit limit on the nesting-level.
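The difference between the two non-atomic operands can be illustrated with a short Python sketch. It is not taken from the implementation: the functions `make_set` and `make_list` and the tuple representation are hypothetical, and the handling of nested elements in a set (kept in entry order after the atomics, with duplicates removed) is an assumption made for the example.

```python
def is_atomic(element):
    """Atomic elements are plain numbers in this sketch."""
    return isinstance(element, (int, float))


def make_set(elements):
    """Set: atomics ascending without duplicates, nested elements after them."""
    atomics = sorted({e for e in elements if is_atomic(e)})
    nested = []
    for e in elements:
        if not is_atomic(e) and e not in nested:  # assumed duplicate removal
            nested.append(e)
    return ("set", atomics + nested)


def make_list(elements):
    """List: the order given at definition and all duplicates are kept."""
    return ("list", list(elements))


print(make_set([3, 1, 2, 3, 1]))   # → ('set', [1, 2, 3])
print(make_list([3, 1, 2, 3, 1]))  # → ('list', [3, 1, 2, 3, 1])

# Sets and lists may contain sets and lists to any nesting depth.
mixed = make_set([2, make_list([1, 1]), 1, make_set([5, 4])])
print(mixed)  # → ('set', [1, 2, ('list', [1, 1]), ('set', [4, 5])])
```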

Figure 5: Inheritance Hierarchy of Classes for Sets and Lists

Figure 6 shows the resulting parse-tree when entering a set with the atomic elements 1, 2, and 3. When the corresponding rule is met in the tree, an object of type NonAtomicSet is instantiated. An interesting detail is the creation of an ElementCouple object for a given production rule. This is a special non-atomic expression compound. Though it involves no use of any true operator, it still presents a node with three child-nodes, where the middle one is a terminal node and the two others are non-terminal nodes. This structure resembles the structure obtained when using a binary operator, and this is why the ElementCouple class inherits directly from NonAlgebraicBinary, as shown in Figure 5. When the evaluation process of the created object-tree is initiated, the Evaluate() method of the ElementCouple objects has a special task to perform, known as element coalescing. Looking at Figure 6, the first and second elements (read from the left) are at the bottom of the tree, while the third element is situated higher in the tree.

6. Coalescing Elements

To get these elements into some kind of data-structure, a method is necessary to collect them in succession: the first two elements are coalesced into a data-structure, and later that data-structure is in turn coalesced with the third element, producing a second data-structure containing all elements of the set (or list). Figure 7 shows a general parse-tree resulting from a set or list with n elements denoted e_0 to e_(n-1). The blurry sub-tree seen in the parse-tree was found to be a convenient way to illustrate a series of left sub-trees, corresponding to the indefinite number of elements denoted e_3 to e_(n-2). It is shown that elements e_0 and e_1 (i.e. the first two elements) are indeed situated at the lowest level of the parse-tree, and the coalescing process will collect these two elements in one single data-structure. The successive steps of the coalescing procedure are illustrated in Figure 7 using a box to the left or right of each sub-tree containing elements, listing the elements coalesced so far. As implied, this process is generic to both sets and lists. The difference in implementation becomes apparent when the collected elements are used in either a set or a list. The actual situation is determined by the type of object created as the root-node object, which can be either NonAtomicSet or NonAtomicList.

7. Storing Sets and Lists

The prior section presented the use of inheritance with the intent of reusing code. The situation is the same with the classes seen in Figure 5, where the class OperandNonAtomic defines data-structures and mechanisms common to all non-atomic operands and is thus the parent-class of NonAtomicSet and NonAtomicList. Something common to the implementation of sets and lists is the way the objects are stored and accessed. Table 1 shows the general approach, where the overall data-structure is a multidimensional list with seven elements, i.e. indexes, in the first dimension. This seven-index list is implemented as an ArrayList, which is a data-structure where each element is of type object. Thus, if anything else is stored in such a list, e.g. a series of ArrayList objects, appropriate type-casting has to be applied when accessing them. Atomics, i.e. numbers, are stored at index 0 as shown in Figure 7. In sets, this index will contain a special data structure known as SortedSet. This class originates from a third-party .NET assembly implemented with the intent of sorting numbers more efficiently. SortedSet does not remove duplicates, and thus a manually implemented algorithm is used for this purpose. In lists, index 0 contains another ArrayList, since there is no need for sorting. Index 1 is used for all the sets contained in the particular set or list. No sorting is done of non-atomic elements in sets. Interestingly, index 1 contains an ArrayList, and each element in that list is actually yet another seven-index ArrayList like the one shown in Figure 7.
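The coalescing procedure of Section 6 can be sketched in Python as below. This is an illustration under assumptions rather than the authors' implementation: the classes `Atom` and `Coalesced` and the method name `evaluate` are hypothetical, while `ElementCouple` borrows its name from the text. The sketch relies on the left-nested shape of the tree described above, where the first two elements sit at the lowest couple.

```python
class Coalesced:
    """Elements collected so far (hypothetical helper class)."""
    def __init__(self, items):
        self.items = items


class Atom:
    """Atomic element, i.e. a number."""
    def __init__(self, value):
        self.value = value

    def evaluate(self):
        return self.value


class ElementCouple:
    """Node with a left and a right child separated by a comma terminal."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self):
        left = self.left.evaluate()
        right = self.right.evaluate()
        if isinstance(left, Coalesced):
            # The left sub-tree already delivered a collection: extend it.
            left.items.append(right)
            return left
        # Lowest couple in the tree: start the collection with two elements.
        return Coalesced([left, right])


# Object-tree for the elements 1, 2, 3: ((1, 2), 3).
tree = ElementCouple(ElementCouple(Atom(1), Atom(2)), Atom(3))
print(tree.evaluate().items)  # → [1, 2, 3]
```

The root object, a set or a list, would then decide how the collected elements are stored, for instance sorted and without duplicates in the case of a set.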

8. Non-algebraic Operations

Four non-algebraic binary operations are implemented in our CAS application:

• Union
• Intersection
• Set difference
• Cross product

In the following, each of these four operations receives a short note on its implementation, beginning with union. It should be noted that binary operations are only defined if the two non-atomic operands are of the same type, i.e. either both sets or both lists (combinations are not allowed). The union of two sets is done in a straightforward manner using the first three indexes of the two seven-index data-structures. The operation is left-associative, and thus the data-structures of the left operand are copied to an empty data-structure, and the same then happens with the data-structures of the right operand.

Figure 6: Object Creation Process for a Set

The situation is almost the same with the union of two lists, though the seventh index is used instead. Intersection is somewhat more intricate. The basic idea is to choose one of the two operands and then iterate through the elements of that operand, checking whether each element also exists in the other set or list. If that proves to be the case, that element is added to the result list or set. The set or list with the smallest cardinality is chosen as the iterated operand; if both have the same cardinality, the left operand is chosen. This implies that if an operand is the empty set or list, it will always be the one chosen. This could resemble some form of semi-lazy evaluation. If the intersection is made on two sets, it will first iterate through the atomic elements, then the sets, and then the lists. Non-atomic elements are compared by comparing their string representations. When doing intersection on lists, the seventh index of the data-structure is used.

Figure 7: Coalescing of Elements

Table 1: Data Structure for a Set or List

The set difference of two sets is done using the same principle of iteration as intersection. Instead of checking cardinality, the left operand is always chosen as the one to be iterated. Prior to iteration, copies of the two data-structures are made, with the intent of being able to delete from them later. The iteration checks for identical elements in the right operand. If one is found, it is deleted from the result set or list. The cross product of two sets or lists consists of either a set or a list containing lists which in turn contain two atomic elements, corresponding to a binary relation. The cross product is constructed by first choosing the left operand as the one to be iterated. The atomics of the left operand are iterated, then the sets, and finally the lists. In each of the three iterations, a binary relation is created for each element existing in the right operand, in the order of atomics first, then sets, and then lists. A sample screen shot of the GUI developed for the framework is shown in Figure 8.
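The four operations can be sketched in Python for the simplest case, sets that contain only atomic elements and are represented as plain lists. This is a simplified illustration, not the seven-index data-structure of the implementation, and all function names are hypothetical. The last function, `intersect_all`, is an additional assumption: it shows how the lazy evaluation idea from Section 2 could stop a chain of intersections early, which the text does not claim to be implemented.

```python
def union(left, right):
    """Copy the left operand, then add the right operand's new elements."""
    result = list(left)
    result += [e for e in right if e not in result]
    return sorted(result)


def intersection(left, right):
    """Iterate the smaller operand; on equal cardinality use the left one."""
    iterated, other = (left, right) if len(left) <= len(right) else (right, left)
    return [e for e in iterated if e in other]


def difference(left, right):
    """Iterate the left operand and delete every element found in the right."""
    result = list(left)  # copy, so elements can be deleted safely
    for e in left:
        if e in right:
            result.remove(e)
    return result


def cross_product(left, right):
    """Pair every element of the left operand with every element of the right."""
    return [[a, b] for a in left for b in right]


def intersect_all(operands):
    """Chain of intersections that stops as soon as the result is empty."""
    result = list(operands[0])
    evaluated = 1  # number of operands actually looked at
    for operand in operands[1:]:
        if not result:
            break  # the final result is already known to be empty
        result = intersection(result, operand)
        evaluated += 1
    return result, evaluated


print(union([1, 2], [2, 3]))             # → [1, 2, 3]
print(intersection([1, 2, 3], [2, 3]))   # → [2, 3]
print(difference([1, 2, 3], [2]))        # → [1, 3]
print(cross_product([1, 2], [3, 4]))     # → [[1, 3], [1, 4], [2, 3], [2, 4]]
print(intersect_all([[1, 2], [3, 4], [3, 6], [2, 5], [1, 6]]))  # → ([], 2)
```

In the last call only the first two of the five operands are examined, which mirrors the example {2,1} ∩ {3,4} ∩ ... given earlier.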

9. Conclusion

A typical belief is that the best object oriented design notations are based on a rigorous mathematical foundation. This mathematical foundation is important for several reasons: first, it explains the precise design content of language constructs; second, it allows design methodologies to be compared in a scientific way. In addition, a mathematical foundation provides an exact specification of a design methodology, so that all software tools can support the methodology in a standard way. Therefore, as CAS tools are used to solve complex (rigorous) problems, it is reasonable to use the same assumption in building a framework based on object oriented design. The aim of this research was to provide a framework for a structured mathematical tool development procedure using the fundamental concepts of object oriented design methodology. A large number of mathematical and computer algebra systems are available, and a variety of design and analysis methodologies for object oriented software exist in the literature. The motivation here, however, was to use the object oriented methodology for scalability and integration. The approach taken in this research was to build each module independently, and the paper has shown the various modules and their functionality in the overall system. The overall system performance and capability were tested using a test suite, and the system was found to work satisfactorily. The results are not shown here, as the intention of the research is not to demonstrate how much the tool can do but rather to develop the model framework. The scalability of our model looks reasonable, as we can expand it by adding more functions. However, it has not been tested beyond the functions described here; future work will be in that direction.

10. References

[1] Manuel Bronstein; Symbolic Integration 1 (transcendental functions), 1997 by Springer-Verlag, ISBN 3-540-60521-5.
[2] Joel Moses; Symbolic integration: the stormy decade, Proceedings of the second ACM symposium on Symbolic and algebraic manipulation, p.427-440, March 23-25, 1971, Los Angeles, California, United States.
[3] D. Ursescu, M. Tomaselli, T. Kuehl and S. Fritzsche; Symbolic algorithms for the computation of Moshinsky brackets and nuclear matrix elements, Computer Physics Communications, Volume 173, Issue 3, 15 December 2005, Pages 140-161.
[4] Harry G. Kwatny and Bor-Chin Chang; Symbolic computing of nonlinear observable and observer forms, Applied Mathematics and Computation, Volume 171, Issue 2, 15 December 2005, Pages 1058-1080.
[5] Onur Kıymaz and Şeref Mirasyedioglu; A new symbolic computational approach to singular initial value problems in the second-order ordinary differential equations, Applied Mathematics and Computation, Volume 171, Issue 2, 15 December 2005, Pages 1218-1225.
[6] Aldo Dall’Osso; Computer algebra systems as mathematical optimizing compilers, Science of Computer Programming, Volume 59, Issue 3, February 2006, Pages 250-273.
[7] Hongguang Fu, Xiuqin Zhong and Zhenbing Zeng; Automated and readable simplification of trigonometric expressions, Mathematical and Computer Modeling, Volume 44, Issues 11-12, December 2006, Pages 1169-1177.
[8] Dirk Puetzfeld; PROCRUSTES: A computer algebra package for post-Newtonian calculations in General Relativity, Computer Physics Communications, Volume 175, Issue 7, 1 October 2006, Pages 497-508.
[9] Paulo F.A. Mancera and R. Hunt; Some experiments with high order compact methods using a computer algebra software, Part II (non-uniform grid), Applied Mathematics and Computation, Volume 180, Issue 1, 1 September 2006, Pages 233-241.
[10] John Power; Countable Lawvere Theories and Computational Effects, Electronic Notes in Theoretical Computer Science, Volume 161, 31 August 2006, Pages 59-71.
[11] Bo Brinch; #C, 1st ed. 2002, ISBN: 87-7843-519-6.
[12] James W. Cooper; C# Design Patterns, February 2002, ISBN: 0201844532.
[13] Herbert Schildt; C# The Complete Reference, 2002, ISBN: 0-07-213485-2.
[14] Calitha C# Gold Parser Engine; http://www.xs4all.nl/~rvanloen/goldparser.html
