Pluggable Controllers and Nano-Patterns

Yossi Gil
Computer Science Dept., Technion I.I.T, Haifa, Israel
[email protected]

Ori Marcovitch
Computer Science Dept., Technion I.I.T, Haifa, Israel
[email protected]

Matteo Orrù
Computer Science Dept., Technion I.I.T, Haifa, Israel
[email protected]
Abstract: This paper raises the idea of giving end users the ability to modify and extend the control flow constructs (if, while, etc.) of the underlying programming language, just as they can modify and extend the standard library implementation of function printf and class String. Pluggable Controllers are a means for the modular design of control constructors, e.g., if, while, do, switch, and of operators such as short-circuit conjunction (&&) and the “?.” operator of the Swift programming language. We propose a modular language design based on pluggable controllers. In this design a small set of core control constructors is augmented by a standard library of control constructors which, just like all standard libraries, is extensible and replaceable. The standard library of control constructors can then follow a course of evolution that is less coupled with that of the main language, where a library release does not mandate a new language release. At the same time, the library could be extended by individuals, corporations and communities to implement more or less idiosyncratic Nano-Patterns. We demonstrate the imposition of pluggable control constructors on Java by employing Lola, a Turing-complete and programming-language-independent code preprocessor.
I. INTRODUCTION
Classical work on programming languages [7], [14], [23], just as textbooks [9], [22], tells us that there are three, and only three, essential Control Constructors:
Sequential
Iterative
Conditional
© 2017 IEEE 978-1-5090-5501-2/17/$31.00
Classically, control constructors are those elements of the programming language that make it possible to assemble commands. We can distinguish between atomic and compound commands. For example, the empty command, the assignment, and the procedure call are the atomic commands of PASCAL. Compound commands are made of atomic commands and of other, smaller compound commands. In PASCAL, Begin...end is the sequential constructor, While...do... is an iterative constructor, and If...then... is a conditional constructor. The years have blurred the classical schism separating expressions and commands. Control constructors now occur in many operators that make expressions, including “&&” and “||” (standard short circuit), “· ? · : ·” (standard conditional), “,” (the sequential comma operator of C), “or” (the “provide default value” PYTHON variant of the short-circuit “||”), “??” (the null coalescing operator of C#), “?.” (the fluency on null operator of SWIFT), “noexcept” (a newly introduced operator of C++, guarding against exceptions), etc. However, all such operators can be expressed in terms of the sequential, iterative, and conditional control constructors
(henceforth, “SIC”). In fact, SIC define the notion of structured programming [16], in that SIC are sufficient for imposing structure on the unstructured: any program employing gotos can be automatically converted into an equivalent one which uses only SIC [2]. Concrete languages take liberties and are often creative in deviating from the essence of SIC (both in operators and in keywords). Even though many languages are content with the standard three variants of iteration, PYTHON adds its own idiosyncratic else clause to while. Languages with a switch variant of the conditional constructor differ in how they deal with fall-through cases. Other variations and diversifications are found in other places, e.g., in the semantics of try...catch...finally blocks. This means that language engineers, developers and practitioners alike are interested in experimenting with and using new features of programming languages that can increase their productivity and efficiency and reduce errors. However, creativity typically slows down with the language’s first release. A case in point is switching on strings in JAVA, which took circa twenty years from proposal¹ to implementation². It seems to be universally accepted that even small changes to the definition of a (successful) language might have (detrimental) unpredictable effects in the least expected places [15, p. 497–508]. The prudence of language architects hides the other side of the coin: delays, too, might negatively affect software systems. A case in point is the late introduction of generics in JAVA, which forced developers to use, as a substitute, the unsafe Vector available at that time. The contributions of the present paper are the following. With regard to the reported issue of language evolution related to SIC, this essay suggests a different approach called Pluggable Controllers.
¹ http://bugs.java.com/bugdatabase/view_bug.do?bug_id=1223179
² http://docs.oracle.com/javase/7/docs/technotes/guides/language/strings-switch.html
In this plugin approach, new controllers, other than the host language’s built-in SIC, can be introduced through external libraries of varieties, and developers are able to extend the libraries with their own control constructors. Furthermore, we demonstrate the imposition of pluggable controllers on top of the JAVA language, by leveraging the functionality of Lola, the Language Of Language Amendment. Lola is a language-independent, powerful preprocessor and
SANER 2017, Klagenfurt, Austria Early Research Achievements
macro language that allows developers to augment and amend syntactical constructs of a host language. The presented case study concerns a specific usage of control constructors, which we call a Nano Pattern (from now on, nano) in analogy with Micro-Patterns [12], of which the nano concept represents a refinement and an evolution. A nano is a recurring solution, adopted by developers, to a common task that involves control constructors. We presume that nanos are abundant in source code since they represent the way developers handle common programming tasks. Being solutions devised by developers to common tasks, they are eligible for inclusion as new features during language evolution. Our proposal is that the introduction of new features, such as a new plugin control constructor based on a Nano Pattern, can be done with a pluggable controller approach. We also suggest that, by leveraging the Lola language, experiments on new language features might be carried out safely in both industrial and academic environments. In this respect the present work, by combining the concepts of nano and Pluggable Controllers and by introducing Lola, aims at paving a new, potentially promising research avenue and at outlining some of its potential outcomes. The rest of this paper is organized as follows. Sect. II reminds the reader of some basic concepts to establish a common vocabulary. Sect. III introduces Lola, the Language Of Language Amendment, used to introduce and experiment with the concept of Pluggable Controllers. Sect. IV presents a practical scenario of the application of Pluggable Controllers based on the concept of Nano-Pattern. Sect. V discusses some practical applications and outlines some promising avenues for researchers and practitioners. Sect. VI concludes this paper.
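To illustrate the introduction's claim that operators such as “&&”, “· ? · : ·”, “??” and “?.” can be expressed in terms of SIC, here is a minimal JAVA sketch. The class and method names (SicOperators, and, choose, coalesce, safeCall) are hypothetical and are not part of Lola or LolaJ; operands whose evaluation must be delayed are passed as Supplier or Function objects, which is an assumption of this sketch rather than anything prescribed by the paper.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical helpers: each operator is written with plain if/else only. */
final class SicOperators {
    // a && b: the right operand is evaluated only when the left one is true.
    static boolean and(boolean left, Supplier<Boolean> right) {
        if (left) {
            return right.get();
        }
        return false;
    }

    // c ? a : b: exactly one of the two branches is evaluated.
    static <T> T choose(boolean condition, Supplier<T> ifTrue, Supplier<T> ifFalse) {
        if (condition) {
            return ifTrue.get();
        } else {
            return ifFalse.get();
        }
    }

    // a ?? b: the fallback is evaluated only when the value is null.
    static <T> T coalesce(T value, Supplier<T> fallback) {
        if (value != null) {
            return value;
        }
        return fallback.get();
    }

    // a?.m(): the member is applied only when the receiver is not null.
    static <T, R> R safeCall(T receiver, Function<T, R> member) {
        if (receiver == null) {
            return null;
        }
        return member.apply(receiver);
    }

    public static void main(String[] args) {
        String missing = null;
        boolean nonEmpty = and(missing != null, () -> !missing.isEmpty());
        String name = coalesce(missing, () -> "anonymous");
        Integer length = safeCall(missing, String::length);
        String label = choose(nonEmpty, () -> "has text", () -> "no text");
        System.out.println(nonEmpty + " " + name + " " + length + " " + label);
    }
}
```

The sketch also hints at why such operators are control constructors rather than ordinary functions: without the explicit Supplier wrappers, JAVA would evaluate every argument before the call.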
II. BACKGROUND
A. Pluggable Controllers
Here is an intuitive definition of pluggable controllers:

Pluggable Controllers. Controllers should be just like functions and classes found in a library: standardized, yet extendable and replaceable.

The term controller here includes both classical control constructs such as while and if · · · else, and operators such as “??” that control the order of evaluation of their arguments. In the pluggable controllers approach, a language has only the essential SIC as built-ins and is equipped with a standard library of varieties. For example, in a C-like language, the built-ins might be {c1; · · · cn}, while (e) c and if (e) c1 else c2, while the standard library provides for, do and switch. The library is not sealed. Programmers should be able to add their own control constructors, e.g., adding while (e) c1 else c2 to JAVA. With pluggable controllers, extending switch to strings could have been done by library evolution rather than by a new language version. And, no disturbance to the language’s core should be made by adding a “?.” operator to it.

Pluggable controllers join the trend of parameterization of elements of programming languages. In early versions of many programming languages, and even in contemporary languages such as AWK, standard procedures such as Write were hardwired. PASCAL brought a revolution in that procedures like Writeln, functions like Sin, literals like true, and types like Integer were pre-defined. A pre-defined type, function, literal or procedure is available to programmers, but can be overridden by them if necessary. In C, anyone, from the end user down to whoever nurtures the evolution of the language, could produce their own version of the library’s printf. In JAVA, the atomic types, e.g., int, double and boolean, are built-in, whereas their GO equivalents are pre-defined. Programming languages used to be either untyped or typed, and if typed, only in a certain way. But this was before pluggable type systems [3] and gradual typing [19].

B. Nano Patterns
The definition of Nano-Patterns was first introduced by Gil and Maman, at the end of their seminal work on Micro-Patterns [12]. Later on, Singer et al. [20] (starting from a work of Høst and Østvold [13]) and Batarseh [1] provided their own definitions. The former catalog was used to study the relationship between Nano-Patterns and defectiveness [6] or vulnerabilities [21]. The definition of nano that we adopt in the present work is the following:

Nano-Patterns (intuitive definition). A nano-pattern (or nano) is a pattern of (typically less than a dozen) control constructs, which recurs frequently, serving a common or similar purpose, yet cannot be abstracted easily by a function.
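As a concrete illustration of the while (e) c1 else c2 controller mentioned in this section, the following JAVA sketch shows how such a loop is typically emulated today with an if and an auxiliary variable. It assumes PYTHON-like semantics, in which the else part runs only when the loop ends without a break; the class and method names are hypothetical.

```java
/** Hypothetical example: emulating "while (e) c1 else c2" in plain Java. */
final class WhileElseDemo {
    /** Returns the index of the first negative element, or -1 if there is none. */
    static int firstNegative(int[] values) {
        int i = 0;
        boolean exhausted = true;        // auxiliary variable standing in for "else"
        while (i < values.length) {
            if (values[i] < 0) {
                exhausted = false;       // leaving early: the "else" part must not run
                break;
            }
            i++;
        }
        if (exhausted) {                 // the "else" part of the loop
            i = -1;
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println(firstNegative(new int[] {3, -1, 4}));
        System.out.println(firstNegative(new int[] {1, 2}));
    }
}
```

How often this flag-and-if arrangement recurs in real code is exactly the kind of frequency question that decides whether the construct counts as a nano.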
Our hypothesis is that nanos are abundant and that the ordinary control constructors of a programming language were designed with a particular, prudent view of their usability and quality. The hypothesis also says that nanos evolve in time, and are constantly born and perfected in many specific domains. For example, it is pretty clear now that there are nanos of iteration which were not included in the design of traditional languages. The answer is to place such nanos, for example “apply and/or/sum and other associative operations on Iterables” and “iterate, zipper style, on two lists”, in a library rather than in the language core. By this definition, the control plugin while (e) c1 else c2 captures a nano, or does not capture one, depending on the frequency of the pattern that uses if and auxiliary variables to achieve the same effect.

III. INTRODUCING LOLA
Lola [24], the Language Of Language Amendment, is a modern, language-independent preprocessor and macro language, oriented towards language extension. Macro invocation is triggered by a pattern matching engine (relying on regular expressions over tokens), which makes it possible to augment the host language syntax, as experienced by the end programmer, without intervening in the language semantics as seen by traditional language processing tools (compilers, IDEs, debuggers, etc.). Macros are expanded to equivalent constructs written in the original host language syntax, or even in the augmented syntax, to be further expanded by other Lola macros to the original syntax of a programming language such as JAVA or C. Lola can be used for a wide range of purposes, such as computing code metrics, enforcing code standards, and adding C preprocessor functionality to any programming language. But its main purpose is to allow developers to introduce new keywords, new operators, and generally new syntax to the host language. The input to Lola is code written in the host language mixed with Lola directives, as happens in other macro languages. But Lola works as a filter which applies patterns to the stream of tokens, converting it into an output stream according to the directives found in the stream. Rules in Lola specify code patterns which we wish to add to the language and their translation into the host language. There are two kinds of directives: generators, which determine the sequence of host tokens to be returned as output, and lexies, which describe the amendments. Lexies are Lola’s basic elements of computation and contain the instructions that determine the outcome of the Lola execution. The Lola workflow is as follows: Lola tries to match the pattern given in the ##Find directive. When a match occurs, it triggers, for example, the ##replace directive shown in Fig. 2, which replaces the found code with new code. Patterns are extended regular expressions (RE). Whenever a snippet of code matches an RE, Lola creates a PYTHON object reifying it, which can be manipulated later on. The computations performed by a lexi include code replacement, but a lexi can also invoke PYTHON code.
An example of a lexi is shown in Fig. 2. A lexi is structured in different sections and encompasses different kinds of tokens: (i) Lola keywords (built-in, host-specific and user-defined), (ii) host language tokens (structured in a taxonomy that differentiates, for example, between modifiers, keywords, operators, etc.) and (iii) PYTHON code snippets. Tokens of the host language are specified in an XML configuration file, along with trivia³ (basically, anything which separates tokens, including, for example, comments, spaces, new line characters, etc.). As far as possible, Lola syntax strives to follow the structure of an English sentence, adopting the CamelCase convention for keywords, to make the code easier to read.
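The distinction between tokens and trivia can be sketched in JAVA as follows. This is only an illustration under simplifying assumptions: Lola reads its token taxonomy from an XML configuration file, whereas here a single hard-coded regular expression (a hypothetical stand-in, as are the names TriviaSplitter and tokens) recognizes whitespace and comments as trivia and a handful of token shapes, including names prefixed with # or ##.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: split source text into tokens, discarding trivia. */
final class TriviaSplitter {
    // Trivia: whitespace, line comments, block comments.
    // Tokens: #name or ##name, words, string literals, two-character
    // comparison operators, and any other single character.
    private static final Pattern LEXEME = Pattern.compile(
            "(?<trivia>\\s+|//[^\\n]*|/\\*.*?\\*/)"
          + "|(?<token>#{1,2}\\w+|\\w+|\"(?:\\\\.|[^\"\\\\])*\"|[=!<>]=|.)",
            Pattern.DOTALL);

    static List<String> tokens(String source) {
        List<String> result = new ArrayList<>();
        Matcher m = LEXEME.matcher(source);
        while (m.find()) {
            String token = m.group("token");
            if (token != null) {         // trivia matches leave this group unset
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("#return n/d // quotient\n  #however d == 0;"));
    }
}
```

A pattern engine that works on such a token list, rather than on raw characters, is unaffected by differences in spacing and comments between two occurrences of the same construct.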
³ See, e.g., the implementation of the C# [?] compiler, Roslyn: https://github.com/dotnet/roslyn/wiki/Roslyn%20Overview#syntax-trivia

IV. CASE STUDY: THE #HOWEVER NANO
We demonstrate our vision of pluggable controllers with excerpts of JAVA code intermixed with pluggable controllers. Fig. 1 is an example of JAVA code using the pluggable controllers. This example refers to a specific nano that we call #however. This nano occurs when exceptions need to be handled, as in the case of a division by zero shown here. Fig. 1 does not represent (yet) real code, but rather a sketch for an implementation based on Lola.

    ##Import lolaj.nanos.exceptions;
    ···
    #return n/d
      #however d == 0
        #throws RunTimeError
          #of "dividing " + n + " by zero";

Fig. 1. JAVA code governed by a LolaJ plugin controller capturing a “however” nano.

To avoid collisions with JAVA identifiers and keywords, all pluggable controllers prefix their “keywords” with a #, whereas Lola’s own keywords are prefixed with a double hash. This way we can distinguish three different kinds of keywords: (i) #return, #of and #import are keywords of a certain plugin; (ii) return and import are JAVA keywords; (iii) ##Import is a Lola keyword. In general, keywords are placed in a hierarchical namespace, as in C++ and JAVA. A keyword can be referenced either by its full name, e.g., #lolaj.nanos.exceptions.throws RunTimeError, or by its short name, e.g., #throws, provided that the appropriate ##Import was called to bring this name into the current namespace. New operators are introduced similarly, except that for now they are planned to be in the global namespace only, and to disregard precedence, associativity, etc. In Fig. 1, the code begins with an ##Import of lolaj.nanos.exceptions, a library of nanos of LolaJ. This particular library defines a set of nanos for dealing with cases in which normal execution cannot proceed due to errors. Rereading Fig. 1 reveals that the library lolaj.nanos.exceptions has a presumption: that there are (or that there should be) many cases in the code in which a function returns a value, but can only do so subject to preconditions. If any of the preconditions fails, then an exception with an explicit error message is thrown. This LolaJ library defines the regular expression matching Fig. 1 after introducing the tokens #return, #however, #throws, and #of, and defines the translation to JAVA by the lexi of Fig. 2 (giving the output of Fig. 3 below).

    ##Find(however)
      #return ##Expression(r)
      #however ##Expression(c)
      #throws ##Identifier(x)
      #of ##Expression(s);
    ##replace
      if (##(c)) throw new ##(x)(##(s));
      return ##(r);
    ##example
      #return n/d #however d == 0 #throws RunTimeError #of "dividing " + n + " by zero";
    ##resultsIn
      if (d == 0) throw new RunTimeError("dividing " + n + " by zero");
      return n/d;

Fig. 2. A LolaJ lexi, part of the exceptions library of lolaj.nanos, defining the translation of the “however” nano-pattern of Fig. 1 into JAVA (Fig. 3).
For simplicity, Fig. 2 is only presented for the case in which #however occurs only once. In general, lexies, the modular units of Lola, can manage any regular expression made of tokens and code. Lola can thus deal with the optional else clause of if, the list of entries of switch, including the optional default, etc. When reading the code in Fig. 2, bear in mind that Lola employs indentation for bracketing. The figure defines a lexi named however, which matches a pattern in the code beginning with #return, followed by ##Expression, followed by #however, followed by another ##Expression, etc. In this search, #return, #however, #throws and #of are new tokens implicitly defined by the lexi. In contrast, ##Expression was defined somewhere else to match something that looks like a syntactically legal JAVA expression, including the requirement that the sequence of JAVA tokens is balanced. In writing ##Expression(c), Lola gives the name c to the corresponding match of ##Expression. This name, along with the other names similarly defined, r, x, and s, is used in the ##replace clause of the lexi. The ##example...##resultsIn segment of the lexi is for documentation and for declarative, self-checking programming. The lexi checks that the main ##Find...##replace works as prescribed for all example pairs. Indeed, the JAVA output of applying this lexi on the code in Fig. 1 gives Fig. 3.
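The find-and-replace behavior of the “however” lexi can be approximated in plain JAVA as a rough sketch. This is not how Lola is implemented: Lola matches regular expressions over tokens and requires balanced expressions, whereas the hypothetical HoweverRewriter below uses java.util.regex over raw characters and would be confused, for instance, by a semicolon inside a string literal. The group names r, c, x and s mirror the names used in Fig. 2.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, character-level approximation of the "however" lexi of Fig. 2. */
final class HoweverRewriter {
    // #return r #however c #throws x #of s;
    private static final Pattern HOWEVER = Pattern.compile(
            "#return\\s+(?<r>.+?)\\s+#however\\s+(?<c>.+?)"
          + "\\s+#throws\\s+(?<x>\\w+)\\s+#of\\s+(?<s>.+?);",
            Pattern.DOTALL);

    static String rewrite(String source) {
        Matcher m = HOWEVER.matcher(source);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Same shape as the ##replace clause: guard first, then the return.
            String java = "if (" + m.group("c") + ") throw new " + m.group("x")
                    + "(" + m.group("s") + "); return " + m.group("r") + ";";
            m.appendReplacement(out, Matcher.quoteReplacement(java));
        }
        m.appendTail(out);               // copy everything after the last match
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "#return n/d #however d == 0 #throws RunTimeError "
                + "#of \"dividing \" + n + \" by zero\";";
        System.out.println(rewrite(input));
    }
}
```

Text that does not match the pattern passes through unchanged, which corresponds to Lola acting as a filter on its input stream.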
    if (d == 0) throw new RunTimeError("dividing " + n + " by zero");
    return n/d;

Fig. 3. The JAVA output of applying the “however” lexi (Fig. 2) on the code with the #however (Fig. 1).

V. DISCUSSION
If JAVA had LolaJ, then augmenting the switch statement could have been done by library evolution rather than by a new language version. Likewise, library evolution could have introduced C#’s ?. operator⁴ into the community, without any disturbance to the language’s core. Also, with LolaJ, it would be possible to extend the library, without touching the language core, to support a multi-way, non-fall-through branching conditional operator. Such an operator might be a nice-to-have for the typical end user, but essential in some specific applications. A more specialized version of this conditional may prove indispensable in the implementation of visitors. Fig. 4 is another demonstration of a pluggable controller, this time for dealing with the famous double dispatching problem of the visitor pattern [4], [17], [18]. The presumption behind the figure is that there are nanos for dealing with the visitor pattern, and that they can be captured with the visitor library of constructors in lolaj.nanos.

    ##Import lolaj.nanos.visitor
    ···
    #Is stranger
      #equal this? Print("UR me");
      #a null? Print("UR missing");
      #kindOf Statement? Print("URN abstract statement");
      #an InfixExpression? Print("URN infix");
      #a NullLiteral? Print("URA null");
      #unknown? Print("I do not know UR");
    #done;

Fig. 4. Capturing a nano of dealing with visitors.

Some language extensions, such as the ?. operator, are at the lowest level of the use/reuse food chain of software modules: everyone is expected to eventually use any newly introduced version of the standard library. At the top level of this food chain stands a programmer who may be busy writing tests for an expert system for simplifying C code. Then, the user may wish to define her own control constructors to support declarative tests such as

    #Tweaking "int i=3;i+=2;" #gives "int i=5;";

Where #inlining...#to... is a user-defined control constructor, which, in this simple case, is syntactic sugar for the following instruction:

    inliningInto("int i=3;i+=2;", "int i=5;");

Or, with the help of an appropriate fluent API library,

    inlining("int i=3;i+=2;").to("int i=5;");

Another end programmer, trained in the functional programming school, may use LolaJ to augment the language to support list expressions such as:

    #sum #apply (_) -> 1./(_*_) #to primes();

And, in intermediate levels of the food chain, we find common libraries such as JAVA 8 streams⁵, logging and SQL. With LolaJ, the makers of Mockito could then offer the syntax

    Iterator i=mock(Iterator.class);
    #mock Iterator
      #upon next() #return "Hello," #then "World!"
    #affirm next() + " " + next() #is "Hello, World\n";

which arguably describes itself better than the equivalent standard tutorial example⁶

    Iterator i=mock(Iterator.class);
    when(i.next()).thenReturn("Hello").thenReturn("World");
    String result=i.next()+" "+i.next();
    assertEquals("Hello World", result);

On the one hand, LolaJ is a special case of syntactic sugaring, which is done well by tools such as SugarJ [8], Racket [10] or OCAML through Camlp4 [5]. On the other hand, the limitation that LolaJ assumes makes it possible to characterize and concentrate on a rather coherent ensemble of applications of the idea of language extension, namely the definition of a DSL-like fluent API. One can imagine many other applications, including purposes such as testing, logging, design-by-contract, etc. In fact, we argue that nanos are in place whenever there is a fluent API library. One research challenge raised is that of finding a given nano in actual code, and of deducing the nanos that exist in it.

⁴ https://msdn.microsoft.com/en-us/library/dn986595.aspx
⁵ https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html
⁶ https://gojko.net/2009/10/23/mockito-in-six-easy-examples/
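For comparison with the list expression #sum #apply (_) -> 1./(_*_) #to primes() discussed in this section, here is how the same computation might be written today with JAVA 8 streams. This is only a sketch: the text does not define primes(), so the bounded, trial-division primes(int limit) below is a hypothetical stand-in, as are the class and method names.

```java
import java.util.stream.IntStream;

/** Hypothetical stream-based counterpart of "#sum #apply ... #to primes()". */
final class PrimeSquaresSum {
    // Stand-in for primes(): all primes below the given limit, by trial division.
    static IntStream primes(int limit) {
        return IntStream.range(2, limit)
                .filter(n -> IntStream.rangeClosed(2, (int) Math.sqrt(n))
                        .noneMatch(d -> n % d == 0));
    }

    // "#apply (_) -> 1./(_*_)" becomes mapToDouble, "#sum" becomes sum().
    static double sumOfInverseSquares(int limit) {
        return primes(limit)
                .mapToDouble(p -> 1. / ((double) p * p))
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfInverseSquares(1000));
    }
}
```

The stream version is already a fluent API, which is consistent with the observation above that nanos tend to appear wherever a fluent API library exists.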
VI. CONCLUSION
We introduced the concept of Pluggable Controllers as a way to facilitate the introduction of new constructors by developers, and explained how it could be implemented using Lola, the Language Of Language Amendment, a powerful preprocessor and macro language. Lola lets developers augment or even amend language constructors without affecting the language architecture. This argument was illustrated by the example of the #however nano-pattern. Our belief is that nano-patterns are recurring, because they present solutions devised by developers to common tasks. We suggest that their abundance in source code should be used as leverage for making software better, and in particular for making libraries of pluggable control constructors. The approach should make it possible for both language engineers and practitioners to safely experiment with new language features in a controlled environment. We hold that pluggable control constructors would make the discussion more concrete and would lead to better, well-thought-out requests for the introduction of new features to the host language. Empirical study is required to make the pluggable controllers idea more feasible and acceptable. Our current empirical study of nano-patterns in JAVA indicates that nano-patterns occur in two thirds of methods, about half of the statements, a third of conditional statements and 90% of all iterative statements, in the Gil-Lalouche corpus [11].

ACKNOWLEDGMENT
Inspiring discussions with Tomer Levy are gratefully and intentionally mentioned. The authors are glad to thank the anonymous reviewers for their valuable suggestions. This research was supported by THE ISRAEL SCIENCE FOUNDATION (grant No. 1803/13).

REFERENCES
[1] F. Batarseh. Java nano patterns: A set of reusable objects. In Proceedings of the 48th Annual Southeast Regional Conference, ACM SE ’10, pages 60:1–60:4, New York, NY, USA, 2010. ACM.
[2] C. Böhm and G. Jacopini. Flow diagrams, Turing machines and languages with only two formation rules. Commun. ACM, 9(5):366–371, May 1966.
[3] G. Bracha. Pluggable type systems. In OOPSLA’04 Workshop on Revival of Dynamic Languages, 2004.
[4] F. Büttner, O. Radfelder, A. Lindow, and M. Gogolla. Digging into the visitor pattern. In Proc. IEEE 16th Int. Conf. Software Engineering and Knowledge Engineering (SEKE2004). IEEE, Los Alamitos, pages 135–141, 2004.
[5] D. de Rauglaudre. Camlp4 - reference manual. http://caml.inria.fr/pub/docs/manual-camlp4/index.html, 2003. Accessed: 2016-08-26.
[6] A. Deo and B. Williams. Preliminary study on assessing software defects using nano-pattern detection. In 24th International Conference on Software Engineering and Data Engineering, SEDE 2015, 2015.
[7] J. R. Donaldson. Structured programming. In E. N. Yourdon, editor, Classics in Software Engineering, pages 179–185. Yourdon Press, Upper Saddle River, NJ, USA, 1979.
[8] S. Erdweg, T. Rendel, C. Kästner, and K. Ostermann. SugarJ: Library-based syntactic language extensibility. SIGPLAN Not., 46(10):391–406, Oct. 2011.
[9] R. A. Finkel. Advanced Programming Language Design. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1995.
[10] M. Flatt. Creating languages in Racket. Commun. ACM, 55(1):48–56, Jan. 2012.
[11] J. Y. Gil and G. Lalouche. When do soft. complexity metrics mean nothing? - when examined out of context. Journal of Object Technology, 15(1):2:1–25, 2016.
[12] J. Y. Gil and I. Maman. Micro patterns in Java code. In Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications, OOPSLA ’05, pages 97–116, New York, NY, USA, 2005. ACM.
[13] E. W. Høst and B. M. Østvold. The Java programmer’s phrase book. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2009.
[14] D. E. Knuth. Structured programming with go to statements. ACM Comput. Surv., 6(4):261–301, Dec. 1974.
[15] B. Meyer. EIFFEL the Language. Object-Oriented Series. Prentice-Hall, Hemel Hempstead, Hertfordshire, UK, 1992.
[16] H. Mills. Mathematical foundations for structured programming. Technical report, IBM rep. FSC 72-6012, IBM Fed. Syst. Div., Gaithersburg, Md., 1972.
[17] J. Palsberg and C. B. Jay. The essence of the visitor pattern. In Proc. 22nd International Computer Software and Applications Conference, COMPSAC ’98, pages 9–15, Washington, DC, USA, 1998. IEEE Computer Society.
[18] T. Pati and J. H. Hill. A survey report of enhancements to the visitor software design pattern. Soft. Prac. & Exp., 44(6):699–733, June 2014.
[19] J. G. Siek and W. Taha. Gradual typing for functional languages. In Scheme and Functional Programming Workshop, pages 81–92, 2006.
[20] J. Singer, G. Brown, M. Luján, A. Pocock, and P. Yiapanis. Fundamental nano-patterns to characterize and classify Java methods. In Electronic Notes in Theoretical Computer Science, 2010.
[21] K. Z. Sultana, A. Deo, and B. J. Williams. A preliminary study examining relationships between nano-patterns and software security vulnerabilities. In 2016 IEEE 40th Annual Computer Software and Applications Conference (COMPSAC), volume 1, pages 257–262, June 2016.
[22] D. A. Watt. Programming Language Design Concepts. John Wiley & Sons, 2004.
[23] N. Wirth. On the composition of well-structured programs. ACM Comput. Surv., 6(4):247–259, Dec. 1974.
[24] I. E. Zmiry. LOLA: a Programming Language for Augmenting Programming Languages. Master’s thesis, Technion – Israel Institute of Technology, 2016. url: https://drive.google.com/file/d/0B3645jTHku6WZzVkMl9uVGlTQ2M/view.