Technical report, IDE0707, January 2007
PARSING A PORTABLE STREAM PROGRAMMING LANGUAGE
Master’s thesis in Computer Systems Engineering Gordon Ononiwu, Twaha Mlwilo
School of Information Science, Computer and Electrical Engineering, IDE Halmstad University
School of Information Science, Computer and Electrical Engineering, IDE Halmstad University Box 823, S-301 18 Halmstad, Sweden
Preface
We would like to thank Dr. Veronica Gaspes at Halmstad University, who supervised our thesis, and Jerker Bengtsson (PhD student). Furthermore, we would like to thank everybody who has supported us in one way or another during our years of study. Special thanks go to our parents, who supported us morally and financially, even while we were working on this thesis.
Abstract
Portable stream programming language (PSPL) is a language for baseband application programming on reconfigurable architectures. The first step in its development has been completed: a parser has been provided for the front end of the PSPL compiler. The syntax of the language has been fixed to allow for easy parsing. The scanner and the parser were generated using automatic tools (scanner and parser generators), which rely on complex mathematical algorithms. Abstract syntax (data structures that preserve the source program so that program structure is evident) was implemented for the parser using the "syntax separate from interpretation" style of programming. Tests were carried out to ensure that the correct data structures were generated. The final outcome is a parser that other phases of the compiler can depend on to pass on the source program in an unambiguous manner. The development of subsequent phases of the compiler is the next logical step in the process of turning PSPL into a stand-alone language.
Abbreviations
API - Application programming interface
AS - Abstract syntax
AST - Abstract syntax tree
ASIC - Application specific integrated circuits
BitvecExpression - Bitvector expression
BitvecExpressionList - Bitvector expression list
BitvecStatement - Bitvector statement
CFG - Context free grammar
Dest - Destination (Switch destination)
DFA - Deterministic finite automata
Dst - Destination (Map destination)
DstList - Destination list
DSP - Digital signal processor
Exp - Expression
ExpList - Expression list
FPGA - Field programmable gate array
FilterDec - Filter declaration
Float Literal - Float literal
FM - Frequency modulator
FormalList - Formal list
GSM - Global system mobile
HDTV - High definition television
Index expression - Index expression
InitDec - Initial declaration
Integer Literal - Integer literal
NFA - Nondeterministic finite automata
MapStm - Map statement
MapStmList - Map statement list
ObjectDec - Object declaration
ObjectType - Object type
ObjectTypeDecList - Object type declaration list
ObjectTypeDec - Object type declaration
PipelineDec - Pipeline declaration
PSPL - Portable stream programming language
ProcedureDec - Procedure declaration
ProcedureDecList - Procedure declaration list
SCAs - Synchronous concurrent algorithms
SDU - Synchronous deterministic unidirectional
Sorc - Source (Switch source)
SPS - Stream processing system
Src - Source (Map source)
SrcList - Source list
Stm - Statement
StmList - Statement list
StreamDec - Stream declaration
StreamDecList - Stream declaration list
Streamprogram - Stream program
StreamType - Stream type
SwitchDec - Switch declaration
SwitchDecList - Switch declaration list
SwitchStm - Switch statement
SwitchStmList - Switch statement list
TRIPS - Tera-op, reliable, intelligently adaptive processing system
RAW - Reconfigurable architecture workstation
RBS - Radio base stations
VarDec - Variable declaration
VarDecList - Variable declaration list
3G - Third generation
List of Figures
1.1 Binary stream of 1’s and 0’s in the direction of the arrow
1.2 Compiler phases
1.3 The Front End
2.1 A five source, four sink SPS with three filters
2.2 API-based abstraction for parallel and reconfigurable array structures [1]
2.3 A pipeline [2]
2.4 A splitjoin [2]
2.5 A feedbackLoop [2]
2.6 The Framework Structure for StreamBits [1, 3]
4.1 The contents hierarchy of the syntax tree and visitor packages
4.2 The interaction between the jacc, jflex, syntax tree and visitor packages
4.3 Programming layout for PSPL parsing
5.1 Pipeline Mapping
5.2 Parse tree
5.3 The abstract syntax tree
Contents
Preface
Abstract
Abbreviations
1 Introduction
1.1 Problem Statement
1.2 Scope of Work
1.3 Organization
2 Background
2.1 Overview
2.2 The Stream Application Domain
2.3 Related Work
2.3.1 Brook
2.3.2 StreamIt
2.3.3 StreamBits
2.3.4 The PSPL Prototype Specification
2.3.5 Cryptol
3 Approach
3.1 Grammars
3.2 Syntactic analysis and tools
3.2.1 Scanner
3.2.2 Parser
3.2.3 Parser generator
3.2.4 Semantic actions
4 Results
4.1 The BNF
4.2 Specification files
4.2.1 The JFlex specification file (Scanner generator)
4.2.2 Lexical issues
4.2.3 Jacc specification files (Parser generator)
4.2.4 Syntactic issues
4.2.5 The interface
4.3 Syntax tree package
4.4 Visitor package
4.5 The program components
4.5.1 StreamPrintVisitor (PrettyPrintVisitor)
4.5.2 Stream typed visitor (STypedVisitor)
4.6 Parsing
5 Discussion
5.1 Testing the parser
6 Conclusion
References
A Abstract Syntax tree production rule
B The JFLEX Specification File (Scanner Generator)
C The JACC Specification File (Parser Generator)
D Generic StreamTyped Visitor interface
E Implementation of Stream Typed Visitor
F Implementation of visitor
G Portable stream processing language: Backus-Naur form (BNF)
H The visitor interface
I Abstract syntax constructors
1 Introduction
As the number of applications designed around some notion of streams continues to grow, there is a need for better software support in the form of portable languages modeled on the concept of streams of data. The notion of streams lets programmers structure programs in ways that give the compiler enough information about parallelism, program flow and data flow to produce efficient translations to parallel machines. By a stream we mean a continuous, one-directional flow of data in any format, as shown in Fig 1.1. Streaming applications are applications that process one or more incoming and outgoing streams at predefined flow rates. Examples include the processing that goes on in third generation (3G) radio base stations (RBS) [1], digital signal processing (DSP) in radar systems, and software FM radio [4].
Figure 1.1: Binary stream of 1’s and 0’s in the direction of the arrow
Streaming applications are generally compute intensive and demand real-time processing of the data they receive, which makes it imperative that processing is done as efficiently as possible. They map well to parallel architectures in which most memory operations are localised in the processing units and the notion of global variables does not exist. This is very difficult to deal with in languages that schedule tasks sequentially. Most high-level languages in use today, such as C and C++, fall into this category. They are optimized for general-purpose application programming on machines with centralized memory architectures, and therefore need a great deal of difficult programming to express the parallelism that is inherent in today’s parallel architectures. Using them on parallel and reconfigurable architectures with distributed memory systems limits the gains in efficiency, speed and power consumption that are expected from these architectures.
These problems have made it necessary to rely on hardware implementations, such as application specific integrated circuits (ASIC), field programmable gate arrays (FPGA) and special-purpose digital signal processing (DSP) hardware, in areas that require industrial high-performance processing. However, these come with a major drawback, namely lack of flexibility [1]. For instance, in 3G radio base stations, given the long life cycles that are often required, it is not desirable to encapsulate critical functions in hardware, since minor changes could require a complete removal of the hardware modules involved. Software modules make it possible to integrate added functionality or improve on algorithms with minimal hardware changes [1]. Therefore, to reduce cost, shorten design time and better meet customers’ various needs while reducing the down time required for upgrades, there is a need to use commercially available reconfigurable processors and to develop compilers that can exploit the parallelism these processors provide while avoiding performance penalties.
At the Center for Research on Embedded Systems (CERES) at Halmstad University, researchers in cooperation with Ericsson AB (EAB) are working on a processing model that maps baseband applications to parallel and reconfigurable architectures [1, 3] using the stream processing model. This programming model exposes the programmer to the parallelism inherent in the language, while allowing the compiler to make the intelligent choices required for efficient computation. A prototype of a language that will run on these architectures, known as Portable Stream Processing Language (PSPL), has been designed. When completed, the PSPL compiler will be able to compile suitable stream processing applications that target reconfigurable architectures using a domain-specific approach.
1.1 Problem Statement
Reconfigurable processors require portable compilers that can exploit their parallelism efficiently and, at the same time, provide the programmer with a level of abstraction from the particular target architecture. Recently, a great deal of work has been done on producing such compilers. Examples are the StreamIt [5] and StreamC/KernelC [6] compilers. These are already huge steps in the direction of greater stream expressiveness, but they fall short of providing efficient management of component reconfiguration parameters, which makes for a less dynamic arrangement [3]. StreamIt, for example, can only be used for applications that have static flow rates [2]. This is a limitation in baseband application processing, where there is a need for dynamic reconfiguration of parameters. For some applications, such as those found in radar systems and baseband processing in RBS, there is also a need for type primitives and operators that can express bit-level manipulations conveniently [1].
The language proposed for this thesis (PSPL), for which a parser will be designed, is to a large extent based on StreamIt but is not identical to it. It provides the advantages of the earlier stream languages, such as implementing functions in filters, using pipelines to describe flow graphs, and allowing multiple concurrent pipelines as the need arises, but it also incorporates new ideas that have been judged necessary but are not supported in StreamIt [3]. The aim is to build programs from micro stream components that are linked in a daisy-chain format in the form of pipelines.
Our work should contribute towards the implementation of PSPL as a stand-alone language. At the moment it only exists as a prototype, implemented as an object-oriented framework in Java. To achieve this, the following tasks have to be carried out:
• Designing the syntax of PSPL. There are challenges here because the types of PSPL are parameterized and programs are organized as stream transformations processing streams.
• Designing the abstract syntax (AS), a data structure used by the rest of the compiler and other tools to process programs written in PSPL.
• Writing a parser that reads text files containing PSPL programs and produces an abstract syntax representation of each program.
All this will be part of the work that will be required in producing a compiler for the language.
1.2 Scope of Work
A compiler is a large and complex piece of software that uses advanced algorithms to translate a high-level language into machine code that can be understood by a processor. The compiler has to distinguish between correct and incorrect programs in the language, generate an intermediate representation (IR) of the program, generate correct machine code and organize memory for the variables used. Modern compilers are organized in several phases, each operating on the product of the previous one. This phase-like structure makes the complex program modular. The phases can be grouped into two major sections: the front end and the back end of the compiler [7]. See Fig 1.2. The front end analyses the source code and rejects the meaningless portions while obtaining a structured representation of the code. The back end transforms the structured program into executable code.
Figure 1.2: Compiler phases
The front end can be further broken down into a scanner, a parser, a context analyzer (including a type checker) and a translator. Each phase produces an output that is fed to the next phase. The scanner, also known as the lexical analyzer, accepts the source code, which is a sequence of characters, and transforms it into a sequence of tokens that is fed to the parser. The parser, also known as the syntax analyzer, transforms this sequence of tokens into an abstract syntax (AS) representation of the source code. The context analyzer type checks the AS, while the translator transforms the AS into an intermediate representation (IR) of the source code. See Fig 1.3. Our work is limited to designing and implementing the first two phases of the compiler, namely the scanner and the parser, as well as fixing the syntax (rules) of the language.
Figure 1.3: The Front End
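The division of labour between scanner and parser can be illustrated with a minimal Python sketch. It is not the JFlex/Jacc implementation developed in this thesis: the token names, the regular expressions and the one-rule "grammar" (an empty filter declaration) are assumptions chosen only to show characters becoming tokens and tokens becoming an abstract syntax node.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    kind: str   # token class, e.g. "FILTER" or "ID" (hypothetical names)
    text: str   # the characters that were matched


# Hypothetical token classes for a tiny PSPL-like fragment.
TOKEN_SPEC = [
    ("FILTER", r"filter\b"),
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("LBRACE", r"\{"),
    ("RBRACE", r"\}"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{k}>{p})" for k, p in TOKEN_SPEC))


def scan(source: str) -> list[Token]:
    """Scanner: turn a sequence of characters into a sequence of tokens."""
    tokens, pos = [], 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"illegal character {source[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":          # whitespace produces no token
            tokens.append(Token(m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def parse(tokens: list[Token]) -> tuple:
    """Parser: check the token order and build an abstract syntax node."""
    expected = ["FILTER", "ID", "LPAREN", "RPAREN", "LBRACE", "RBRACE"]
    kinds = [t.kind for t in tokens]
    if kinds != expected:
        raise SyntaxError(f"expected {expected}, got {kinds}")
    return ("FilterDec", tokens[1].text)   # hypothetical AS node


print(parse(scan("filter CRC() { }")))     # ('FilterDec', 'CRC')
```

A generated scanner and parser do the same two jobs, but from declarative specifications (regular expressions and a context free grammar) rather than hand-written code.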
1.3 Organization
This thesis is organized as follows. Chapter 2 provides background on streams and stream processing systems and on earlier work on stream languages. Chapter 3 discusses the approach used to reach the goals set out in the problem statement. Chapter 4 describes the work done. Chapter 5 evaluates and discusses the results, and Chapter 6 discusses recommendations that can be applied to future work.
2 Background
2.1 Overview
The history of streams and stream processing is centered on the notion of a stream processing system (SPS). An SPS is any collection of modules or blocks that compute in parallel and communicate data via channels. The modules of an SPS are usually divided into three classes: sources, which pass data into the system; filters, which perform atomic computations; and sinks, which pass data out of the system. Examples of SPSs are dataflow systems, reactive systems, specialized functional and logic programming with streams, synchronous concurrent algorithms, signal processing systems, and certain classes of real-time systems. An example of a five-source, four-sink SPS with three filters is shown in Fig 2.1.
Figure 2.1: A five source, four sink SPS with three filters
A stream transformer (ST) is defined as an abstract system that takes a set of x streams as input and produces a set of y streams as output [8]. Mathematically, an ST is described as a function
$$\Phi : [T \to A]^x \to [T \to A]^y, \quad x, y \ge 1.$$
An SPS can be considered a parallel implementation of an ST specification, and stream processing can be defined as the study of both STs and SPSs [8].
Research into this branch of computer science began in the early 1960s with the study of data flow analysis used to evaluate potential concurrency in computations. In 1974, the first data flow language, Lucid, was conceived [8]. Examples of other data flow languages that came thereafter are LUSTRE [9], SISAL [?], LAPSE [10] and MAD [11]. In 1985, the first paper on synchronous concurrent algorithms (SCAs) and reactive systems was published. Reactive systems, together with signal processing networks and synchronous dataflow networks, have been the stimuli for a large body of research into stream processing. The 1980s also saw the use of stream processing in hardware design: a language, Daisy, was used in applicative stream processing for the design and synthesis of hardware. In the 1990s, SCAs and reactive systems continued to be an intensive area of research. The 1990s also saw research concentrate on the compositional properties of STs and on further developing the theoretical foundations of stream processing [8].
SPSs can be classified using three broad characteristics:
• Synchronous or asynchronous: filters compute either synchronously or asynchronously with respect to other filters.
• Deterministic or non-deterministic.
• Uni-directional or bi-directional channels.
In this thesis we are only concerned with synchronous, deterministic SPSs with unidirectional channels (SDU-SPS).
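The stream transformer definition given at the start of this section can be made concrete with a short Python sketch. It assumes that T is discrete time (modelled as non-negative integers) and A is the set of data values (modelled as integers); the text does not fix these sets, and the names `Stream`, `pointwise_sum`, `ones` and `ramp` are hypothetical.

```python
from typing import Callable, Tuple

# A stream is a function from time T (assumed: int) to data A (assumed: int).
Stream = Callable[[int], int]


def pointwise_sum(inputs: Tuple[Stream, Stream]) -> Tuple[Stream]:
    """A stream transformer with x = 2 input streams and y = 1 output stream."""
    a, b = inputs
    return (lambda t: a(t) + b(t),)


ones: Stream = lambda t: 1     # the constant stream 1, 1, 1, ...
ramp: Stream = lambda t: t     # the stream 0, 1, 2, ...

(out,) = pointwise_sum((ones, ramp))
print([out(t) for t in range(5)])   # [1, 2, 3, 4, 5]
```

An SPS implementing this transformer would compute the same output stream, but with a source for each input, a filter performing the addition, and a sink for the result, all running in parallel.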
2.2 The Stream Application Domain
A lot of applications make use of a stream abstraction, ranging from embedded applications for hand-held computers, cell phones and digital signal processors (DSPs) to high-performance applications such as intelligent software routers, GSM base stations and HDTV editing consoles [2], as well as consumer desktops. As array-structured parallel and reconfigurable processors become a widely used implementation strategy for computationally demanding high-performance applications, the variety of applications using the stream paradigm will increase. This calls for an efficient approach to handling component and architecture abstraction through the use of an application programming interface (API). See Fig 2.2. There is a need to develop language structures that can decompose stream applications into abstract components in a portable and efficient manner.
2.3 Related Work
A lot of work has been done on developing languages that incorporate the notion of streams. Examples are BrookC [12], StreamIt [2, 4, 5], StreamBits [1, 3] (an earlier version of PSPL) and PSPL [13]. What they have in common is their domain-specific programming approach. We also looked at Cryptol [14], a cryptographic language that has well-defined constructs for bit-level manipulations. These languages are briefly outlined in the following sections.
Figure 2.2: API-based abstraction for parallel and reconfigurable array structures [1]
2.3.1 Brook
Brook is an extension of the C language. It adds constructs for data-parallel computing and arithmetic intensity. In this language, memory operations and computation kernels are separated [12]. A computation kernel only has access to local data, and memory is accessed through a special stream construct. Brook is designed for a wide range of stream architectures such as Imagine, TRIPS (the Tera-op, Reliable, Intelligently adaptive Processing System) and RAW (the Reconfigurable Architecture Workstation).
2.3.2 StreamIt
We provide only a brief overview of the StreamIt language; for a detailed description, see [2, 4, 5]. As currently implemented, StreamIt makes use of legal Java syntax. It is a portable language for high-performance signal processing applications and was designed for communication-exposed machines such as RAW. The basic constructs of the language are filters, splitJoins, feedbackLoops and pipelines. Fig 2.3 shows a pipeline.
Figure 2.3: A pipeline [2]
The filter is the unit of computation and has a single input and a single output. It has an init function, which is called at the initialization stage, and a work function, which performs the computations when the filter is in steady state. It also has a prework function, which is executed once between initialization and steady state. The pipeline is the basic construct for composing filters in a network. A splitJoin is used when a stream needs to be split into several independent parallel streams that are later merged by a common joiner. See Fig 2.4. The two types of splitter are Duplicate and RoundRobin. The setSplitter and setJoiner commands are used to specify the splitter and joiner types.
Figure 2.4: A splitjoin [2]
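The difference between the two splitter types can be sketched in Python. This is an illustration only, not StreamIt code: streams are modelled as finite lists, every branch filter is assumed to consume and produce one item at a time, the joiner is assumed to be a simple round-robin joiner, and all function names are hypothetical.

```python
def duplicate_split(stream, n):
    """Duplicate splitter: every branch receives a copy of the whole stream."""
    return [list(stream) for _ in range(n)]


def round_robin_split(stream, n):
    """RoundRobin splitter: items are dealt out to the branches in turn."""
    return [stream[i::n] for i in range(n)]


def round_robin_join(branches):
    """Joiner: take one item from each branch in turn and merge them."""
    merged = []
    for items in zip(*branches):   # stops at the shortest branch
        merged.extend(items)
    return merged


def splitjoin(stream, split, filters):
    """Split the stream, run one filter per branch, then join the results."""
    branches = split(stream, len(filters))
    processed = [[f(x) for x in branch] for f, branch in zip(filters, branches)]
    return round_robin_join(processed)


double = lambda x: 2 * x
negate = lambda x: -x

print(splitjoin([1, 2, 3, 4], round_robin_split, [double, negate]))
# [2, -2, 6, -4]
print(splitjoin([1, 2, 3, 4], duplicate_split, [double, negate]))
# [2, -1, 4, -2, 6, -3, 8, -4]
```

With the round-robin splitter each input item is processed by exactly one branch, so the output has the same length as the input; with the duplicate splitter every branch sees every item, so the joined output is longer.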
The feedbackLoop creates cycles in the stream graph [2]. See Fig 2.5.
Figure 2.5: A feedbackLoop [2]
The major limitation of StreamIt is that it has been designed for one-dimensional data processing; this may not be optimal for constructs that require hierarchical frames of data, e.g. image processing [2]. Also, the current version of StreamIt does not support dynamically varying I/O rates of the filters in the streams [2]. The designers of StreamIt are planning further extensions of the language to provide for these.
2.3.3 StreamBits
StreamBits is a prototype language for baseband application programming [1, 3]. This prototype was designed to address some of the shortcomings of earlier stream languages like StreamIt. It was therefore aimed at:
• Providing program structures that are natural to use for the definition of abstract components.
• Offering primitive types and operators that allow the programmer to efficiently express application-specific computations.
• Expressing bit-level and data-parallel operations more efficiently without considering machine-specific details in the algorithmic implementation.
Like the languages described in sections 2.3.1 and 2.3.2, it is a domain-specific programming language, and it implements the synchronous deterministic unidirectional SPS model (SDU-SPS). It is implemented as a framework in Java [3]. See Figure 2.6. The primary structure of the language is to a large extent based on the StreamIt language structure, but its designers do not consider it identical to StreamIt [3]. The basic stream language constructs are the filter and the pipeline. The filter construct is where the computations are performed, while the pipeline construct is used to organize filters and other stream components in a daisy-chain format. Components are added with the add (component) command. Unlike StreamIt, where components can only operate on a single stream tape, this language supports dual stream tapes. One tape carries the data stream, while the other carries the configuration stream. The configuration tape is used to pass stream reconfiguration parameters to components throughout the application [3]. The filter construct has three execution modes: init, work and configure. The init mode is executed only once, before the first firing of the filter. The work mode is where computations are performed when the filter is working in steady state. The configure mode is a new feature of the prototype that is not found in StreamIt; together with the separate configuration tape, it introduces a great deal more flexibility into the programming of parameter configuration of algorithms [3]. The prototype also implements new stream types and structures that are not supported by StreamIt, such as bitVectors, for bit fields of length n, and vectors, for fine-grained data-parallel types defined in a filter. We refer the reader to [3] for a more detailed review of the prototype.
Figure 2.6: The Framework Structure for StreamBits [1, 3]
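The interplay of the three execution modes and the two tapes can be illustrated with a small Python sketch. It does not use the StreamBits Java framework: the class `GainFilter`, the function `run` and the representation of the configuration tape as a dictionary from firing index to parameter are all assumptions made for illustration.

```python
class GainFilter:
    """Hypothetical filter with init, work and configure modes."""

    def __init__(self):
        self.gain = None

    def init(self):
        """Executed once, before the first firing."""
        self.gain = 1

    def configure(self, param):
        """Executed when a parameter arrives on the configuration tape."""
        self.gain = param

    def work(self, item):
        """Steady-state computation on one item of the data tape."""
        return self.gain * item


def run(filter_, data_tape, config_tape):
    """Fire the filter once per data item.

    config_tape maps a firing index to a new parameter (an assumed encoding).
    """
    filter_.init()
    out = []
    for i, item in enumerate(data_tape):
        if i in config_tape:
            filter_.configure(config_tape[i])
        out.append(filter_.work(item))
    return out


print(run(GainFilter(), [1, 2, 3, 4], {2: 10}))   # [1, 2, 30, 40]
```

The point of the sketch is that the reconfiguration at firing 2 changes the behaviour of the filter without interrupting the data stream, which is the flexibility the separate configuration tape is meant to provide.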
2.3.4 The PSPL Prototype Specification
The PSPL prototype is a more recent version of the StreamBits prototype. PSPL introduces constructs that allow for greater flexibility in bit-level manipulation and stream mapping. A complete PSPL program consists of a main function (a sequential section implemented using a language subset based on C) and a stream program (a parallel data flow section). The main function is used for control and provides functions for interaction with the hardware, such as memory management, I/O and peripheral devices. The stream program provides the constructs for stream manipulation and mapping and implements the processing of compute-intensive kernel functions. The main function and the stream program must be defined inside a program declaration [13]. The stream program implements the structure of a directed flow graph. Vertices in the graph correspond to filters that perform some computation. As in StreamBits, the filters are connected together in pipelines, with one added construct, namely the switch. Switches are used to split and join pipelines into concurrent streams of data [13] and provide mapping functions for more transparent stream manipulation. These splits and joins can be combined to form feedback and stream forwarding loops. The following sections give a brief insight into these stream components.
2.3.4.1 Filter
This is the atomic unit of computation. It is defined using the filter keyword followed by an identifier: filter id (arguments) {/* variables and statements */}. Each filter is connected to two input stream tapes, one for data and one for control, and produces two output stream tapes, one for data and one for control information. Each stream tape is defined with a constant stream rate and a type, and these cannot be changed at run time. The stream interface of a filter is defined using the port keyword and cannot be omitted. A filter has multiple modes of execution, defined by init and process statements. There can be only one init statement in a filter, but several process statements as the need arises. The filter in Listing 2.1 declares two streams within its port statement and uses a temporary variable, tmp, to read data from the input data stream, din, into the output data stream, dout.
Listing 2.1: Filter example.
    filter CRC() {
        port {
            dstream 3,0 int to 1 float;    // input data stream of rate 4 and type int,
                                           // output data stream of rate 1 and type float
            cstream 2,0 byte to 2 byte;    // input control stream of rate 2, type byte,
                                           // output control stream of rate 2, type byte
        }
        int tmp;
        process a (dstream) {
            tmp = din;                     // Read first item
        }

        process b (dstream) {
            tmp += din;                    // Read second item
            tmp += din;                    // Read third item
            dout = tmp;
        }
    }
2.3.4.2 The Switch
Switches are stream objects that are used to rearrange stream flow. A switch is defined using the switch keyword followed by an identifier: switch id (arguments) {/* port declaration and switch map statements */}. Typically, switches achieve the following results when applied to stream flow:
• Splitting
• Joining
• Feed-forward loops
• Feedback loops
Like all stream objects, a switch definition must include a port definition. The switch in Listing 2.2 declares two data streams and two control streams.
Listing 2.2: Switch example.
    switch A(int n) {
        port {
            dstream 10 int a;      // Single stream
            dstream 5 int b[2];    // Multiple streams
            cstream 2 int c;       // Single stream
            cstream 2 int d;       // Single stream
        }
        split {
            /* Map statements depending on the policy desired */
        }
    }
There are only two types of switches: splits and joins. Depending on the arrangement, we can achieve either a feedback loop or a feed-forward loop.
2.3.4.3 Pipeline
The pipeline is used to encapsulate a network of stream objects. Unlike the filter, however, it performs no computations itself. Like other stream objects, it must have a port definition with stream rates and types. Also, only one data stream and one control stream can be connected to a pipeline. It is defined using the pipeline keyword followed by an identifier: pipeline id (arguments) {/* port declarations, object type and map statement declarations */}. The pipeline in Listing 2.3 has three stream objects declared within its body, namely filters F1 and F3 and pipeline P. It uses the din and cin keywords to map a data stream and a control stream to F1. F1 maps these streams to P, and P maps the streams to F3. The dout and cout keywords provide the streams with an exit from the pipeline.
Listing 2.3: Pipeline example.
    pipeline A(int x, int y) {
        port {
            /* Stream declarations */
        }
        /* stream objects declarations */
        // map statement declarations
        map {
            din, cin => F1;
            F1 => P;
            P => F3;
            F3 => dout, cout;
        }
    }
The map statements are used to connect streams between stream objects in the pipeline and the keyword for the map block is map. For a more detailed review of the Prototype, we refer the reader to [13].
2.3.5 Cryptol
Cryptol, a domain-specific language for cryptographic applications, was developed at Galois Connections, Inc. in consultation with expert cryptographers at America’s National Security Agency. Cryptol is a high-level language for cryptography in that it expresses cryptographic concepts in a formal but portable manner. Cryptography is the algorithmic manipulation of data, and Cryptol deals with the manipulation of finite or infinite sequences of data. In Cryptol, words, blocks and streams are all treated as sequences of data. This approach enhances portability since the definition of a sequence of data is independent of the actual size of any particular number of bits. If an algorithm requires 70-bit words, the Cryptol specification will use 70-bit words regardless of the architecture on which the program will eventually run. This makes it easier for the programmer to write the specification, to change the algorithm as the need arises, and to experiment with different word or block sizes [14]. This independence of sequence size leads to uniformity in the primitives used to manipulate the sequences. The primitives manipulate data regardless of whether it is presented as words, blocks of data, or streams. Sequence size independence also leads to uniform control structures in the language. Control at all levels of data is expressed using sequence comprehensions, and combining these with recursive definitions allows us to express the sophisticated recurrence relations that appear in cryptographic algorithms [14]. All this would make for a great deal of confusion were it not for the powerful type checking that goes on in the Cryptol compiler. The type checking keeps track of the various word sizes and sequences, making sure that incompatible types are not mixed at any point in the program design [14].
We have studied Cryptol in order to understand how specific primitives and operators for data parallel and bit level manipulations can function, thereby providing for improved machine level abstraction and language portability. For more detailed information on Cryptol we direct the reader to [14].
3 Approach
This chapter describes the techniques that will be followed in generating an abstract syntax representation of the PSPL programming language. It is the first step towards translating the language into executable code. As was outlined in section 1.2, the compiler is made up of several phases. These phases are all implemented as software modules that must be properly interfaced in order to produce the desired outcome: translating code written in PSPL into a format that can be executed by the processor, while at the same time checking for possible errors and reporting them back to the programmer. See table 3.1.

Table 3.1: Description of compiler phases [7]

Compiler Phase          Description
Scanner                 Breaks the source file into individual words or tokens.
Parse                   Analyzes the phrase structure of the program.
Semantic Actions        Constructs the abstract syntax tree from concrete classes.
Semantic Analysis       Determines what each phrase means, relates uses of variables to their
                        definitions, checks types of expressions and requests translation of
                        each phrase.
Translate               Generates the intermediate representation trees (IR trees), which are
                        independent of any particular source language or target machine
                        architecture.
Canonicalize            Hoists side effects out of expressions and cleans up conditional
                        branches, for the convenience of the next phases.
Instruction Selection   Groups the IR-tree nodes in ways that correspond to the actions of the
                        target-machine instructions.
Control Flow Analysis   Analyzes the sequence of instructions into control flow graphs that show
                        all the possible flows of control which the program might follow when
                        executing.
Dataflow Analysis       Gathers information about the flow of information through variables of
                        the program.
Register Allocation     Chooses a register to hold each of the variables and temporary values
                        used by the program.
Code Emission           Replaces the temporary names in each machine instruction with machine
                        registers.
The interface between these modules is as important as the algorithms that are implemented inside them, and to be able to describe these interfaces concretely, it is important to write them in a programming language. For this project we chose Java, an object-oriented language, because it is safe, meaning that it has strict type rules, and because it has garbage collection, which means that it can reclaim memory from objects that are no longer in use while the program is running. It is important to note that this modular approach to compiler design allows for component reuse. Changes made to one module will not affect the way the other modules operate. In the following sections we outline all the steps that will be taken to achieve the desired goal, which is to design a parser for PSPL.
3.1 Grammars
A language is a meaningful set of sentences, each sentence consisting of a sequence of words and each word consisting of a sequence of symbols or characters. Every formal language has a grammar, which we can define as a set of rules that must be followed in order to distinguish between correct and incorrect sentences. Therefore, we say that the grammar describes the language. For a programming language there is the added requirement that the grammar has to be context free in order to allow for easy parsing. Such languages are usually processed by compilers to ensure that they conform to these strict rules; therefore, we say that they are formal languages. PSPL is one such formal language, and the rules of PSPL are described as its syntax. The syntax describes the phrase structure of the language, and we will use regular expressions to describe the words which contribute towards forming the syntax. The context free grammar (CFG) of a language is a set of rules describing how to form sentences in the language. The CFG consists of four parts: T, NT, S, and P.
• T, the set of terminal symbols representing words of the language (tokens).
• NT, the set of non-terminal symbols (syntactic categories, e.g. the types of sentences).
• S, the start symbol or goal, which is a non-terminal standing for the syntactic category whose sentences we are describing.
• P, the set of productions. Each non-terminal is mapped to a production.
If we have a rule like:

AB → aba AB | aba | ;

then AB → aba AB is a production (P) and derives sentences built from the word aba followed by another AB, which could be an aba or just an empty entity. AB is the non-terminal (NT) while aba is the terminal (T). The start symbol (S) is AB. We also introduce additional rules in order to reduce ambiguity and have one parse tree per given sentence. This is done to ensure that the writer can be sure of how the compiler interprets the sentence, and it also reduces the need for too many parentheses. Two conventions are introduced:
• Associativity
• Precedence
All these rules and conventions will be combined by the parser to build a parse tree that recognizes the language of the grammar. The Backus-Naur Form (BNF) is one of the styles used to write down the CFG of a language. Another style is called the Van Wijngaarden Form. For this project, we will be using the BNF notation style. This form uses angle brackets (<>) to enclose non-terminals, and ::= and | for the possible productions of a non-terminal. For instance:

<AB> ::= aba <AB> | aba | <empty>

This shows that AB is a non-terminal and that the possible products of the rule are aba and another AB.
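To make the example rule concrete, the following is a minimal Java sketch of a recognizer for the grammar AB → aba AB | aba | (empty). It is not part of the PSPL implementation; the class and method names (AbRecognizer, accepts, parseAB) are our own illustrative choices. Each call to parseAB applies one production, so the recursion mirrors the recursion in the rule.

```java
// Illustrative recognizer for the example rule  AB -> aba AB | aba | (empty).
// All names here are hypothetical and chosen only for this sketch.
final class AbRecognizer {

    // Returns true if the whole input can be derived from the start symbol AB.
    static boolean accepts(String input) {
        return parseAB(input, 0) == input.length();
    }

    // Tries to derive AB starting at position pos and returns the position
    // reached after consuming as many "aba" terminals as possible.
    private static int parseAB(String input, int pos) {
        if (input.startsWith("aba", pos)) {
            // Production AB -> aba AB. The alternative AB -> aba is the
            // special case in which the nested AB derives the empty string.
            return parseAB(input, pos + 3);
        }
        // Production AB -> (empty): consume nothing.
        return pos;
    }

    public static void main(String[] args) {
        System.out.println(accepts("abaaba")); // true
        System.out.println(accepts("abab"));   // false
    }
}
```

A sentence is accepted only if the derivation consumes the entire input, which is why "abab" is rejected even though it starts with a valid word.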
3.2 Syntactic analysis and tools
Scanners and parsers are the first two parts of a compiler. They are software modules that are very tedious to write with many cases to take care of and many possibilities for error. That is why we need tools to be able to generate them automatically. In the following sections we will be discussing scanners, scanner generators, parsers and parser generators.
3.2.1 Scanner
A scanner is a program that scans the sequence of characters presented to the compiler, identifying the words of the language. Such words include keywords, identifiers, punctuation, operators, etc. Separators such as blank spaces between the words (often called white space), comments and newlines are discarded. It would further complicate the parser if white space and comments were to be accounted for, and this is the main reason why the two phases are separated [7]. In a compiler the scanner does the following things:
• Takes in a sequence of characters at its input and presents a sequence of tokens at its output, thereby facilitating the analysis of the phrases of the program.
• Interrupts the compilation process when there is a lexical error and reports the error and its position back to the programmer.
• Uses regular expressions to recognize the words of a sentence. Regular expressions give finite descriptions of possibly infinite sets of strings.
Programs that generate scanners automatically are known as scanner generators. They take in regular expressions and apply complex algorithms to convert them first to Nondeterministic Finite Automata (NFA) and finally to Deterministic Finite Automata (DFA). It is from the DFA that the words of the language are picked out, since every regular expression has a DFA that recognizes its language. For more on this we refer the reader to [7]. JFlex [15] and JavaCC [7] are scanner generators that generate lexical analyzers written in Java. For this project we have decided on using JFlex because it is based on Java and we are familiar with it.
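The three duties of the scanner listed above can be sketched by hand in a few lines of Java. This is only an illustration built on java.util.regex, not the scanner that JFlex generates; the class name MiniScanner, the token names and the small set of lexemes it knows about are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hand-written illustration of a scanner; not the JFlex-generated psplLexer.
final class MiniScanner {

    // One alternative per kind of lexeme; named groups tell us which one matched.
    // MAPOP is listed before PUNCT so that "=>" is not split into "=" and ">".
    private static final Pattern LEXEME = Pattern.compile(
          "(?<SKIP>\\s+|//[^\\n]*|/\\*(?s:.*?)\\*/)"   // white space and comments
        + "|(?<ID>[A-Za-z][A-Za-z0-9_]*)"              // identifiers
        + "|(?<INT>[0-9]+)"                            // decimal integer literals
        + "|(?<MAPOP>=>)"                              // map operator
        + "|(?<PUNCT>[,;{}()=])");                     // single-character tokens

    // Converts a sequence of characters into a sequence of token names.
    static List<String> scan(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = LEXEME.matcher(source);
        int pos = 0;
        while (pos < source.length()) {
            m.region(pos, source.length());
            if (!m.lookingAt()) {
                // Lexical error: stop and report the offending position.
                throw new IllegalArgumentException(
                    "Lexical error at offset " + pos + ": '" + source.charAt(pos) + "'");
            }
            if (m.group("ID") != null) {
                tokens.add("ID");
            } else if (m.group("INT") != null) {
                tokens.add("INTEGER_LIT");
            } else if (m.group("MAPOP") != null) {
                tokens.add("MAPOP");
            } else if (m.group("PUNCT") != null) {
                tokens.add(m.group("PUNCT"));
            }
            // Otherwise the SKIP group matched: separators are discarded.
            pos = m.end();
        }
        return tokens;
    }
}
```

For example, MiniScanner.scan("a => b, c; // note") yields the tokens ID, MAPOP, ID, ",", ID and ";", while an input containing an unknown character such as "$" raises an error that names the offset.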
3.2.2 Parser
A parser is the second phase of the compiler and is built from the CFG. The parser takes in the sequence of tokens generated by the scanner, matches it against the CFG, evaluates the matching attributes, and generates an abstract representation of the source program. The question is how a parser goes about doing all this. To answer this we need to understand how the parser works. The symbols appear to the parser as a sequence of tokens, some of them with values attached to them, and it is the job of the parser to build a parse tree based on what it gets. The parse tree is generated by connecting each symbol to the one that derived it based on the rules of the grammar. The parser is able to do this using either of two strategies:
• Top-Down
• Bottom-Up
In the Top-Down strategy the parser starts with the start symbol and tries to match the input. The parser examines the input from left to right in one pass, applying a recursive descent technique which expands the non-terminals (starting at the start symbol) into their equivalent right-hand side symbols until the input is matched. Given the grammar:

bexp ::= bexp | conj
       | conj
conj ::= conj & neg
       | neg
neg  ::= atom
atom ::= True
       | False
       | ID
       | ( bexp )

(in the first rule for bexp, the | between bexp and conj is a terminal symbol), and given the input:

ID | TRUE & ID

a Top-Down parse will look like:

bexp
bexp | conj
bexp | conj & neg
conj | neg  & atom
neg  | atom & ID
atom | TRUE & ID
ID   | TRUE & ID
An example of an application of this strategy is the LL(k) parser. LL(k) stands for left-to-right parse, leftmost derivation, k-token lookahead (k being an integer) [7]. LL(1) parsers can be generated automatically, but simple variants can also be written by hand, applying the recursive descent technique. In the Bottom-Up strategy the reverse is the case. The parser examines the input, from left to right, and tries to reduce it to the start symbol. Two actions are of great importance to us here:
• Shift the next token from the input.
• Reduce some symbols from the input to a non-terminal according to a grammar rule.
For the input given above, the Bottom-Up parse can look like this (the parser starts at the bottom row and works upwards):

bexp
bexp | conj
bexp | conj & neg
bexp | conj & atom
bexp | conj & ID
bexp | neg  & ID
bexp | atom & ID
bexp | TRUE & ID
conj | TRUE & ID
neg  | TRUE & ID
atom | TRUE & ID
ID   | TRUE & ID
The parser builds a frontier in the parse tree by either taking one more token from the input sequence or reducing the sequence of tokens already consumed into a non-terminal based on one of the rules of the grammar. Consequently, starting from ID, the frontier reduces the ID to an atom and the atom to a neg, until it gets to a bexp. While this goes on, the parser keeps looking ahead to the next token and makes decisions based on what it sees without first consuming the token. An example of the application of the Bottom-Up strategy is the LR(k) parser. LR(k) stands for left-to-right parse, rightmost derivation, k-token lookahead [7]. It is more powerful than the LL(k) technique. Unlike the LL(k) parser, which must predict which production to use having seen only the first k tokens of the right-hand side, the LR(k) parser can postpone the decision of which rule to use until it has seen the tokens which match the entire right-hand side of the production and even beyond.
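The recursive descent technique mentioned for the Top-Down strategy can be sketched in Java for the bexp grammar. Two assumptions are made here. First, the left-recursive rules (bexp ::= bexp | conj and conj ::= conj & neg) are rewritten as loops, since a recursive descent parser cannot handle left recursion directly. Second, the parser simply returns a fully parenthesized string instead of building tree nodes. The class name BexpParser and the token spellings are illustrative only.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative recursive descent parser for the bexp grammar.
// Left recursion is replaced by iteration:
//   bexp ::= conj ('|' conj)*      conj ::= neg ('&' neg)*
final class BexpParser {
    private final List<String> tokens;
    private int pos = 0;

    private BexpParser(List<String> tokens) {
        this.tokens = tokens;
    }

    // Parses the token sequence and returns a fully parenthesized rendering.
    static String parse(String... tokens) {
        BexpParser p = new BexpParser(Arrays.asList(tokens));
        String tree = p.bexp();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("Unexpected token: " + p.peek());
        }
        return tree;
    }

    // One token of lookahead, without consuming it.
    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<eof>";
    }

    private String bexp() {
        String left = conj();
        while (peek().equals("|")) {
            pos++;
            left = "(" + left + " | " + conj() + ")";
        }
        return left;
    }

    private String conj() {
        String left = neg();
        while (peek().equals("&")) {
            pos++;
            left = "(" + left + " & " + neg() + ")";
        }
        return left;
    }

    private String neg() {
        return atom();
    }

    private String atom() {
        String t = peek();
        if (t.equals("TRUE") || t.equals("FALSE") || t.equals("ID")) {
            pos++;
            return t;
        }
        if (t.equals("(")) {
            pos++;
            String inner = bexp();
            if (!peek().equals(")")) {
                throw new IllegalArgumentException("Expected ')' but found " + peek());
            }
            pos++;
            return inner;
        }
        throw new IllegalArgumentException("Unexpected token: " + t);
    }

    public static void main(String[] args) {
        // Prints (ID | (TRUE & ID))
        System.out.println(parse("ID", "|", "TRUE", "&", "ID"));
    }
}
```

The nesting of the output shows that & binds more tightly than |, which follows from conj being lower in the grammar than bexp.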
3.2.3 Parser generator
There are many parser generators that generate parsers written in Java. Some of the well-known ones are CUP, Antlr, JavaCC, SableCC, Coco/R, BYACC/Java, Jikes, and Jacc [16]. For this project we will be using Jacc. We are interested in the following (among other) differences between Jacc and the other parser generators mentioned above:
• It is a pure Java implementation that is portable and runs on many Java platforms.
• It generates a bottom-up (shift-reduce) parser with disambiguating rules.
• It makes modest additions to help users understand and debug generated parsers, including HTML output and tests for conflicts.
The basic structure of the Jacc input file is as follows:

Directives Section
%%
Rules Section
%%
Additional Code Section
The Directives section can be used to customize aspects of the generated Java parser. It is here that the interface between lexical analysis and parsing is specified, together with descriptions of the properties of the terminal and non-terminal symbols in the input grammar. The Rules section specifies the CFG for the language that the generated parser is to recognize. It also associates each production with a fragment of code, called the semantic action, that the parser will execute each time that the production is reduced. The semantic actions access the semantic values corresponding to each terminal and are used either to construct the parse tree or to perform other computations as may be required. The Additional code section provides code that will be copied into the body of the parser class. Jacc does not make any attempt to check if this code is valid. Any syntax error in this part will only be discovered when compiling the parser [16].
3.2.4 Semantic actions
A compiler must do something useful after it has recognized that a sentence belongs to the language of a grammar. This is where semantic actions come into play. Each terminal and non-terminal may be associated with its own semantic value, and each non-terminal with its own syntactic category. For instance, for the rule:

AB → aba AB

the semantic action must return a value whose type is associated with AB. To do this, it needs an abstract syntax representation of the parsed sections of the phrase. This provides the parser with data structures that can be used to store source code attributes in a way that ensures that the phrase structure of the source code is evident. This works recursively until we eventually get a value that represents the start symbol, the non-terminal AB. The compiler uses the abstract syntax to build a parse tree with the start symbol as its root, the non-terminals as its interior nodes, and the terminals as its leaves. The compiler will need concrete classes that will be used to construct the parse tree from the abstract syntax. We refer to this parse tree (which is a data structure in Java) as the abstract syntax tree (AST). The AST data structure can now be passed to the semantic analysis phase of the compiler for further analysis without bothering about the ambiguities in the grammar. The AST serves as the interface between the parser and the later phases of the compiler. In presenting the concrete classes that will be used to construct the AST, we will use a syntax separate from interpretation style of programming, which has been fully explained in [7]. Each of the abstract syntax classes is expected to have a constructor for building syntax trees and an evaluation method that will return the value of the expression. In the syntax separate from interpretation style, the code for the evaluation method is not defined in the abstract syntax class; rather, it resides in separate structures called visitors. The visitor class implements the interpretation. Each abstract syntax class provides an accept method, which passes control to the appropriate visit method of the visitor, and that method implements the specific interpretation required. With the Visitor pattern it is easy to add new interpretations to the abstract classes.
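The syntax separate from interpretation style can be illustrated with a small Java sketch. It is not taken from the PSPL syntaxtree or visitor packages; the classes BoolExp, BoolLit, AndExp, OrExp and the two visitors are hypothetical and reuse the boolean expressions of section 3.2.2. The node classes hold structure only, and each interpretation lives in its own visitor.

```java
// Abstract syntax: the node classes only store structure and accept a visitor.
abstract class BoolExp {
    abstract void accept(BoolVisitor v);
}

final class BoolLit extends BoolExp {
    final boolean value;
    BoolLit(boolean value) { this.value = value; }
    void accept(BoolVisitor v) { v.visit(this); }
}

final class AndExp extends BoolExp {
    final BoolExp left, right;
    AndExp(BoolExp left, BoolExp right) { this.left = left; this.right = right; }
    void accept(BoolVisitor v) { v.visit(this); }
}

final class OrExp extends BoolExp {
    final BoolExp left, right;
    OrExp(BoolExp left, BoolExp right) { this.left = left; this.right = right; }
    void accept(BoolVisitor v) { v.visit(this); }
}

// The visitor interface has one visit method per syntax-tree class.
interface BoolVisitor {
    void visit(BoolLit n);
    void visit(AndExp n);
    void visit(OrExp n);
}

// First interpretation: evaluate the expression.
final class EvalVisitor implements BoolVisitor {
    private boolean result;

    static boolean eval(BoolExp e) {
        EvalVisitor v = new EvalVisitor();
        e.accept(v);
        return v.result;
    }

    public void visit(BoolLit n) { result = n.value; }

    public void visit(AndExp n) {
        n.left.accept(this);
        boolean l = result;
        n.right.accept(this);
        result = l && result;
    }

    public void visit(OrExp n) {
        n.left.accept(this);
        boolean l = result;
        n.right.accept(this);
        result = l || result;
    }
}

// Second interpretation, added without touching the node classes: print the tree.
final class ShowVisitor implements BoolVisitor {
    private final StringBuilder out = new StringBuilder();

    static String show(BoolExp e) {
        ShowVisitor v = new ShowVisitor();
        e.accept(v);
        return v.out.toString();
    }

    public void visit(BoolLit n) { out.append(n.value ? "TRUE" : "FALSE"); }
    public void visit(AndExp n) { binary(n.left, " & ", n.right); }
    public void visit(OrExp n) { binary(n.left, " | ", n.right); }

    private void binary(BoolExp l, String op, BoolExp r) {
        out.append('(');
        l.accept(this);
        out.append(op);
        r.accept(this);
        out.append(')');
    }
}
```

Adding a third interpretation, such as a type checker, would only require another class implementing BoolVisitor; the syntax classes stay unchanged, which is the property the text above relies on.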
4 Results
The implementation of the PSPL parser involved the design of the BNF, fixing the syntax for PSPL, the implementation of the specification files for both JFlex and Jacc, which resulted in a scanner and a parser, and the design of the abstract classes that will be used by the parser to generate an AST for any program written in PSPL. These abstract classes are contained in two packages: syntaxtree and visitor. The following sections will outline some of the results.
4.1 The BNF
The PSPL BNF presents the CFG. An abridged version is outlined below, while the rest of the BNF can be found in appendix section G.

Goal              ::= ObjectDec
ObjectDec         ::= PipelineDec
                    | FilterDec
                    | SwitchDec
PipelineDec       ::= pipeline id '(' FormalList ')' '{' port '{' StreamDecList '}'
                      ObjectTypeDecList map '{' MapStmList '}' '}'
ObjectTypeDecList ::= ObjectTypeDec ObjectTypeDecRest
ObjectTypeDec     ::= Type id '(' ExpList ')'
ObjectTypeDecRest ::= ';' ObjectTypeDec
FilterDec         ::= filter id '(' FormalList ')' '{' port '{' StreamDecList '}'
                      VarDecList InitDec ProcedureDecList '}'
InitDec           ::= init '{' StmList '}'
ProcedureDecList  ::= ProcedureDec ProcedureDecRest
ProcedureDec      ::= procedure id '(' StreamType ')' '{' StmList '}'
ProcedureDecRest  ::= ProcedureDec
StreamDecList     ::= StreamDec StreamDecRest
StreamDec         ::= StreamType ExpList Type
                    | StreamType ExpList Type id 'to' expr Type
                    | StreamType ExpList Type 'to' expr Type
                    | StreamType ExpList Type id
StreamDecRest     ::= ';' StreamDec
StreamType        ::= dstream
                    | cstream
                    | void
The BNF shows that the goal of any PSPL program is to be an ObjectDec. This is specified in the very first rule of the CFG.

Goal ::= ObjectDec

The ObjectDec can be either a pipeline, a filter or a switch. These three are the only components that can stand alone in a PSPL program.

ObjectDec ::= PipelineDec
            | FilterDec
            | SwitchDec
The other components like InitDec and StreamDec must be declared inside the body of an ObjectDec. There can be only one ObjectDec per PSPL file.
4.2 Specification files
These are files that generate the scanner and the parser automatically. The following subsections outline the specification files that have been used and the lexical and syntactic issues that have been overcome in order to have a working scanner and parser respectively.
4.2.1 The JFlex specification file (Scanner generator)
The specification file for JFlex, pspl.flex, will generate the scanner for PSPL. An abridged version is presented below:

Listing 4.1: JFlex specification.
/*
 * Generating Lexical Analyzer for Portable Stream Language Program.
 * Master Thesis Project
 *
 */
%%
/* Directives and definitions */
// directives:
%debug
/* The name of the lexer class */
%class psplLexer
/* The lexer class implements this interface, which consists of all the integer codes for the tokens */
%implements psplTokens
/* The type used for the tokens */
%int
%unicode
/* To enable char counting */
%char
/* To enable line counting */
%line

/* To enable column counting */
%column

// code in PsplLexer:
%{
    Object semanticValue;
    /* Methods implementation for line and column counting where syntactic error occurs */

    public int linenr() {
        return yyline;
    }
    public int columnnr() {
        return yycolumn;
    }

    /* The token type. */
    int token;
%}

// definitions:
The complete JFlex implementation can be seen in the appendix section B.
4.2.2 Lexical issues
The first part of the JFlex specification (before the first %%) is copied into the Java file before the class is declared. The second part, the directives and definitions section, is where JFlex options have been used to customize the generated scanner. We also declare macros that are used in the rules section of the specification file. These macros use regular expressions to describe the words of the language. Some of the regular expressions we have used are described as follows:
• Identifiers: A sequence of letters, digits, and underscores, starting with a letter. Lower case characters are distinguished from upper case characters.
• Integer Literals: A sequence of digits without an exponential part; this includes decimal, hexadecimal and octal integer literals.
• Float Literals: A sequence of digits with decimal digits or an exponential part.
• Comments: Start with /* and end with */, start with /** and end with **/, or start with //.
The options are a set of JFlex commands that customize the generated lexer. Some options that have been used and their descriptions are:
• %debug: Includes a main method in the resulting lexer.
• %class psplLexer: Gives the lexer a name (psplLexer).
• %implements psplTokens: Specifies the interface that will be implemented by the lexer. This interface will be supplied by the parser.
• %int: Specifies the type used for the tokens.
• %unicode: Specifies the encoding of chars in the source files.
• %line: Enables line counting. Used for locating errors.
• %column: Enables column counting. Used for locating errors.
• %function next: Forces the scan method to get the specified name. If this option is not used, the scan method will get the name yylex().
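The word classes described above can be approximated with java.util.regex patterns as follows. These patterns are our own reading of the prose descriptions and are not the macros from pspl.flex (which are given in appendix section B); the class name LexicalClasses, the method classify and the token names are assumptions made for this sketch.

```java
import java.util.regex.Pattern;

// Approximate regular expressions for the word classes described in the text.
// They are assumptions for illustration, not the macros used in pspl.flex.
final class LexicalClasses {

    // Letters, digits and underscores, starting with a letter (case-sensitive).
    static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z][A-Za-z0-9_]*");

    // Hexadecimal, octal (leading 0) or decimal integer literals.
    static final Pattern INTEGER_LIT =
        Pattern.compile("0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*");

    // Digits with a fractional part, an exponent, or both.
    static final Pattern FLOAT_LIT =
        Pattern.compile("[0-9]+\\.[0-9]+([eE][+-]?[0-9]+)?|[0-9]+[eE][+-]?[0-9]+");

    // Block comments (including the /** ... **/ form) and line comments.
    static final Pattern COMMENT =
        Pattern.compile("/\\*(?s:.*?)\\*/|//[^\\n]*");

    // Returns the name of the first class that matches the whole lexeme.
    static String classify(String lexeme) {
        if (IDENTIFIER.matcher(lexeme).matches()) return "ID";
        if (INTEGER_LIT.matcher(lexeme).matches()) return "INTEGER_LIT";
        if (FLOAT_LIT.matcher(lexeme).matches()) return "FLOAT_LIT";
        if (COMMENT.matcher(lexeme).matches()) return "COMMENT";
        return "ERROR";
    }
}
```

With these assumed patterns, classify("0x1F") and classify("017") both give INTEGER_LIT, classify("1e-3") gives FLOAT_LIT, and classify("_x") gives ERROR because an identifier must start with a letter.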
4.2.3 Jacc specification files (Parser generator)
The specification file for Jacc, pspl.jacc, will generate the parser for PSPL. It is presented below:

/*
 *
 * Jacc specification file
 * a parser for Portable Stream Programming Language
 * (Year 2006)
 */

%{
/*
 * import the abstract syntax classes.
 */
import syntaxtree.*;
%}
%next nextToken()
%get lexer.token
%semantic Object : lexer.semanticValue

/*
 * Defining terminals with types
 */
%token ID
%token INTEGER_LIT
%token FLOAT_LIT
%token TRUE FALSE IF ELSE
%token WHILE DIN CIN COUT DOUT SCHEDULE FOR NULL
%token STREAMPROG ERROR RETURN BOOLEAN SCHEDULE MAP SPLIT PROGRAM WORK UMINUS
%token PARALLELTOP SHIFTLEFTOP SHIFTRIGHTOP INCREMENTSIGN DECREMENTSIGN AND COLON
%token ADDASSIGNOP ARRELEMENTMULT ARRELEMENTDIV ARRAYTRANSPOSE MATRIXTRANSPOSE
%token INT FLOAT BYTE BOOLEAN BITVEC VOID MAPOP SWITCH CONCOP PORT JOIN SPLIT
%token LOOP SEQ FILTER INIT PIPELINE TO PROCEDURE DSTREAM CSTREAM NOTEQSIGN AND
%token MAP FLOAT_LIT CONSECEQSIGN LOGICOP NOTEQSIGN AND BREAK ARRELEMENTADD ARRELEMENTSUB
%token '+' '-' '*' '/' '%'
%token '(' ')' '=' ','
%token '[' ']' '^' '~'
%token '&' '|' '<' '>' '#'
%token '{' '}' '(' ')' ';' '.' '!' '?'

/*
 * Precedence and associativity of operators
 */

%left AND
%nonassoc '<'
%left '+' '-'
%left '*'
%left UMINUS
%%
/* Grammar rule with actions to generate abstract syntax tree */

The full version, complete with its interface, is shown in appendix section C.
4.2.4 Syntactic issues
As described in subsection 3.2.3, a typical Jacc specification file is divided into three sections: a directives section, a rules section and an additional code section. The directives section, which appears just before the first %%, is where we specify the instructions that Jacc uses to customize the generated Java file, specify the token interface, and describe the terminal and non-terminal symbols of the input grammar. The following customizations are made in our Jacc file:
• %{import syntaxtree.*;%}: Specifies that the syntaxtree package be imported into the generated Java file. This and any other code included in the braces are placed at the beginning of the Java file after the package declaration.
• %class psplParser: Specifies the name of the generated Java file.
• %interface psplTokens: Specifies the name of the token interface between the generated parser and the scanner. This interface keeps a record of the numeric values for input tokens.
• %next nextToken(): The next directive specifies the method that is used by the parser to call on the scanner to return the integer value for the next token.
• %get lexer.token: The get directive specifies the code sequence that is used to get the integer code for the current token without moving the scanner to a new token.
• %semantic Object: lexer.semanticValue: The semantic directive specifies the type of the semantic values that are passed as tokens from the scanner or constructed while parsing when reduce actions are executed.
• %token STREAMPROG ERROR RETURN BOOLEAN SCHEDULE MAP SPLIT PROGRAM WORK UMINUS: This is an example of a directive that we used to define the terminal symbols that have been used in the grammar. The terminals are usually specified in capital letters.
• %left, %right, %nonassoc: These directives are used to declare a fixity (a combination of precedence and associativity) for the tokens that are mentioned. For example:
  – %left AND
  – %nonassoc '<'

% ++ -- -> => == += .* ./ .+ .- .' ' : != ^ ~ - / * = && & | > < [ ] { } ( ) ; . , ! ? +

These are further grouped into operators and unary operators as follows:

Operators         op      || > % .* ./ .+ .- = && & | > < - / * +
Unary Operators   Unop    + - ^ ~ ++ -- .' ' !

The terminal symbols are outlined as follows: INTEGER_LIT TRUE FALSE IF ELSE WHILE DIN CIN COUT DOUT SCHEDULE FOR STREAMPROG ERROR RETURN BOOLEAN MAP SPLIT PROGRAM WORK PARALLELTOP SHIFTLEFTOP SHIFTRIGHTOP INCREMENTSIGN DECREMENTSIGN AND COLON ADDASSIGNOP ARRELEMENTMULT ARRELEMENTDIV ARRAYTRANSPOSE MATRIXTRANSPOSE INT FLOAT BYTE BOOLEAN BITVEC VOID MAPOP SWITCH CONCOP PORT JOIN SPLIT LOOP SEQ FILTER INIT PIPELINE TO PROCEDURE DSTREAM CSTREAM NOTEQSIGN AND MAP FLOAT_LIT CONSECEQSIGN LOGICOP NOTEQSIGN AND BREAK ARRELEMENTADD ARRELEMENTSUB

Tests were carried out while fixing the grammar to ensure that it produced the desired effect. The tests were carried out without generating a parser file.
4.2.5 The interface It is the job of the interface to form a link between the parser and the scanner and also report errors that occur at this stage of the compiler. The code in the interface is copied into the parser class that will be generated by the parser generator.
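The link between the parser and the scanner can be pictured with the following Java sketch. It is a stand-in written for this explanation and is not the interface code from appendix section C: the names DemoTokens, DemoLexer and DemoParserSide, the token codes and the whitespace-based splitting are all assumptions. The sketch only mirrors the roles of the directives discussed earlier: a method that advances to the next token, a field holding the current token code, and a field holding the semantic value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical token codes shared by scanner and parser (the role played by psplTokens).
interface DemoTokens {
    int ENDINPUT = 0;
    int ID = 1;
    int MAPOP = 2;
    int ERROR = 3;
}

// Stand-in for the generated scanner: walks over whitespace-separated words.
final class DemoLexer implements DemoTokens {
    private final String[] words;
    private int index = 0;
    int token;             // code of the current token, read by the parser
    Object semanticValue;  // value attached to the current token, if any

    DemoLexer(String source) {
        String trimmed = source.trim();
        this.words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Advances to the next token and returns its code.
    int next() {
        if (index >= words.length) {
            token = ENDINPUT;
            semanticValue = null;
            return token;
        }
        String w = words[index++];
        if (w.equals("=>")) {
            token = MAPOP;
            semanticValue = null;
        } else if (w.matches("[A-Za-z][A-Za-z0-9_]*")) {
            token = ID;
            semanticValue = w;
        } else {
            token = ERROR;
            semanticValue = w;
        }
        return token;
    }
}

// Stand-in for the interface code that is copied into the parser class.
final class DemoParserSide implements DemoTokens {
    private final DemoLexer lexer;
    final List<String> errors = new ArrayList<>();

    DemoParserSide(DemoLexer lexer) {
        this.lexer = lexer;
    }

    // Plays the role of the method named by the next directive.
    int nextToken() {
        int t = lexer.next();
        if (t == ERROR) {
            errors.add("Unrecognized input: " + lexer.semanticValue);
        }
        return t;
    }

    // Drives the scanner to the end of the input and collects identifier values.
    List<Object> collectIdentifiers() {
        List<Object> ids = new ArrayList<>();
        while (nextToken() != ENDINPUT) {
            if (lexer.token == ID) {
                ids.add(lexer.semanticValue);
            }
        }
        return ids;
    }
}
```

In this sketch the parser side never looks at characters; it sees only integer codes and, for identifiers, the attached semantic value, and it records an error message whenever the scanner reports an unrecognized word.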
4.3 Syntax tree package
The syntax tree package contains all the Java classes that will be used by the parser to construct the AST. They all have an accept method that takes in a visitor and requests that the visitor return its specific interpretation through the visit method. For all constructors of the members of this package, their syntactic categories and their abstract syntax tree production rules, we refer the reader to appendix sections I and A.
4.4 Visitor package
The members of the visitor package are used to implement the syntax separate from interpretation style of programming. The members of this package are listed below:
• ImplementSTypedVisitor
• interface STypedVisitor
• interface Visitor
• ImplementVisitor
The visitor is passed as an argument to the accept method of an abstract class. It provides a visit method for each syntax-tree class. The visitor interface is outlined here:

public interface Visitor {
    public void visit(StreamProgram n);
    public void visit(PipelineDec n);
    public void visit(FilterDec n);
    public void visit(SwitchDec n);
    public void visit(ProcedureDec n);
    public void visit(Formal n);
    public void visit(InitDec n);
    public void visit(MapStatement n);
    public void visit(IntegerType n);
    public void visit(FloatType n);
    public void visit(ByteType n);
}

We refer the reader to appendix sections D, E, H and F for the complete implementation of the members of the visitor package.
4.5 The program components
The visitor package consists of a visitor pattern designed to perform operations on the elements of the abstract syntax (data structures). Each node of the syntax tree accepts a visitor and asks it to return a specific interpretation of the node by passing itself as an argument when calling the visit method of the visitor class. The call that is made depends on the type of the visitor and on the abstract syntax tree node involved. Fig 4.1 shows the contents hierarchy of the syntax tree package and the visitor package.
4.5.1 StreamPrintVisitor (PrettyPrintVisitor)
This class implements the visitor. It is used to check that the generated abstract syntax tree is correct.
4.5.2 Stream typed visitor (STypedVisitor)
In the abstract syntax tree, there are abstract classes (and their sub-classes) which represent types (expressions, statements, etc.). The sub-classes extend the abstract classes. The generic type helps to parameterize these types of data structures. STypedVisitor has a visit method that returns a value of type T. See Fig 4.2 for the interaction between the Jacc and JFlex files and the members of the syntax tree package.
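A visitor whose visit methods return a value of type T can be sketched in Java as follows. This is not the STypedVisitor from the appendix; the node classes (TypeNode, IntTypeNode, FloatTypeNode, ArrayTypeNode) and the two visitors are hypothetical and only illustrate how one generic interface supports interpretations with different result types.

```java
// Hypothetical syntax classes; each accept method is generic in the result type.
abstract class TypeNode {
    abstract <T> T accept(TypedVisitorSketch<T> v);
}

final class IntTypeNode extends TypeNode {
    @Override
    <T> T accept(TypedVisitorSketch<T> v) { return v.visit(this); }
}

final class FloatTypeNode extends TypeNode {
    @Override
    <T> T accept(TypedVisitorSketch<T> v) { return v.visit(this); }
}

final class ArrayTypeNode extends TypeNode {
    final TypeNode element;
    final int length;

    ArrayTypeNode(TypeNode element, int length) {
        this.element = element;
        this.length = length;
    }

    @Override
    <T> T accept(TypedVisitorSketch<T> v) { return v.visit(this); }
}

// Generic visitor: every visit method returns a value of type T.
interface TypedVisitorSketch<T> {
    T visit(IntTypeNode n);
    T visit(FloatTypeNode n);
    T visit(ArrayTypeNode n);
}

// Interpretation with T = String: render the type as source text.
final class TypeNamePrinter implements TypedVisitorSketch<String> {
    public String visit(IntTypeNode n) { return "int"; }
    public String visit(FloatTypeNode n) { return "float"; }
    public String visit(ArrayTypeNode n) {
        return n.element.accept(this) + "[" + n.length + "]";
    }
}

// Interpretation with T = Integer: count the nodes of the tree.
final class NodeCounter implements TypedVisitorSketch<Integer> {
    public Integer visit(IntTypeNode n) { return 1; }
    public Integer visit(FloatTypeNode n) { return 1; }
    public Integer visit(ArrayTypeNode n) {
        return 1 + n.element.accept(this);
    }
}
```

Compared with a visitor whose methods return void, the typed variant lets each interpretation hand its result back directly, so no mutable result field is needed inside the visitor.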
Figure 4.1: The contents hierarchy of the syntax tree and visitor packages
Figure 4.2: The interaction between the jacc, jflex, syntax tree and visitor packages
4.6 Parsing
All the classes in the visitor package and the syntaxtree package were compiled. The files psplLexer.java, psplParser.java and psplMain.java were also compiled and tested. The syntaxtree package contains all the class definitions for the ASTs, with each concrete subclass containing a constructor for creating an AST node. Samples of these files have been presented in the appendix sections. See Fig 4.3 for a simple overview of the programming layout.
Figure 4.3: Programming layout for PSPL parsing
The parser and the scanner are instantiated when the main program begins to run. The parser makes requests to the scanner to scan the source file one token at a time and pass the token to the parser. It then calls on the syntaxtree classes to generate the AST for the source program.
5 Discussion
In this chapter we observe what happens when a sequence of PSPL code is presented to the PSPL parser: how it is broken down into tokens by the scanner, how it is analyzed by the parser to generate an appropriate parse tree, and how the AST is constructed. We use a sample of PSPL code to carry out this study. The following source code presents a map statement, used within a pipeline to map data and control streams to objects such as filters, switches and pipelines.
Listing 5.1: Map.
1 map{
2   din => splitInput;
3   splitInput => dSample, fForward;
4   dSample => lowPass;
5   lowPass, fForward => dJoin;
6   dJoin => dout; }
The five stream objects that appear in the body of the code, namely splitInput, dJoin, dSample, lowPass and fForward, have already been declared in the body of the pipeline. splitInput and dJoin were declared as switches, while dSample, lowPass and fForward were declared as filters. Line 1 starts with the map keyword, marking the beginning of the map declaration. In line 2, the din keyword is used to map the data stream into the switch splitInput. In line 3, splitInput splits the data stream into the filters dSample and fForward. In line 4, dSample maps the stream to lowPass, and in line 5, lowPass and fForward both map the stream to dJoin. Finally, in line 6, dJoin maps the stream that it has joined to dout. See Fig 5.1.
Let us consider what happens when the parser gets to line 3:
splitInput => dSample, fForward;
The scanner scans through the line, a symbol at a time, matching the symbols against its regular expressions to determine which tokens to send to the parser. The output of the scanner will be:
'ID' 'MAPOP' 'ID' ',' 'ID' ';'
Note that the letters of the individual identifiers are not important to the parser; they only serve to differentiate between one memory location and another. Some tokens come with a semantic value attached to them, and this value is passed to the constructors of the abstract syntax classes for use in later stages of the compiler. At this stage blank spaces and comments have already been discarded. The PSPL parser examines the sequence of tokens from left to right, using the bottom-up strategy, and constructs a parse tree from the terminal nodes up to the root of the tree. The rules that will be used in parsing this segment of code are listed as follows:
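The scanning step can be imitated with a few lines of Java. This sketch uses java.util.regex in place of the JFlex-generated psplLexer, so it only illustrates the idea. The identifier pattern is the ID definition from the JFlex specification; the class name MapLineScanner, the method names and the string representation of tokens are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MapLineScanner {
    // One alternative per token class; whitespace is matched so that it can be skipped.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<WS>\\s+)|(?<MAPOP>=>)|(?<ID>[a-zA-Z][a-zA-Z0-9_]*)|(?<PUNCT>[,;])");

    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                // pos is a zero-based offset into the input line
                throw new IllegalArgumentException("illegal character at offset " + pos);
            }
            if (m.group("MAPOP") != null) {
                tokens.add("MAPOP");
            } else if (m.group("ID") != null) {
                tokens.add(keywordOrId(m.group("ID")));
            } else if (m.group("PUNCT") != null) {
                tokens.add(m.group("PUNCT"));
            }
            // whitespace produces no token
            pos = m.end();
        }
        return tokens;
    }

    // Keywords look like identifiers, so they are separated after matching.
    private static String keywordOrId(String lexeme) {
        switch (lexeme) {
            case "din":  return "DIN";
            case "cin":  return "CIN";
            case "dout": return "DOUT";
            case "cout": return "COUT";
            default:     return "ID";
        }
    }

    public static void main(String[] args) {
        // prints [ID, MAPOP, ID, ,, ID, ;]
        System.out.println(scan("splitInput => dSample, fForward;"));
    }
}
```

A generated scanner would normally return integer token codes and keep the lexeme as the semantic value; strings are used here only to make the output readable.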
Figure 5.1: Pipeline Mapping
MapStm  ::= SrcList '=>' DstList
SrcList ::= Src SrcRest
Src     ::= din
          | cin
          | id
SrcRest ::= ',' Src
DstList ::= Dst DstRest
Dst     ::= dout
          | cout
          | id
DstRest ::= ',' Dst
For the input sequence ID MAPOP ID , ID ; the parser shifts the first token, which is an ID. It then looks ahead and, on seeing a MAPOP token, reduces the frontier to a Src. It further reduces the frontier to a SrcList before going on to take the next token. The same happens on the destination side. Finally, it arrives at the root of the tree, which is a MapStm. See Fig 5.2 for the parse tree. Note that the parser generates the AST at the same time as it constructs the parse tree, making use of the classes in the syntaxtree package. An outline of the rules matched to their concrete classes is given as follows:
Figure 5.2: Parse tree
MapStm ::= SrcList '=>' DstList    MapStatement
Src    ::= din                     SourceMapStatementdin
         | cin                     SourceMapStatementcin
         | id                      SourceMapStatementid
Dst    ::= dout                    DestinationMapStatementdout
         | cout                    DestinationMapStatementcout
         | id                      DestinationMapStatementid
Figure 5.3: The abstract syntax tree
SrcList and DstList are constructed as arrays of the Src and Dst data types respectively. See Fig 5.3 for the AST.
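The construction of the AST for a single map statement can be sketched as follows. Note that this is a hand-written recursive-descent parser, used here as a compact stand-in for the bottom-up parser that jacc generates; it is not how psplParser works internally. The node class names are taken from the syntaxtree package, while the fields, the toString methods, the use of List instead of arrays and the MapStmParser class are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

abstract class Src { }
class SourceMapStatementdin extends Src { public String toString() { return "din"; } }
class SourceMapStatementcin extends Src { public String toString() { return "cin"; } }
class SourceMapStatementid extends Src {
    final String id;
    SourceMapStatementid(String id) { this.id = id; }
    public String toString() { return "id(" + id + ")"; }
}

abstract class Dst { }
class DestinationMapStatementdout extends Dst { public String toString() { return "dout"; } }
class DestinationMapStatementcout extends Dst { public String toString() { return "cout"; } }
class DestinationMapStatementid extends Dst {
    final String id;
    DestinationMapStatementid(String id) { this.id = id; }
    public String toString() { return "id(" + id + ")"; }
}

class MapStatement {
    final List<Src> srcList;
    final List<Dst> dstList;
    MapStatement(List<Src> srcList, List<Dst> dstList) {
        this.srcList = srcList;
        this.dstList = dstList;
    }
    public String toString() { return "MapStatement(" + srcList + " => " + dstList + ")"; }
}

class MapStmParser {
    private final List<String> tokens; // lexemes, already split by a scanner
    private int pos = 0;
    MapStmParser(List<String> tokens) { this.tokens = tokens; }

    // MapStm ::= SrcList '=>' DstList, followed here by the ';' that ends the statement
    MapStatement parseMapStm() {
        List<Src> srcList = new ArrayList<>();
        srcList.add(parseSrc());
        while (peek().equals(",")) { next(); srcList.add(parseSrc()); }
        expect("=>");
        List<Dst> dstList = new ArrayList<>();
        dstList.add(parseDst());
        while (peek().equals(",")) { next(); dstList.add(parseDst()); }
        expect(";");
        return new MapStatement(srcList, dstList);
    }

    // Src ::= din | cin | id
    private Src parseSrc() {
        String t = next();
        if (t.equals("din")) return new SourceMapStatementdin();
        if (t.equals("cin")) return new SourceMapStatementcin();
        return new SourceMapStatementid(requireId(t));
    }

    // Dst ::= dout | cout | id
    private Dst parseDst() {
        String t = next();
        if (t.equals("dout")) return new DestinationMapStatementdout();
        if (t.equals("cout")) return new DestinationMapStatementcout();
        return new DestinationMapStatementid(requireId(t));
    }

    private String requireId(String t) {
        if (!t.matches("[a-zA-Z][a-zA-Z0-9_]*")) {
            throw new IllegalStateException("identifier expected, found " + t);
        }
        return t;
    }
    private String peek() { return pos < tokens.size() ? tokens.get(pos) : "<eof>"; }
    private String next() { String t = peek(); pos++; return t; }
    private void expect(String s) {
        String t = next();
        if (!t.equals(s)) throw new IllegalStateException("expected " + s + ", found " + t);
    }

    public static void main(String[] args) {
        List<String> line3 = Arrays.asList("splitInput", "=>", "dSample", ",", "fForward", ";");
        // prints MapStatement([id(splitInput)] => [id(dSample), id(fForward)])
        System.out.println(new MapStmParser(line3).parseMapStm());
    }
}
```

In the generated parser the same constructors would be called from the semantic actions attached to the grammar rules, at the moment each rule is reduced.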
5.1 Testing the parser
We tested the parser by providing it with the source code of stream objects such as pipelines, filters and switches. The parser parsed the source code and built the required abstract syntax for the stream objects. An example of the source code for a pipeline is given in Listing 5.2.
Listing 5.2: Pipeline code.
pipeline container(int n1, int n2) {
  port {
    dstream 10 int[4] to 20 int;
    cstream 1 int to 1 int;
  }
  inputF filter1(n1, n2);
  sortF filter3(n2);
  pipe filter4(n1, n2);
  map{
    din, cin => filter1;
    filter1 => pipe;
    pipe => filter3;
    filter3 => dout, cout;
  }
}
6 Conclusion
The major goals of the project have been achieved. A parser has been provided for PSPL. The syntax had to be fixed in order to make it easy to parse, and tests were carried out to ensure that the desired outcomes were achieved. Abstract syntax was also implemented for the parser using a syntax separate from interpretation style of programming. The parser uses the abstract syntax to preserve source programs in concrete data structures that are used by later phases of the compiler for synthesis and translation to machine code. Tests were carried out to ensure that the correct abstract syntax trees are generated.
This project is only the first, but probably the most important, step in taking PSPL from being a prototype framework implemented in Java to being a language in its own right. In the future, steps will be taken to provide the later phases of the compiler. This will involve providing not only a context analyser and a translator to intermediate representation, but also a suitable back end for conversion of the intermediate representation to machine code.
Error handling is very important in a compiler of this nature: when there is an error, the compiler must be able to provide useful feedback to the programmer. In this project we have provided ways of reporting lexical and syntactic errors to the programmer. However, the later stages of the compiler can only report useful errors through the information provided to them in the abstract syntax tree. It is therefore necessary that some changes be made to the abstract syntax classes to enable them to store the positional information of the tokens that are received by the parser. This is a very easy change to make, and we agreed that it is best left to the providers of the context analyser to make these changes as they see fit.
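One way the suggested change could look is sketched below. This is only an assumption about a possible design, not part of the delivered parser: the Position class, the pos field, the fromLexer helper and the ContextErrors class are all hypothetical. The scanner in Appendix B already exposes yyline and yycolumn through linenr() and columnnr(); JFlex counts both from zero, so one is added before reporting.

```java
// Hypothetical position record; line and column are stored 1-based for reporting.
class Position {
    final int line;
    final int column;
    Position(int line, int column) { this.line = line; this.column = column; }

    // Converts the zero-based counters of the scanner into 1-based positions.
    static Position fromLexer(int yyline, int yycolumn) {
        return new Position(yyline + 1, yycolumn + 1);
    }
    public String toString() { return line + ":" + column; }
}

// Every AST node remembers where its first token was found.
abstract class Node {
    final Position pos;
    Node(Position pos) { this.pos = pos; }
}

class Identifier extends Node {
    final String name;
    Identifier(String name, Position pos) { super(pos); this.name = name; }
}

// A later phase can now point at the offending source location.
class ContextErrors {
    static String undeclared(Identifier id) {
        return id.pos + ": undeclared stream object '" + id.name + "'";
    }

    public static void main(String[] args) {
        Identifier id = new Identifier("fforward", Position.fromLexer(4, 11));
        System.out.println(undeclared(id)); // prints 5:12: undeclared stream object 'fforward'
    }
}
```

With a field of this kind, the semantic actions of the parser would pass the current scanner position to each constructor, and no other part of the tree needs to change.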
Bibliography
[1] J. Bengtsson, “Thesis for the degree of licentiate of engineering, efficient implementation of streaming applications on processors arrays, technical report,” School of Information Science, Computer and Electrical Engineering, Halmstad University, Tech. Rep., 2006.
[2] W. Thies, M. Karczmarek, M. Gordon, D. Maze, J. Wong, H. Hoffmann, M. Brown, and S. Amarasinghe, “Streamit: A compiler for streaming applications,” Feb. 21 2001. [Online]. Available: http://citeseer.ist.psu.edu/515289.html; http://www.lcs.mit.edu/publications/pubs/pdf/MIT-LCS-TM-622.pdf
[3] J. Bengtsson, “Thesis for the degree of licentiate of engineering, a configurable framework for stream programming exploration in baseband applications,” School of Information Science, Computer and Electrical Engineering, Halmstad University, Tech. Rep., 2006.
[4] The StreamIt Cookbook, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, Nov. 2004. [Online]. Available: http://cag.csail.mit.edu/streamit/index.shtml, accessed November 11, 2006.
[5] W. Thies, M. Karczmarek, and S. Amarasinghe, “StreamIt: A language for streaming applications,” Lecture Notes in Computer Science, vol. 2304, pp. 179–??, 2002.
[6] Stanford University. streamc. [Online]. Available: http://graphics.stanford.edu/streamlang/streamc-3-6-00.pdf, accessed November 11, 2006.
[7] A. W. Appel and J. Palsberg, Modern compiler implementation in Java, 2nd ed. Cambridge University Press, 2002.
[8] R. Stephens, “A survey of stream processing,” Acta Informatica, vol. 34, no. 7, pp. 491–541, 1997.
[9] P. Caspi, D. Pilaud, N. Halbwachs, and J. Plaice, “Lustre: A declarative language for programming synchronous systems,” 1987, pp. 178–188.
[10] J. R. Gurd, J. R. W. Clauert, and C. C. Kirkham, “Generation of dataflow graphical object code for the lapse programming language,” in Proceedings of the Conference on Analysing Problem Classes and Programming for Parallel Computing (CONPAR ’81), ser. LNCS, W. Händler, Ed., vol. 111. Nürnberg, FRG: Springer, June 1981, pp. 155–168.
[11] J. Foley, “Manchester dataflow machine: Benchmark test evaluation report,” Department of Computer Science, University of Manchester, Manchester, UK, Technical Report UMCS-89-11-1, 1989.
[12] I. Buck. (2003, Oct.) Brook specification v0.2. [Online]. Available: http://merrimac.stanford.edu/brook/brookspec-v02.pdf, accessed October 28, 2006.
[13] J. Bengtsson, Portable Stream Processing Language (PSPL) specification version 1.0, School of Information Science, Computer and Electrical Engineering, Halmstad University, 2006.
[14] Cryptol. (2006, Feb.) Cryptol reference manual. [Online]. Available: http://www.cryptol.net/doc.htm, accessed October 28, 2006.
[15] G. Klein. (2005, July) Jflex user’s manual. [Online]. Available: http://www.jflex.de/manual.pdf, accessed October 28, 2006.
[16] M. Jones. jacc: just another compiler for java, a reference manual and user guide. Department of Computer Science, Engineering, Stanford University, Stanford, CA 94305.
A Abstract syntax tree production rules
Each production is followed by the name of the abstract syntax class that represents it.

StreamProgram ::= ObjectDec                                        StreamProgram
ObjectDec     ::= PipelineDec                                      PipelineDec
                | FilterDec                                        FilterDec
                | SwitchDec                                        SwitchDec
PipelineDec   ::= pipeline id '(' FormalList ')' '{' port '{' StreamDecList '}' ObjectTypeDecList map '{' MapStmList '}' '}'
FilterDec     ::= filter id '(' FormalList ')' '{' port '{' StreamDecList '}' VarDecList InitDec ProcedureDecList '}'
SwitchDec     ::= 'switch' id '(' FormalList ')' '{' 'port' '{' StreamDecList '}' SwitchStms '}'
ProcedureDec  ::= procedure id '(' StreamType ')' '{' StmList '}'  ProcedureDec
StreamDec     ::= StreamType expList Type                          StreamTypeDecOne
                | StreamType expList Type id 'to' expr Type        StreamTypeDecTwo
                | StreamType expList Type 'to' expr Type           StreamTypeDecThree
                | StreamType expList Type id                       StreamTypeDecFour
StreamType    ::= dstream                                          DstreamType
                | cstream                                          CstreamType
                | void                                             VstreamType
Formal        ::= Type id                                          Formal
InitDec       ::= init '{' StmList '}'                             InitDec
SwitchStms    ::= 'join' '{' SwitchStmList '}'                     JoinSwitchStatement
SwitchStm     ::= loop expr '{' SwitchStm '}'                      LoopSwitchStatement
                | Src '=>' id                                      SplitMapSwitchStatement
                | Src '=>' id expr                                 SplitMapSwitchStmToMany
                | Src '#' expr '=>' id                             SplitMapSwitchStmMult
                | Src '#' expr '=>' id expr                        SplitMapSwitchStmMultiple
                | id '#' expr '=>' Dst                             JoinStreamsMapSwitchtoDestOne
                | id expr '#' expr '=>' Dst                        JoinStreamsMapSwitchtoDesttwo
                | id expr '=>' Dst                                 JoinStreamMapSwitchtoDest
                | id '#' expr '->' Dst                             ConcStreamsMapSwitchtoDestOne
                | id expr '->' Dst                                 ConcStreamsMapSwitchtoDesttwo
                | id expr '#' expr '->' Dst                        ConcStreamMapSwitchtoDest
MapStm        ::= SrcList '=>' DstList                             MapStatement
Src           ::= din                                              SourceMapStatementdin
                | cin                                              SourceMapStatementcin
                | id                                               SourceMapStatementid
Dst           ::= dout                                             DestinationMapStatementdout
                | cout                                             DestinationMapStatementcout
                | id                                               DestinationMapStatementid
Type          ::= INT                                              IntegerType
                | FLOAT                                            FloatType
                | BYTE                                             ByteType
                | BOOLEAN                                          BooleanType
                | INT '[' expr ']'                                 IntegerArrayType
                | BITVEC'''['expr']'                               BitvecArrayType
                | BYTE '[' expr ']'                                ByteArrayType
                | BITVEC''                                         BitvecType
                | id                                               IdentifierType
Statement     ::= id expr '=' Sorc                                 AccessOutputsIDtoSourceStm
                | id '=' Sorc                                      AccessOutIDtoSourceStm
                | Dest '=' id                                      AccessDestinatoIDStm
                | id '=' expr                                      Assign
                | if '(' expr ')' Statement else Statement         IfStatement
                | while '(' expr ')' stm                           WhileStatement
                | id '[' expr ']' '=' expr                         ArrayAssign
                | for '(' stm ';' expr ';' expr ')' '{' Statement '}'   ForStatement
                | schedule expList                                 SchedStatement
                | break                                            BreakStatement
                | return                                           ReturnStatement
                | Dst(Exp)=Src(Exp)                                SingleBitAssign
                | Dst(index_expression)=Src(index_expression)      SliceStoreAssign
                | id={ExpList}                                     ConcBitvec
                | id(index_expression)={expList}                   ConcBitvecLessLength
Sorc          ::= din                                              SwitchSourcedin
                | cin                                              SwitchSourcecin
Dest          ::= dout                                             SwitchDestinationdout
                | cout                                             SwitchDestinationcout
VarDec        ::= Type id                                          VarDecType
                | Type id '=' expr                                 VarDecTypeExpr
                | Type '[' expr ']' id                             VarDecTypeArray
Expression    ::= Expression op Expression                         (listed based on operation)
                | id '[' Exp ']'                                   ArrayLookup
                | '[' Exp ':' Exp ']'                              SwitchExpression
                | id Sorc '?' Exp                                  ConditionExpression
                | True                                             True()
                | False                                            False()
                | integer_lit                                      IntegerLiteral
                | id                                               IdentifierExpression
                | Float_lit                                        FloatLiteral
                | operator Exp                                     (listed based on operation)
                | id(Exp)                                          BitPosition
                | id(index_expression)                             IndexLowUp
B The JFLEX Specification File (Scanner Generator)
The JFlex specification file, PSPL.flex, from which the scanner for PSPL is generated, is presented below:

/*
 * Generating Lexical Analyzer for Portable Stream Language Program.
 * Master Thesis Project
 */
%%
/* Directives and definitions */
// directives:
%debug
/* The name of the lexer class */
%class psplLexer
/* The lexer class implements this interface, which consists of all integer codes for the tokens */
%implements psplTokens
/* The type used for the tokens */
%int
%unicode
/* To enable char counting */
%char
/* To enable line counting */
%line
/* To enable column counting */
%column
// code in PsplLexer:
%{
  Object semanticValue;
  public int linenr() {
    return yyline;
  }
  public int columnnr() {
    return yycolumn;
  }
  /* The token type. */
  int token;
%}
// definitions:
/* new lines and spaces */
ID = [a-zA-Z][a-zA-Z0-9_]*
Integer_Lit = 0|([1-9][0-9]*)
Float_Lit = {INTEGER}{FRACTION}?{EXPONENTIAL}?
INTEGER = [0-9]+
FRACTION = "\."{INTEGER}
EXPONENTIAL = [Ee](\+|\-)?{INTEGER}
Whitespace = {LineTerminator}|[ \t\f]
LineTerminator = \r|\n|\r\n
InputCharacter = [^\r\n]
TraditionalComment = "/*" [^*] ~"*/" | "/*" "*"+ "/"
EndOfLineComment = "//" {InputCharacter}* {LineTerminator}?
DocumentationComment = "/*" "*"+ [^/*] ~"*/"
Comment = {TraditionalComment}|{EndOfLineComment}|{DocumentationComment}
%%
/* section for regular expressions and actions */
"Streamprog"  {return token = STREAMPROG;}
"pipeline"    {return token = PIPELINE;}
"error"       {return token = ERROR;}
"filter"      {return token = FILTER;}
"switch"      {return token = SWITCH;}
"port"        {return token = PORT;}
"dstream"     {return token = DSTREAM;}
"cstream"     {return token = CSTREAM;}
"int"         {return token = INT;}
"float"       {return token = FLOAT;}
"return"      {return token = RETURN;}
"cin"         {return token = CIN;}
"boolean"     {return token = BOOLEAN;}
"if"          {return token = IF;}
"else"        {return token = ELSE;}
"uminus"      {return token = UMINUS;}
"while"       {return token = WHILE;}
"cout"        {return token = COUT;}
"dout"        {return token = DOUT;}
"din"         {return token = DIN;}
"true"        {return token = TRUE;}
"false"       {return token = FALSE;}
"init"        {return token = INIT;}
"byte"        {return token = BYTE;}
"void"        {return token = VOID;}
"schedule"    {return token = SCHEDULE;}
"map"         {return token = MAP;}
"split"       {return token = SPLIT;}
"loop"        {return token = LOOP;}
"seq"         {return token = SEQ;}
"join"        {return token = JOIN;}
"program"     {return token = PROGRAM;}
"while"       {return token = WHILE;}
"work"        {return token = WORK;}
"bitvec"      {return token = BITVEC;}
"for"         {return token = FOR;}
"procedure"   {return token = PROCEDURE;}
"to"          {return token = TO;}
"null"        {return token = NULL;}
"#"           {return token = '#';}
"||"          {return token = LOGICOP;}
">>"          {return token = SHIFTRIGHTOP;}
"%"           {return token = '%';}
"++"          {return token = INCREMENTSIGN;}
"--"          {return token = DECREMENTSIGN;}
"connectto"   {return token = CONNECTTO;}
"mapto"       {return token = MAPTO;}
"=="          {return token = CONSECEQSIGN;}
"+="          {return token = ADDASSIGNOP;}
".*"          {return token = ARRELEMENTMULT;}
"./"          {return token = ARRELEMENTDIV;}
".+"          {return token = ARRELEMENTADD;}
".-"          {return token = ARRELEMENTSUB;}
".'"          {return token = ARRAYTRANSPOSE;}
"'"           {return token = MATRIXTRANSPOSE;}
":"           {return token = COLON;}
"!="          {return token = NOTEQSIGN;}
"^"           {return token = '^';}
"~"           {return token = '~';}
"-"           {return token = '-';}
"/"           {return token = '/';}
"*"           {return token = '*';}
"="           {return token = '=';}
"&&"          {return token = AND;}
"&"           {return token = '&';}
"|"           {return token = '|';}
">"           {return token = '>';}

G Portable Stream Processing Language - Backus Naur Form (BNF)

                 | Sorc '=>' id expr
                 | Sorc '#' expr '=>' id
                 | Sorc '#' expr '=>' id expr
                 | id '#' expr '=>' Dest
                 | id expr '#' expr '=>' Dest
                 | id expr '=>' Dest
                 | id '#' expr '->' Dest
                 | id expr '->' Dest
                 | id expr '#' expr '->' Dest
SwitchStmRest    ::= ';' SwitchStm
MapStmList       ::= MapStm MapStmRest
MapStm           ::= SrcList '=>' DstList
MapStmRest       ::= ';' MapStm
SrcList          ::= Src SrcRest
Src              ::= din
                 | cin
                 | id
SrcRest          ::= ',' Src
DstList          ::= Dst DstRest
Dst              ::= dout
                 | cout
                 | id
DstRest          ::= ',' Dst
Type             ::= INT
                 | FLOAT
                 | BYTE
                 | BOOLEAN
                 | INT '[' expr ']'
                 | BITVEC'''['expr']'
                 | VOID
                 | BYTE '[' expr ']'
                 | BITVEC''
stm              ::= (StmList)
                 | id expr '=' Sorc
                 | id '=' Sorc
                 | Dest '=' id
                 | id '=' expr
                 | if '(' expr ')' stm else stm
                 | while '(' expr ')' stm
                 | id '[' expr ']' '=' expr
                 | for '(' stm ';' expr ';' expr ')' '{' stm '}'
                 | schedule expList
                 | break
                 | return exp
                 | id = Sorc ? exp
                 | id = Dest ? exp
                 | '(' index_expression ')' '=' '(' index_expression ')'
                 | id '=' '{' expList '}'
                 | id '(' index_expression ')' '=' '{' expList '}'
                 | '(' expr ')' '=' '(' expr ')'
StmList          ::= stm stmRest
stmRest          ::= ';' stm
VarDecList       ::= VarDec VarDecRest
VarDec           ::= Type id
                 | Type id '=' expr
                 | Type '[' expr ']' id
VarDecRest       ::= ';' VarDec
Sorc             ::= din
                 | cin
Dest             ::= dout
                 | cout
expr             ::= expr op expr
                 | id '(' expr ')'
                 | id '(' index_expression ')'
                 | '[' index_expression ']'
                 | True
                 | False
                 | integer_lit
                 | id
                 | float_lit
                 | Unop expr
                 | '(' Type ')' expr
index_expression ::= expr COLON expr
                 | expr COLON NULL
                 | NULL COLON expr
expList          ::= expr exprest
exprest          ::= ',' expr
op               ::= || > % .* ./ .+ .- = && & | > < - / * +
Unop             ::= + - ^ ~ ++ -- .' ' !
H The visitor interface

package visitor;
import syntaxtree.*;

public interface Visitor {
  public void visit(StreamProgram n);
  public void visit(PipelineDec n);
  public void visit(FilterDec n);
  public void visit(SwitchDec n);
  public void visit(ProcedureDec n);
  public void visit(Formal n);
  public void visit(InitDec n);
  public void visit(MapStatement n);
  public void visit(IntegerType n);
  public void visit(FloatType n);
  public void visit(ByteType n);
  public void visit(IntegerArrayType n);
  public void visit(BitvecArrayType n);
  public void visit(ByteArrayType n);
  public void visit(BitvecType n);
  public void visit(DstreamType n);
  public void visit(CstreamType n);
  public void visit(VstreamType n);
  public void visit(JoinSwitchStatement n);
  public void visit(SplitSwitchStatement n);
  public void visit(LoopSwitchStatement n);
  public void visit(SeqSwitchStatement n);
  public void visit(Assign n);
  public void visit(WhileStatement n);
  public void visit(IfStatement n);
  public void visit(ScheduleStatement n);
  public void visit(BreakStatement n);
  public void visit(ReturnStatement n);
  public void visit(ForStatement n);
  public void visit(ArrayAssign n);
  public void visit(VarDecType n);
  public void visit(VarDecTypeExpr n);
  public void visit(VarDecTypeArray n);
  public void visit(ConditionExp n);
  public void visit(True n);
  public void visit(False n);
  public void visit(IntegerLiteral n);
  public void visit(IdentifierExpression n);
  public void visit(Minus n);
  public void visit(And n);
  public void visit(LessThan n);
  public void visit(Plus n);
  public void visit(Times n);
  public void visit(GreaterThan n);
  public void visit(Equal n);
  public void visit(NotEqual n);
  public void visit(BitwiseAnd n);
  public void visit(BitwiseXOR n);
  public void visit(LogOR n);
  public void visit(ShiftLeft n);
  public void visit(ShiftRight n);
  public void visit(Not n);
  public void visit(UnPlus n);
  public void visit(PreIncrement n);
  public void visit(PreDecrement n);
  public void visit(UnMinus n);
  public void visit(Identifier n);
  public void visit(BitwiseIOR n);
  public void visit(BitwiseNeg n);
  public void visit(ObjectType n);
  public void visit(ObjectTypeDec n);
  public void visit(Modulo n);
  public void visit(SourceMapStatementdin n);
  public void visit(SourceMapStatementid n);
  public void visit(SourceMapStatementcin n);
  public void visit(DestinationMapStatementid n);
  public void visit(DestinationMapStatementcout n);
  public void visit(DestinationMapStatementdout n);
  public void visit(StreamTypeDecOne n);
  public void visit(StreamTypeDecTwo n);
  public void visit(StreamTypeDecThree n);
  public void visit(StreamTypeDecFour n);
  public void visit(SplitMapSwitchStmMultiple n);
  public void visit(SplitMapSwitchStatement n);
  public void visit(SplitMapSwitchStmToMany n);
  public void visit(SplitMapSwitchStmMult n);
  public void visit(JoinStreamsMapSwitchtoDestOne n);
  public void visit(JoinStreamsMapSwitchtoDesttwo n);
  public void visit(JoinStreamMapSwitchtoDest n);
  public void visit(ConcStreamsMapSwitchtoDestOne n);
  public void visit(ConcStreamsMapSwitchtoDesttwo n);
  public void visit(ConcStreamMapSwitchtoDest n);
  public void visit(AccessOutputsIDtoSourceStm n);
  public void visit(AccessOutIDtoSourceStm n);
  public void visit(AccessDestinatoIDStm n);
  public void visit(SwitchSourcecin n);
  public void visit(SwitchSourcedin n);
  public void visit(SwitchDestinationdout n);
  public void visit(SwitchDestinationcout n);
  public void visit(IdentifierType n);
  public void visit(Division n);
  public void visit(FloatLiteral n);
  public void visit(BitPosition n);
  public void visit(IndexLowToUp n);
  public void visit(IndexLuptoMsb n);
  public void visit(IndexUp n);
  public void visit(SingleBitAssign n);
  public void visit(SliceStoreAssign n);
  public void visit(SliceAndAlignDestMsb n);
  public void visit(SliceAndAlignDestLsb n);
  public void visit(ConcBitvec n);
  public void visit(ConcBitvecLessLength n);
  public void visit(TypeCastExp n);
  public void visit(BooleanType n);
  public void visit(Index n);
  public void visit(Blockstatement n);
  public void visit(LowerIndex n);
  public void visit(UpperIndex n);
  public void visit(BitVExpression n);
  public void visit(BitVstatement n);
  public void visit(ElemWiseDiv n);
  public void visit(ElemWiseSub n);
  public void visit(ElemWiseMult n);
  public void visit(ElemWiseAdd n);
  public void visit(StreamAccessInput n);
  public void visit(StreamAccessOutput n);
  public void visit(BitSwVExpression n);
  public void visit(CloneExpression n);
}
I Abstract syntax constructors
In the tables below each category is followed by the constructors of its classes.

Table I.1: Abstract syntax tree constructors
Categories / Classes
StreamProgram
    StreamProgram(ObjectDec ob)
Abstract class ObjectDec
    PipelineDec (Identifier i, FormalList fl, StreamDecList stl, ObjectTypeDecList ol, MapStatementList ml)
    FilterDec (Identifier i, FormalList f1, StreamDecList stl, VarDecList vl, InitDec in, ProcedureDecList p1)
    SwitchDec (Identifier i, FormalList f1, StreamDecList stl, Switches sws)

    ProcedureDec (Identifier i, StreamType st, StatementList sl)
    Formal (Type t, Identifier i)
    InitDec (StatementList sl)
    MapStatement (SrcList sl, DstList dl)
    ObjectTypeDec (Type t, Identifier i, ExpList el)
Abstract class Type
    IntegerType( )
    FloatType( )
    ByteType( )
    IntegerArrayType(exp e )
    BitvecArrayType(exp e1,e2 )
    ByteArrayType( exp e)
    BitvecType (exp e )
    IdentifierType (String s)
    BooleanType()
Abstract class StreamType
    DstreamType()
    CstreamType()
    VstreamType()
Abstract class Switches
    JoinSwitchStatement(SwitchStmList swstl)
    SplitSwitchStatement(SwitchStmList swstl)

Table I.2: Abstract syntax tree constructors
Categories / Classes
Abstract class SwitchStatement
    LoopSwitchStatement(SwitchStatement swst, exp e)
    SeqSwitchStatement(SwitchStatement swst)
    SplitMapSwitchStatement(SwitchSource sws,Identifier i)
    SplitMapSwitchStmToMany(SwitchSource sws,Identifier i,exp e)
    SplitMapSwitchStmMult(SwitchSource sws,exp e,Identifier i)
    SplitMapSwitchStmMultiple(SwitchSource sws,exp e1,Identifier i,exp e2)
    JoinStreamsMapSwitchToDestOne(Identifier i,Exp e1,Exp e2,SwitchDestination swd)
    JoinStreamsMapSwitchToDestTwo(Identifier i,Exp e1,Exp e2,SwitchDestination swd)
    JoinStreamsMapSwitchToDest(Identifier i,Exp e,SwitchDestination swd)
    ConcStreamMapSwitchToDestOne(Identifier i,exp e,SwitchDestination swd)
    ConcStreamMapSwitchToDestTwo(Identifier i,exp e,SwitchDestination swd)
    ConcStreamMapSwitchToDest(Identifier i,Exp e1,Exp e2,SwitchDestination swd)
Abstract class SourceMapStatement
    SourceMapStatementdin ( )
    SourceMapStatementcin ( )
    SourceMapStatementid (identifier i)

Table I.3: Cont. Abstract syntax tree constructor
Categories / Classes
Abstract class DestMapStatement
    DestinationMapStatementdout( )
    DestinationMapStatementcout( )
    DestinationMapStatementid(identifier i)
Abstract class Statement
    AccessOutputsIDtoSourceStm(Identifier i, Exp e, SwitchSource sws)
    AccessOutIDtoSourceStm(Identifier i, SwitchSource sws )
    AccessDestinatoIDStm (SwitchDestination sd, Identifier i)
    Assign(identifier I, exp e)
    IfStatement (exp e,Statement s1,Statement s2)
    WhileStatement (exp e, Statement s)
    SheduleStatement (expList el)
    BreakStatement( )
    ReturnStatement( exp e )
    ForStatement(statement s1, exp e, Statement s2)
    ArrayAssign(identifier I, exp e1, exp e2)
    BlockStatement(StatementList sl)
    ScheduleStatement(expList el)
    StreamAccessInput(Identifier i,SwitchSource sws,exp e)
    StreamAccessOutput(Identifier i,SwitchDestination swd,exp e)
    SliceStoreAssign(IndexExpression ie1,ie2)
    ConcBitvec(Identifier i,expList el)
    ConcBitvecLessLength(IndexExpression ie,expList el)
    SingleBitAssign(Exp e1,e2)
Abstract class VarDec
    VarDecType (Type t,Identifier i)
    VarDecTypeExpr (Type t, Identifier i, expr e)
    VarDecTypeArray (Type t, expr e, Identifier i)
Abstract class StreamDec
    StreamTypeDecOne(StreamType st,expList el,Type t )
    StreamTypeDecTwo(StreamType st, expList el,Type t1,Identifier i,exp e,Type t2 )
    StreamTypeDecThree(StreamType st,expList el,Type t,exp e,Type t2 )
    StreamTypeDecFour(StreamType st,expList el,Type t,Identifier i )
Abstract class Exp
    True()
    False()

Table I.4: Cont. Abstract syntax tree constructor
Categories / Classes
(Abstract class Exp, continued)
    IntegerLiteral (int i)
    IdentifierExp (String s)
    And (exp e1, exp e2)
    LessThan(exp e1, exp e2)
    Plus (exp e1, exp e2)
    Times (exp e1, exp e2)
    GreaterThan (exp e1, exp e2)
    Equal (exp e1, exp e2)
    NotEqual (exp e1, exp e2)
    BitwiseAnd (exp e1, exp e2)
    BitwiseXOR (exp e1, exp e2)
    FloatLiteral(float f)
    TypeCastExp(Type t,exp e)
    BitPosition(Identifier i,exp e)
    IndexLowToUp(Identifier i,IndexExpression ie)
    CloneExpression(IndexExpression ie)
    Not (exp e)
    UnPlus (exp e)
    PreIncrement (exp e)
    PreDecrement (exp e)
    BitwiseNegation (exp e)
    LogOR (exp e1,exp e2)
    ShiftLeft (exp e1, exp e2)
    ShiftRight (exp e1, exp e2)
    Modulo (exp e1, exp e2)
    Minus(exp e1,exp e2)
    Division(exp e1,exp e2)
    ElemWiseAdd(exp e1,exp e2)
    ElemWiseSub(exp e1,exp e2)
    ElemWiseMult(exp e1,exp e2)
    ElemWiseDiv(exp e1,exp e2)
    BitwiseIOR (exp e1, exp e2)
Abstract class IndexExpression
    Index(exp e1,exp e2)
    UpperIndex(exp e,StreamType(null) st)
    LowerIndex(StreamType st(null),exp e)