Reverse Engineering of Computer-Based Control Systems
Lonnie R. Welch, Guohui Yu, Binoy Ravindran, Franz Kurfess and Jorge Henriques
Department of Computer and Information Science
New Jersey Institute of Technology
Newark, NJ 07102
Mark Wilson The Naval Surface Warfare Center Silver Spring, Maryland 20903-5640
Antonio L. Samuel and Michael W. Masters The Naval Surface Warfare Center Dahlgren, VA 22448
Abstract
This article presents a process for the reengineering of computer-based control systems, and describes tools that automate portions of the process. The intermediate representation (IR) for capturing features of computer-based systems during reverse engineering is presented. A novel feature of the IR is that it incorporates the control system software architecture, a view that enables information to be captured at five levels of granularity: the program level, the task level, the package level, the subprogram level, and the statement level. A reverse engineering toolset that constructs the IR from Ada programs, displays the IR, and computes concurrency, communication and object-orientedness metrics is presented. Also described is the design of hypermedia techniques that enhance the usability of the reverse engineering tools.
1 Introduction
A computer-based system has many characteristics, including performance, timeliness, availability, dependability, safety and security. Furthermore, such a system typically performs many related functions concurrently, interacts with the environment and many human operators and/or clients simultaneously, consists of many interconnected processing elements, contains many millions of lines of code, takes years to develop from first concept formulation to final deployment, and has development costs of tens, or even hundreds, of millions of dollars.

Computer-based systems generally address nontransient requirements that simply cannot be addressed with simpler solutions. Thus they tend to be characterized by long life cycles, often spanning decades. During such extended life cycles, change is inevitable in many dimensions, such as operational environment, system requirements, and technology base. Because of the time and cost of development of computer-based systems, and because of the infrastructure needed for their development and continued support once deployed (highly trained personnel, hardware and support tools, documentation, test procedures, and many other components), there is enormous financial pressure to meet the need for change through evolution rather than revolution. This need has spawned the discipline of reengineering, the systematic application of methodology and tools for managing the evolutionary transformation of existing computer-based systems to encompass new or altered requirements and to transport such systems into new environments and onto new technology bases.

This paper describes a reengineering process that is appropriate for computer-based control systems, such as the U.S. Navy's AEGIS system [23, 19]. Section 2 describes a systems reengineering process which has been developed for transitioning Navy systems to meet the challenges of exploiting the technology of the present and of the future. Section 3 discusses a reverse engineering process that produces an intermediate representation (IR) and metrics that characterize a legacy system. Section 4 presents tools that implement the reverse engineering process. The design of hypermedia-based techniques for navigating the IR is described in Section 5.
2 Computer-Based Systems Reengineering
This section describes a process for reengineering which has evolved in conjunction with efforts to reengineer portions of the AEGIS Weapon System [19, 23]. The diagram shown in Figure 1 indicates the major inputs and outputs of the reengineering process (see footnote 1), which consist of the following items:
Legacy system: the system to be reengineered (consisting of hardware, human and software elements) and all of its artifacts.
IR1: an abstract representation of the legacy system, in machine-processable form.
Legacy system metrics: concise characterizations of important aspects of the legacy system.
Reengineering decision: the answer to the question "Which components from the legacy system should be reengineered?".
New requirements and objectives: a description of the constraints and desirable properties that the reengineered system is to have.
IR2: an abstract representation of the new system, in machine-processable form.
New system metrics: concise characterizations of important aspects of the new system.
Footnote 1: The diagram uses structured design notation, wherein circles denote processes, edges indicate the flow of data among processes, and boxes indicate entities that are external to the process.
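[Figure 1 diagram: the Legacy System feeds process 1, Reverse Engineering, which produces IR1 (legacy design and code), the Metrics for Legacy System, and the Reeng. Decision. Process 2, Software Transformation, takes IR1 and the New Requirements and Objectives and produces IR2 (new design and code) and the Metrics for New System. Process 3, System Configuration, produces the New Configuration.]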
Figure 1: Steps of the automated reengineering process.
New configuration: a description of the interactions of the hardware, operating system, application software and human elements of the new system.
As indicated in Figure 1, the first step of the reengineering process is reverse engineering, i.e., the capture of important features of the legacy system's hardware, software, and human elements. The reverse engineering process produces several outputs: IR1, legacy system metrics, and the reengineering decision. Given IR1, the legacy system metrics, and the new requirements and objectives, the task of software transformation manipulates IR1 until it satisfies the new goals and constraints. The transformation task is guided by the metrics for the legacy system. Transformation produces IR2 and metrics for the new software.

Transformation is succeeded by configuration, which marries hardware, operating system, transformed software, and human elements. Software components are optimized for the execution paradigm provided by the hardware-operating system platform. The optimized software components are partitioned into tightly coupled clusters, which are assigned [24, 25, 36] onto the hardware platform in a way that (1) satisfies the new system requirements and (2) considers the new system objectives. The output of the configuration process is a description of the partitioning, a specification of how partitions are assigned to processors, and a collection of metrics characterizing the new configuration. Following reengineering, an assessment of the reengineered product is made by comparing its metrics against the metrics for the pre-reengineered system.
3 The Reverse Engineering Process
The goal of reverse engineering is to enable systems engineers and automated tools to understand the important features of a legacy system's hardware, software, operating system, requirements, documentation and human elements. An appropriate approach for the reverse engineering of complex control systems is indicated in Figure 2. The first step (process 1.1) is to make a decision about whether the legacy software should be translated (see footnote 2). The decision is based on metrics and on managerial and strategic factors [21]. If the decision is "no", then the reengineering process terminates. Otherwise, the legacy software is translated and incorporated into IR1. Additionally, the important aspects of the hardware, operating system (O.S.), human elements, documentation and requirements are captured in IR1. The translated code is parsed, and the symbol table (SymTab) and the statement table (StmtTab) [14, 37] are extracted and placed into IR1. Given SymTab and StmtTab, process 1.5 extracts a general dependence graph (GDG) that represents statement-level precedence relations, and a control flow graph (CFG) that denotes the flow among statements. Interaction analysis (process 1.6) captures interactions among tasks, packages, and procedures in a call-rendezvous graph (CRG). The system characterization contained in the GDG, CFG and CRG is too large and too complex for human comprehensibility, or even for efficient machine processing. Thus, the metrics computation phase (process 1.7) summarizes essential system properties in a concise form.

[Figure 2 diagram: process 1.1 makes the reengineering and translation decisions; process 1.2 (Translation) turns the legacy CMS-2 and assembly software into Ada; process 1.3 (Context Capture) records hardware, humanware, O.S., documentation and requirements; process 1.4 (Ada Parsing) yields SymTab and StmtTab; process 1.5 (Dependence, Flow Analysis) yields the GDG and CFG; process 1.6 (Interaction Analysis) yields the CRG; process 1.7 (Metrics Computation) yields the Metrics for Legacy System. The results are stored in IR1: Legacy Code, Design.]
Figure 2: The reverse engineering process.
Footnote 2: Since this process evolved in conjunction with the reengineering of U.S. Navy systems, Figure 2 indicates that legacy systems are implemented in CMS-2 and assembly languages, and that translation produces Ada programs. However, the process is not language-dependent, and could be adapted to work for the reengineering of programs in other languages by changing the translator and parser.

4 Tools for Reverse Engineering and Metrics Computation
The reverse engineering efforts of the NJIT/NSWC team have resulted in realizations of software tools for reverse engineering of Ada programs. This section presents various aspects of the toolset, including: (1) the intermediate representation (IR), (2) extraction of the IR, (3) use of the IR to compute metrics, and (4) a graphical user interface for viewing the IR and the metrics.
4.1 Intermediate Representation
Large embedded software systems have a layered/tiered structure that we have termed the control system software architecture (CSSA). Tier 1 of the CSSA consists of a set of executable programs, possibly implemented in different languages. At tier 2 are tasks (independent threads of control), which may share resources, and are permitted to run concurrently. Tier 3 is composed of modules with multiple entry points, ADTs, and objects. The elements of tier 3 are implemented in terms of subprograms, the tier 4 elements. Subprograms are implemented as a collection of statements/instructions (tier 5 elements). This section describes tools for extracting the IR at tiers 2, 3, 4 and 5. Since IR1 and IR2 represent the same software system at different phases of the reengineering process, their structure is identical; thus, the term IR is used in the remainder of this paper to represent both IR1 and IR2.
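As an illustration of the five-tier CSSA, the following minimal Python sketch models tier elements as a tree. The class names, the rule that an element may only contain elements of a deeper tier, and the name Concatenate_Program are assumptions made for this example; they are not part of the toolset described in this paper.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, List


class Tier(IntEnum):
    PROGRAM = 1
    TASK = 2
    PACKAGE = 3
    SUBPROGRAM = 4
    STATEMENT = 5


@dataclass
class Element:
    name: str
    tier: Tier
    children: List["Element"] = field(default_factory=list)

    def add(self, child: "Element") -> "Element":
        # Assumption for this sketch: children always belong to a deeper tier.
        if child.tier <= self.tier:
            raise ValueError(
                f"{child.name} (tier {child.tier.value}) cannot be placed "
                f"under {self.name} (tier {self.tier.value})"
            )
        self.children.append(child)
        return child


def count_by_tier(root: Element) -> Dict[Tier, int]:
    """Count the elements found at each tier below (and including) root."""
    counts = {tier: 0 for tier in Tier}
    stack = [root]
    while stack:
        node = stack.pop()
        counts[node.tier] += 1
        stack.extend(node.children)
    return counts


# Hypothetical program built around names from the Concatenate example.
program = Element("Concatenate_Program", Tier.PROGRAM)
package = program.add(Element("Command_Line_Processor", Tier.PACKAGE))
package.add(Element("Next_Input_File", Tier.SUBPROGRAM))
main = program.add(Element("Concatenate", Tier.SUBPROGRAM))
main.add(Element("S1", Tier.STATEMENT))
main.add(Element("S2", Tier.STATEMENT))

counts = count_by_tier(program)
print({tier.name: n for tier, n in counts.items()})
```

A tree is only one possible encoding; the IR described below uses graphs at each tier because call and rendezvous relations are not hierarchical.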
4.1.1 The Task, Package/Class and Subprogram Tiers
At tier 2 are tasks (independent threads of control), which may share resources, and are permitted to run concurrently. The task tier is represented in the IR by the task rendezvous graph, a directed graph $TRG = (V, E)$, wherein a vertex $v \in V$ denotes a task object, $f(v)$, and an edge $(x, y) \in E$ indicates that the code of task object $f(x)$ initiates a rendezvous with an entry provided by task object $f(y)$.

Tier 3 is composed of modules with multiple entry points (as in CMS-2), ADT packages and generic instances (as in Ada, Modula, and Clu) and instances of object classes and templates (as in C++, Smalltalk and Eiffel). Tier 3 is modeled by a directed graph $CGRAPH_P = (V, E)$, where a vertex $v \in V$ denotes a module instance, $f(v)$, and an edge $(x, y) \in E$ indicates that the code of instance $f(x)$ calls some subprogram(s) provided by instance $f(y)$.
The elements of tier 3 are implemented in terms of subprograms (or methods), the tier 4 elements. At the granularity of the subprogram, a directed graph $CGRAPH_S = (V, E)$ is used to represent the call relationships by letting each vertex $m \in V$ denote a subprogram $f(m)$, and each edge $(m, n) \in E$ indicate that the code of subprogram $f(m)$ calls subprogram $f(n)$.

It is possible for a subprogram to initiate a rendezvous with tasks or to call subprograms exported by packages, in addition to calling other subprograms. Likewise, in addition to rendezvousing with other tasks, tasks may call subprograms. Similarly, packages may contain calls to subprograms and rendezvouses with task entries. Thus, the IR contains the call-rendezvous graph (CRG), which combines the vertices and edges of $TRG$, $CGRAPH_P$, and $CGRAPH_S$, and inserts directed edges representing calls from tasks to subprograms and packages, and indicating rendezvous initiations from subprograms and packages to tasks. A sample CRG is given in Figure 3.

Construction of the IR at tiers 2, 3 and 4 is performed by the toolset as follows. Since the use relation among program units (tasks, packages, and subprograms) is explicitly indicated in the source code, call graphs are constructed during program parsing. During parsing, information is collected by action routines inserted within the parser production rules. From this information, a symbol table is built for each program unit; the table contains, besides other relevant information, the call list for that unit. The complete application call graph (CG) is the union of all the individual units' call lists. Because rendezvous are marked by language constructs and keywords that appear in the source code, the task rendezvous graph is also constructed during program parsing. Given the call graph and the task rendezvous graph, the call-rendezvous graph is constructed as the union of the two.
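The union construction described above can be sketched in a few lines of Python. This is an illustrative sketch only: the edge encoding, the function names and the unit names (Tracker_Task, Filter_Pkg, Smooth, Display_Task, Logger_Task) are hypothetical, and the per-unit call lists stand in for what the parser's action routines would record in the symbol tables.

```python
from typing import Dict, Iterable, List, Set, Tuple

# An edge is (source unit, target unit, kind); kind is "call" or "rendezvous".
Edge = Tuple[str, str, str]


def call_graph(call_lists: Dict[str, Iterable[str]]) -> Set[Edge]:
    """Union of the per-unit call lists, expressed as directed 'call' edges."""
    return {(unit, callee, "call")
            for unit, callees in call_lists.items()
            for callee in callees}


def rendezvous_graph(entry_calls: Dict[str, Iterable[str]]) -> Set[Edge]:
    """Directed 'rendezvous' edges from each initiating unit to the called task."""
    return {(unit, task, "rendezvous")
            for unit, tasks in entry_calls.items()
            for task in tasks}


def call_rendezvous_graph(cg: Set[Edge], trg: Set[Edge]) -> Set[Edge]:
    """The CRG is the union of the call graph and the rendezvous graph."""
    return cg | trg


def successors(crg: Set[Edge], unit: str) -> List[Tuple[str, str]]:
    """Sorted (target, kind) pairs reachable from unit in one step."""
    return sorted((dst, kind) for src, dst, kind in crg if src == unit)


# Hypothetical symbol-table contents: unit name -> call list / entry-call list.
call_lists = {
    "Tracker_Task": ["Filter_Pkg"],
    "Filter_Pkg": ["Smooth"],
    "Smooth": [],
}
entry_calls = {
    "Tracker_Task": ["Display_Task"],
    "Smooth": ["Logger_Task"],
}

crg = call_rendezvous_graph(call_graph(call_lists), rendezvous_graph(entry_calls))
print(successors(crg, "Tracker_Task"))
```

Tagging each edge with its kind keeps the two relations distinguishable after the union, which is what allows a viewer to draw calls and rendezvous differently.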
4.1.2 The Statement/Instruction Tier
At tier 5, several important features are captured in the IR. In particular, the control flow graph (CFG) is extracted to represent the sequential flow of control dictated by the statements in the source code. In addition, graphs that represent statement-level precedence relations due to control dependences, data dependences, and code dependences are captured. Dependence graphs represent program statements as nodes and use edges to denote statement ordering implied by the dependences in a source program. In the data dependence graph (DDG) a directed edge denotes a data dependence (destination and source nodes need the same value). The instance dependence graph (IDG) uses undirected edges to denote instance dependences (which occur when two statements use operations exported by the same instance). The subprogram dependence graph (SDG) uses an undirected edge to denote when two statements use the same subprogram. A directed edge in the control dependence graph (CDG) denotes that execution of the destination statement depends on a decision made by the source statement. The union of the DDG, IDG, SDG and CDG forms the General Dependence Graph (GDG).

[Figure 3 diagram: a sample call-rendezvous graph (CRG); the legend distinguishes task, package and subprogram nodes, and call and rendezvous edges.]
Figure 3: A call-rendezvous graph.

    WITH Output_File, Input_File, Command_Line_Processor, Error_Log;
    WITH Text_IO;
    USE Command_Line_Processor;

    PROCEDURE Concatenate IS
       Output_File_Id : Output_File.File_Type;
       Input_File_Id : Input_File.File_Type;
       Input_File_Name_Length, In_Line_Length : INTEGER;
       Input_File_Name, In_Line : STRING(1..400);
       Get_Y_Or_N : Character;
       Command_Line_Option : Command_Line_Processor.Command_Line_Options_Type;
       Copy_File : BOOLEAN;
    BEGIN
       Output_File.Create(Output_File_Id, Command_Line_Processor.Output_File_Name);  -- S1
       Error_Log.Open("");  -- S2
       Command_Line_Option := Command_Line_Processor.Command_Line_Options;  -- S3
       WHILE NOT (Command_Line_Processor.Is_End_Of_File_List) LOOP  -- S4
          Command_Line_Processor.Next_Input_File(Input_File_Name, Input_File_Name_Length);  -- S5
          Copy_File := TRUE;  -- S6
          IF (Command_Line_Option = Inquire) OR (Command_Line_Option = Inquire_Pager) THEN  -- S7
             Text_IO.Put("Copy File: " & Input_File_Name(1..Input_File_Name_Length) & " (Y/N) ?");  -- S8
             Text_IO.Get(Get_Y_Or_N);  -- S9
             IF (Get_Y_Or_N /= 'Y') AND (Get_Y_Or_N /= 'y') THEN  -- S10
                Copy_File := FALSE;  -- S11
             END IF;
          END IF;
          IF Copy_File THEN  -- S12
             Input_File.Open(Input_File_Id, Input_File_Name(1..Input_File_Name_Length));  -- S13
             IF (Command_Line_Option = Pager) OR (Command_Line_Option = Inquire_Pager) THEN  -- S14
                Output_File.Put_Line(Output_File_Id, "--::::::::::::");  -- S15
                Output_File.Put_Line(Output_File_Id, "--" & Input_File_Name(1..Input_File_Name_Length));  -- S16
                Output_File.Put_Line(Output_File_Id, "--::::::::::::");  -- S17
             END IF;
             Input_File.Get_Line(Input_File_Id, In_Line, In_Line_Length);  -- S18
             WHILE NOT Input_File.End_Of_File(Input_File_Id) LOOP  -- S19
                Output_File.Put_Line(Output_File_Id, In_Line(1..In_Line_Length));  -- S20
                Input_File.Get_Line(Input_File_Id, In_Line, In_Line_Length);  -- S21
             END LOOP;
             Input_File.Close(Input_File_Id);  -- S22
          END IF;
       END LOOP;
       Output_File.Close(Output_File_Id);  -- S23
    END Concatenate;

Figure 4: The Concatenate procedure.

[Figure 5 diagram: the GDG over nodes S0 to S23 of the Concatenate procedure, together with region nodes R41, R71, R121, R141 and R191.]
Figure 5: The GDG of the Concatenate procedure.
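The union that forms the GDG can be sketched as follows. This is a minimal Python illustration, not the toolset's implementation: directed edges are encoded as tuples and undirected edges as frozensets (an assumption of this sketch), region nodes are omitted, and the handful of edges was picked by hand from the Concatenate listing rather than taken from Figure 5.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Hashable, Iterable, Set


def undirected(a: str, b: str) -> FrozenSet[str]:
    """Encode an undirected edge so that (a, b) and (b, a) compare equal."""
    return frozenset((a, b))


def build_gdg(ddg: Iterable[Hashable], idg: Iterable[Hashable],
              sdg: Iterable[Hashable], cdg: Iterable[Hashable]) -> Dict[Hashable, Set[str]]:
    """Union of the four dependence graphs; each edge keeps the kinds it came from."""
    gdg: Dict[Hashable, Set[str]] = defaultdict(set)
    for kind, edges in (("DDG", ddg), ("IDG", idg), ("SDG", sdg), ("CDG", cdg)):
        for edge in edges:
            gdg[edge].add(kind)
    return dict(gdg)


# Hand-picked edges (illustrative only).
ddg = {("S5", "S13"), ("S6", "S12")}                # Input_File_Name, Copy_File
cdg = {("S4", "S5"), ("S4", "S6"), ("S12", "S13")}  # loop and if bodies
idg = {undirected("S1", "S3"),                      # both use Command_Line_Processor
       undirected("S18", "S21")}                    # both use Input_File
sdg = {undirected("S18", "S21")}                    # both call Input_File.Get_Line

gdg = build_gdg(ddg, idg, sdg, cdg)
for edge, kinds in gdg.items():
    print(sorted(edge), sorted(kinds))
```

Keeping the originating kinds on each edge means the DDG, IDG, SDG and CDG can each be recovered from the GDG by filtering, which is convenient when a viewer displays one dependence type at a time.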
For example, an Ada procedure is listed in Figure 4, and the corresponding GDG is shown in Figure 5 (solid lines denote DDG and CDG edges, and dashed lines indicate IDG and SDG edges). GDGs are constructed by defining, for each subprogram, a Statement Table. For example, the statement table for procedure Concatenate of Figure 4 is shown in Figure 6. Note that the statement table contains the following attributes for each statement:
1. Statement Type indicates the type of the statement (e.g., method call, if-then-else, or while loop).
2. Dependence Nesting Level keeps track of the number of region nodes on the path from the root to the statement. (A region node is defined as a virtual node which has a zero execution time. A region node is used to group all the nodes that have a dependence relation on the same node, by forcing all those nodes to depend on the region node, and letting the region node depend on the single node.)
3. Address is a line number in the source code.
4. ADT Instances Used is the set of ADT instances directly used by the statement.
5. Parameter List is the set of variables used by the statement.
6. Child points to a statement table containing all its dependents in another dependence level. If a statement has more than one group of statements depending on it, a Child field is created for each of the groups. This occurs when the statement is an if-statement, case, or loop, which spawn multiple execution paths. For all statements which do not create multiple branches, the child field is just a null pointer.

[Figure 6 table: statement tables for statements S1 to S11 of Concatenate, with the columns Statement Label, Statement Type, Dependence Nesting Level, Address, ADT Instances Used, Parameter List and Child. For instance, S1 is a call at nesting level 1 and address 1 that uses instances CLP and OF, with parameters Output_File_Id and CLP.Output_File_Name. Legend: CLP = Command_Line_Processor, OF = Output_File.]
Figure 6: The Statement Table of the Concatenate procedure.
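The six attributes listed above map naturally onto a record type. The following Python sketch is one hypothetical encoding: the class and field names are assumptions of this example, and the four rows are illustrative values in the spirit of Figure 6 rather than a transcription of it.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class StatementEntry:
    label: str
    statement_type: str
    nesting_level: int
    address: int
    adt_instances: List[str] = field(default_factory=list)
    parameters: List[str] = field(default_factory=list)
    # One child table per group of dependent statements; empty means "null".
    children: List["StatementTable"] = field(default_factory=list)


@dataclass
class StatementTable:
    entries: List[StatementEntry] = field(default_factory=list)


def walk(table: StatementTable) -> Iterator[StatementEntry]:
    """Visit every entry depth-first, following the Child pointers."""
    for entry in table.entries:
        yield entry
        for child in entry.children:
            yield from walk(child)


# Illustrative rows: the body of the while loop (S4) lives in a child table.
loop_body = StatementTable([
    StatementEntry("S5", "call", 2, 5, ["CLP"],
                   ["Input_File_Name", "Input_File_Name_Length"]),
    StatementEntry("S6", "assign", 2, 6, [], ["Copy_File", "TRUE"]),
])
top = StatementTable([
    StatementEntry("S3", "assign", 1, 3, ["CLP"],
                   ["Command_Line_Option", "CLP.Command_Line_Option"]),
    StatementEntry("S4", "while", 1, 4, ["CLP"],
                   ["CLP.Is_End_Of_File_List"], [loop_body]),
])

labels = [entry.label for entry in walk(top)]
deepest = max(entry.nesting_level for entry in walk(top))
print(labels, deepest)
```

An if-statement with both a then-group and an else-group would simply carry two tables in its children list, matching the rule that a Child field is created for each group of dependents.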
4.2 The Metrics Tool
As indicated in Figure 7, metrics are computed by the tools to assess object-orientedness, communication, concurrency, maintainability, and quality. The object-orientedness metric is measured by considering the properties of intra-component cohesion, inter-component coupling, information hiding and encapsulation; these properties are computed by considering information contained in SymTab and StmtTab. Communication and concurrency among program units (subprograms, packages, and tasks) are computed by considering SymTab, StmtTab, CRG, CDG, DDG and CFG; these metrics are represented as edge weights in a CRG. Maintainability is computed as a function of object-orientedness and size, and quality is computed as a function of object-orientedness and maintainability. The tools also compute Halstead size and volume metrics [8], and the McCabe complexity metric [12].
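For readers unfamiliar with the two classical metrics mentioned above, the following Python sketch computes them from their standard definitions: the cyclomatic number $V(G) = E - N + 2P$ of a control flow graph [12], and the Halstead length $N = N_1 + N_2$ and volume $V = N \log_2 n$ with vocabulary $n = n_1 + n_2$ [8]. The function names and the small CFG are hypothetical, and the sketch says nothing about how the toolset itself tokenizes Ada source.

```python
import math
from typing import Iterable, Sequence, Tuple


def mccabe_complexity(edges: Iterable[Tuple[str, str]], components: int = 1) -> int:
    """Cyclomatic number E - N + 2P, with nodes inferred from the edge list."""
    edge_list = list(edges)
    nodes = {node for edge in edge_list for node in edge}
    return len(edge_list) - len(nodes) + 2 * components


def halstead_length_and_volume(operators: Sequence[str],
                               operands: Sequence[str]) -> Tuple[int, float]:
    """Return (length N, volume N * log2(n)) for the given token occurrences."""
    length = len(operators) + len(operands)
    vocabulary = len(set(operators)) + len(set(operands))
    if vocabulary == 0:
        return 0, 0.0
    return length, length * math.log2(vocabulary)


# Hypothetical CFG: a while loop whose body contains one if-statement.
cfg_edges = [
    ("entry", "while"), ("while", "body"), ("body", "if"),
    ("if", "then"), ("if", "join"), ("then", "join"),
    ("join", "while"), ("while", "exit"),
]
complexity = mccabe_complexity(cfg_edges)

# Tokens of "Copy_File := TRUE;" and "Copy_File := FALSE;".
length, volume = halstead_length_and_volume(
    [":=", ":="], ["Copy_File", "TRUE", "Copy_File", "FALSE"]
)
print(complexity, length, volume)
```

The example CFG has two decision points (the loop test and the if), so its cyclomatic number is 3; the two assignments contain 6 token occurrences over a vocabulary of 4, giving a volume of 12.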
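[Figure 7 diagram: using IR1 (CFG, SymTab, StmtTab, CRG and GDG), process 1.7.1 computes McCabe Complexity (MCC), process 1.7.2 computes Halstead metrics (H), process 1.7.3 computes Object-Orientedness (OO), process 1.7.4 computes Communication (Comm) and process 1.7.5 computes Concurrency (Conc). Process 1.7.6 derives Maintainability (Maint) from H and OO, and process 1.7.7 derives Quality (Qual) from OO and Maint. Together these form the Metrics for Legacy System.]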
Figure 7: The metrics computation process.
4.3 Graphical User Interface to Reverse Engineering Tools
To assist with understanding of legacy systems, an X-window graphical user interface (GUI) has been produced as part of the toolset. Figure 8 illustrates some of the GUI windows. The "main" window of the GUI allows selective viewing of call graphs, task rendezvous graphs, call-rendezvous graphs, source code, data dependence graphs, control dependence graphs, control flow graphs and metrics. The graph window shown in the figure is a call-rendezvous graph (CRG). The metrics selection window is also shown, as well as the window for the information hiding metric.
5 Navigating Intermediate Representations via Hypermedia
A prerequisite for successful reengineering is an understanding of the software system across different levels of abstraction. The information contained in the intermediate representation (IR) produced by the tools helps to obtain such an understanding, but it is helpful to the human reengineer only after the rudiments of the software have been comprehended. The reason for this is that features of entities on a particular tier are represented in separate graphs. A similar observation holds for information captured on different tiers. To address this situation, we have designed a hypermedia-based processor for the reverse engineering tools; it retains the contextual and pragmatic relationships among the different pieces of information in the intermediate representation.

Figure 8: The graphical user interface (GUI).

When trying to understand a program, it is often necessary to switch between different levels of abstraction. One might, for example, start at a high level, looking at the main components of the program, and the way they are integrated. In order to understand the role of one of the components more clearly, a more detailed look at it is necessary, without losing the information displayed previously. Such techniques are referred to as 'zooming'. For example, an engineer may begin with a view of the directed call graph at the package/class level. A package or class can be inspected more closely by opening another window displaying its contents. In order to keep track of the hierarchical relationships between windows, it is useful to draw thin lines between the detailed and the abstract views of the package, as shown in Figure 9. Folding editors or outline modes make use of these techniques, and can be adapted for reverse engineering with moderate effort.

Linking textual and graphical representations of components enhances understanding of a program. One technique applies operations like highlighting to both representations simultaneously, thus emphasizing the correspondence of particular elements in the different representations. Going one step further, the inspection of one component in one representation triggers the display of that component in the other representation as well. Figure 9 sketches the integration between graphical and textual representations.

[Figure 9 diagram: a package-level graph (Package1 to Package7) with a zoomed view of Package1 showing its interface (types, procedure1, procedure2); the zoomed view is cross-linked to the source code and to the GDGs, with statements Si and Sj marked in each view.]
Figure 9: Zooming to get detailed views, and cross-linking the same object in different views.

The integration of different views can be applied on various levels of abstraction. On the statement/instruction tier, the picture becomes more complicated, since different kinds of dependence graphs show different kinds of relationships between statements. The help provided by a method like highlighting corresponding statements in different views becomes even more meaningful. On the same level in the textual mode, hypermedia can provide within the source code the information residing in the edges of the respective dependence graph. Note, however, that in this case the information is not visually displayed; it can only be exploited by following the respective link to where it points. Care must also be taken to distinguish the different dependences that may exist for one statement; this can be done via color-coding, or via pop-up menus for selecting the desired dependence. Following such a link can be done by ensuring that as the focus of the current view changes, the position in the source code to which the link points becomes the focus.

Especially on the lower tiers of the software architecture, the wealth of information provided in the intermediate representation can be very helpful for understanding the intricacies of the program. However, it can also present a major challenge to the engineer. One reason for this is that there are separate graphs displaying different dependences, but all the graphs refer to one program. Some particular elements will be present in different graphs, possibly in different arrangements, and thus can be difficult to identify. The following techniques can help the user to maintain focus throughout the inspection of one particular part of the program, while possibly viewing different types of dependence graphs:
Cross-Linking: Extend highlighting by using different colors or other visual cues. The different graphs are still displayed separately, but the system helps with identification of specific elements in separate graphs.
Uniform View: Use the set of all nodes within a procedure as the basis for the graphical display, and provide the user with a choice of link types corresponding to the different dependences. This allows the simultaneous display of several dependences, using colors or line thickness to distinguish between the dependences.
Overlays: Another way of inspecting different dependence aspects of a program is to use one dependence graph as the starting point, and to "blend in" the edges and/or nodes representing another type of dependence. This way one can study a particular aspect of a program, and add another one without having to reorient because the arrangement of the nodes has changed.
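The three techniques above can be illustrated with a small Python sketch that works purely on edge lists and leaves rendering aside. Everything here is hypothetical: the function names, the color assignments, and the sample edges (hand-picked from the Concatenate listing) are assumptions of the example, not features of the designed hypermedia processor.

```python
from typing import Dict, List, Sequence, Set, Tuple

Edge = Tuple[str, str]
Drawable = Tuple[str, str, str, str]  # (source, target, dependence kind, color)

# Hypothetical color coding for the four dependence types.
COLORS = {"CDG": "black", "DDG": "blue", "IDG": "green", "SDG": "orange"}


def uniform_view(nodes: Set[str], graphs: Dict[str, Set[Edge]],
                 selected: Sequence[str]) -> List[Drawable]:
    """Edges of the selected dependence types over one fixed set of nodes."""
    drawable: List[Drawable] = []
    for kind in selected:
        for src, dst in sorted(graphs[kind]):
            if src in nodes and dst in nodes:
                drawable.append((src, dst, kind, COLORS[kind]))
    return drawable


def cross_links(graphs: Dict[str, Set[Edge]], statement: str) -> List[str]:
    """Names of the graphs in which a statement appears (for highlighting)."""
    return sorted(kind for kind, edges in graphs.items()
                  if any(statement in edge for edge in edges))


nodes = {"S4", "S5", "S6", "S12", "S13"}
graphs = {
    "CDG": {("S4", "S5"), ("S4", "S6"), ("S12", "S13")},
    "DDG": {("S5", "S13"), ("S6", "S12")},
    "IDG": set(),
    "SDG": set(),
}

base = uniform_view(nodes, graphs, ["CDG"])            # start from one graph
overlay = uniform_view(nodes, graphs, ["CDG", "DDG"])  # blend in a second one
print(cross_links(graphs, "S5"))
print(overlay)
```

Because the node set is fixed and the base edges are emitted first, blending in a second dependence type only appends edges; the arrangement computed for the first view does not have to change.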
6 Related Work
The authors have been involved in two efforts [19] to reengineer portions of the AEGIS Weapon System from CMS-2 to Ada, and to migrate from militarized AN/UYK-43s to commercial workstations. These projects were performed for two primary reasons: to aid in the refinement of a process for reengineering control systems, and to provide proven algorithms for an experimental open system hardware and software environment (HiPer-D) directed at defining the future architecture and functionality of Navy ship computer systems. The first project involved the reengineering of Weapon Selection, a critical module that employs a rule-based approach to evaluate tactical situations and recommends when and how targets should be engaged. The second project reengineered the Surface Operations module, which is responsible for making recommendations about steering a vessel to reach or to avoid other vessels. Additional details on this project can be found in several publications, which are organized by topic below:
metrics and analysis techniques: [33, 26, 35, 29, 24]
case studies and methodology: [32, 19, 31]
partitioning and mapping: [24, 26, 28]
real-time and concurrency: [30, 25, 11, 27, 35, 36, 22]
Related work has also been performed within other projects. In [3], an approach is presented for capturing abstractions inherent in software systems and for transforming those abstractions into an object-oriented paradigm; the focus was not on concurrency, but large-scale systems were considered. The consideration of concurrency is proposed in [13], by considering the translation of operating system calls into Ada constructs. Techniques and tools have been developed for source-to-source translation of program code [18, 1]; these tools are pragmatic, allowing a reengineered system to become operational quickly, but they do not attempt significant transformation. Additionally, several techniques and tools have been developed to perform basic dependence analysis, including the Xinotech program composer [34], a tool and language independent IR developed by MITRE [17], and Refine [15], which performs reverse engineering of code written in Fortran, Cobol, C and Ada. However, none of these tools attempts to perform the analysis required for enhancement of concurrency and object-orientedness, or for partitioning and mapping. Other techniques and tools for dependence analysis are presented in [6, 16, 4]. A hierarchical approach to reverse engineering was taken in [9], but the levels of the hierarchy were not based on granularity, as in our model, but consisted of implementation, structure, function and domain levels.
7 Conclusions
This paper describes a comprehensive process for the reengineering of computer-based systems. It considers the entire system, not just software. The robustness of the process is seen by noting that it encompasses all major phases necessary for deploying a reengineered legacy system, not just the phases of reverse engineering and translation of software. A major strength of the process is that it has been applied to components of the AEGIS Weapon System. Furthermore, a reengineering analysis toolset for constructing the IR of software architectures is presented. Building on the IR, the paper presents a metrics-based approach to reverse engineering. Ongoing work includes the application of the reengineering process to increasingly complex portions of the AEGIS Weapon System, application of the process to other computer-based systems, and automation of the transformation and configuration phases of the reengineering process.
8 Acknowledgements
This work is supported in part by The U.S. NSWC (under contracts N60921-93-M-1912, N60921-94-M-G096, N60921-94-M-2555, N60921-94-M-1960, N00178-95-R-2007, and N60921-95-M-0311), by the U.S. ONR (under contract N00014-92-J-1367), and by the State of New Jersey (under contract SBR-421290). The authors would like to thank Bob Harrison, Judy Haney, Bill Farr, Harry Crisp, Cuong Nguyen, Mrinalini Lankala, Onno van Roosmalen, Frank Calliss, and the anonymous reviewers for discussions and comments that have contributed to the quality of the work presented in this paper.
References [1] G. Arango et al., \Maintenance and Porting of Software by Design Recovery," Proceedings of The Conference on Software Maintenance, pages 42-49, IEEE CS Press, 1985.
[2] R. A. Ballance, A. B. Maccabe, and K. J. Ottenstein. The Program Dependence Web: A Representation Supporting Control-, Data-, and Demand-Driven Interpretation of Imperative Languages. In Proceedings of the ACM SIGPLAN'90 Conference on Programming Language Design and Implementation, pages 257{271. ACM, June 1990.
[3] T. J. Biggersta, \Design Recovery for Maintenance and Reuse," IEEE Computer,
22(7),
July
1989. [4] C. Castells-Scho eld, \Engineering a Language-Independent Approach to Parsing for Analysis and Testing," Vitro Tech. Journal, 8(1), 1990. [5] R. Cytron, J. Ferrante, B. K. Rosen, and M. N. Wegman. Eciently Computing Static Single Assignment Form and the Control Dependence Graph. ACM Trans. on Programming Languages and Systems, 13(4), pages 451{490, October 1991.
25
[6] S. Dietrich and F. Calliss, \A Conceptual Design for a Code Analysis Knowledge Base," Software Maintenance: Research and Practice, 4, 1992.
[7] J. Ferrante, K. J. Ottenstein, and J. D. Warren, "The Program Dependence Graph and its Use in Optimization," ACM Trans. on Programming Languages and Systems, 9(3), pages 319-349, July 1987.
[8] M. Halstead, Elements of Software Science, North Holland, 1977.
[9] M. Harandi and J. Ning, "Knowledge-Based Program Analysis," IEEE Software, 7(1), 1990.
[10] M. J. Harrold, B. A. Malloy, and G. Rothermel, "Efficient Construction of Program Dependence Graphs," Technical Report 92-128, Clemson University, December 1992.
[11] Franz Kurfess, Xavier Pandolfi, Z. Belmesk, Wolfgang Ertel, Reinhold Letz, and Johannes Schumann, "PARTHEO and FP2: Design of a Parallel Inference Machine," in P. C. Treleaven, editor, Parallel Computers: Object-Oriented, Functional and Logical, chapter 9, pages 259-297, Wiley, Chichester, 1989.
[12] T. J. McCabe, "A Software Complexity Measure," IEEE Transactions on Software Engineering, 2(6), Dec. 1976, pages 308-320.
[13] N. Prywes, G. Ingargiola, I. Lee and M. Lee, "Reengineering Concurrent Software into Ada," Proceedings of The Fourth Systems Reengineering Technology Workshop, pages 157-177, Naval Surface Warfare Center, February 1994.
[14] B. Ravindran, "Extracting Parallelism at Compile-Time through Dependence Analysis and Cloning Techniques in an Object-Based Paradigm," M.S. Thesis, New Jersey Institute of Technology, May 1994.
[15] Reasoning Systems, Palo Alto, CA, "Refine Language Tools," 1993.
[16] C. Rich and R. Wills, "Recognizing a Program's Design: A Graph-Parsing Approach," IEEE Software, 7(1), 1990.
[17] H. Rubenstein, R. Piazza, and S. Roberts, "Separating Parsing and Analysis in Reverse Engineering Tools," Proceedings of the Working Conference on Reverse Engineering, May 1993.
[18] C. H. Sampson, "Translating CMS-2 to Ada," Proceedings of The Fourth Systems Reengineering Technology Workshop, pages 143-156, Naval Surface Warfare Center, February 1994.
[19] A. L. Samuel, E. Sam, J. A. Haney, L. R. Welch, J. Lynch, T. Moffit, and W. Wright, "Application of a Reengineering Methodology to Two AEGIS Weapon System Modules: A Case Study in Progress," Proceedings of The Fifth Systems Reengineering Technology Workshop, Naval Surface Warfare Center, February 1995.
[20] M. Sitaraman, L. R. Welch and D. E. Harms, "On Specification of Reusable Software Components," International Journal of Software Engineering and Knowledge Engineering, World Scientific, 3(2), June 1993, pages 207-229.
[21] H. M. Sneed, "Economics of Software Re-engineering," Software Maintenance: Research and Practice, John Wiley and Sons, 3(3), Sept. 1991, pages 163-182.
[22] R. A. Steigerwald and L. R. Welch, "Reusable Component Retrieval for Real-Time Applications," Proceedings of the First IEEE Workshop on Real-Time Applications, May 1993.
[23] K. J. Stein, "Aegis System Tested Successfully," Aviation Week and Space Technology, pages 36-40, April 7, 1975.
[24] J. P. C. Verhoosel, L. R. Welch, D. Hammer, and A. D. Stoyenko, "Assignment and Pre-Runtime Scheduling of Object-Based, Parallel Real-Time Processes," IEEE Symposium on Parallel and Distributed Processing, pages 638-645, Oct. 1994.
[25] J. P. C. Verhoosel, L. R. Welch, D. K. Hammer, A. D. Stoyenko, and E. J. Luit, "A Formal Deterministic Scheduling Model for Object-Based, Hard Real-Time Executions," Journal of Real-Time Systems, 8(1), January 1995.
[26] L. R. Welch, "Assignment of ADT Modules to Processors," Proceedings of the International Parallel Processing Symposium, pages 72-75, March 1992.
[27] L. R. Welch, A. D. Stoyenko, T. J. Marlowe, "Response Time Prediction for Distributed Periodic Processes Specified in CaRT-Spec," Control Engineering Practice, 3(5), May 1995, pages 651-664.
[28] L. R. Welch, A. D. Stoyenko and S. Chen, "Assignment of ADT Modules with Random Neural Networks," The Hawaii International Conference on System Sciences, pages II-546-555, Jan. 1993.
[29] L. R. Welch, "Cloning ADT Modules to Increase Parallelism: Rationale and Techniques," Fifth IEEE Symposium on Parallel and Distributed Computing, pages 430-437, December 1993.
[30] L. R. Welch, "A Parallel Virtual Machine for Programs Composed of Abstract Data Types," IEEE Transactions on Computers, 43(11), Nov. 1994, pages 1249-1261.
[31] L. R. Welch, A. L. Samuel, M. Masters, R. Harrison, M. Wilson and J. Caruso, "Reengineering Complex Computer Systems for Enhanced Concurrency and Layering," Journal of Systems and Software, 30(2), pages 45-70, July 1995.
[32] L. R. Welch, J. A. Haney, A. L. Samuel, R. D. Harrison, J. Lynch, M. W. Masters, T. Moffit, B. Ravindran, E. Sam, and W. Wright, "Reengineering of Legacy Systems: Toward an Automated Approach," Proceedings of The Fifth Systems Reengineering Technology Workshop, Naval Surface Warfare Center, February 1995.
[33] L. R. Welch, G. Yu, J. Verhoosel, J. A. Haney, A. L. Samuel, and P. Ng, "Metrics for Evaluating Concurrency in Reengineered Complex Systems," Annals of Software Engineering, 1(1), Spring 1995.
[34] Xinotech Research Inc., Minneapolis, MN, "The Xinotech Program Composer 2.0," 1992.
[35] G. Yu and L. R. Welch, "Program Dependence Analysis for Concurrency Exploitation in Programs Composed of Abstract Data Type Modules," Sixth IEEE Symposium on Parallel and Distributed Processing, pages 66-73, October 1994.
[36] G. Yu and L. R. Welch, "A Novel Approach to Off-line Scheduling in Real-Time Systems," Informatica, Special Issue on Parallel and Distributed Real-Time Systems, 19(1), pages 71-82, Feb. 1995.
[37] G. Yu, "Identifying and Exploiting Concurrency in Abstract Data Type-based Systems," PhD Thesis, New Jersey Institute of Technology, Sept. 1995.