SPACE: A Hardware/Software SystemC Modeling Platform Including an RTOS
J. Chevalier, O. Benny, M. Rondonneau, G. Bois, E. M. Aboulhamid and F.-R. Boyer
Electrical Engineering Department, École Polytechnique de Montréal
P.O. Box 6079, Succ. Centre-Ville, Montréal, Québec, Canada, H3C 3A7
Abstract
This work attempts to enhance the support of embedded software modeling with SystemC 2.0. We propose a top-down approach that first lets designers specify their application in SystemC at a high abstraction level through a set of connected modules, and then simulate the whole system. The application is then partitioned into two parts: software modules and hardware modules. Each partition can be connected to our platform, which includes a commercial RTOS executed by an ARM ISS that is itself scheduled by the SystemC simulator. Having separate simulators for the hardware and software partitions enables a timed co-simulation. One of our major contributions is that a module can easily be moved from hardware to software (and vice versa) to allow architectural exploration.
1 Introduction
There is a need for a system design language that describes the functionality of both software and hardware. It must allow the system to be defined first without making assumptions about the implementation, and then to be refined into the exact implementation with hardware and software components. In recent years, some efforts have been made to introduce object-oriented (OO) languages for system-level modeling. Object-oriented languages such as C++ reduce design and verification time by raising the abstraction level of system specifications, which gives better component reusability. Also, object-oriented libraries for hardware design are an acceptable proposition for managing the exponential growth in the complexity of embedded systems.
It appears today that SystemC is the leader in system-level modeling with C++. The SystemC approach consists of a progressive refinement of specifications. The design cycle starts with an abstract high-level untimed or timed functional (UTF/TF) representation that is refined to a bus-cycle accurate model and then to an RTL (Register Transfer Level) hardware model. SystemC is not a design methodology, but it does propose various layers of abstraction that are useful for specification capture in the early stages of a design flow.
One of the problems encountered with SystemC 2.0 is the lack of features to support embedded software modeling. Of course, at a high level of abstraction (e.g. UTF/TF), SystemC allows the use of a common language for software and hardware specifications, and the simulation of the whole system. However, during the simulation, the scheduler, which is responsible for determining which thread will run next, manages software and hardware threads identically. This means that systems with hard real-time constraints requiring an RTOS (Real-Time Operating System) based on a preemptive priority-based kernel cannot be modeled naturally. Such an RTOS provides a very useful abstraction interface between applications with hard real-time requirements and the target system architecture. As a consequence, the availability of RTOS models is becoming strategic inside H/S (hardware/software) co-design environments [BB03].
Thus, one of the main objectives of this paper is the development of a SystemC platform for architectural exploration, named SPACE (SystemC Partitioning of Architectures for Co-design of Embedded systems). SPACE enhances the support for embedded software modeling by encapsulating RTOS functionalities into an API (Application Programming Interface), allowing the use of a common language. The aim of the API is to translate the SystemC code into system function calls suitable for the operating system. In this work, µC/OS-II [Lab02] has been selected as an example, but the mapping of the API can easily be modified to incorporate other (commercial) operating systems such as VxWorks [httf] or QNX [htte]. As a consequence, the hardware and software modules can be specified with UTF/TF SystemC. Apart from the fact that software modules (threads) require priority levels, all the services offered by the RTOS can be called through existing SystemC functions.
Afterwards, software modules are compiled, and the resulting binary file (linked with the RTOS kernel) is executed on the processor. In SPACE, the processor is modeled by an ISS (Instruction Set Simulator) to which a SystemC wrapper has been added. The ISS and hardware modules are considered part of the (traditional) SystemC simulation. During the simulation, when the scheduler gives control to the ISS, the software thread that is ready to run and has the highest priority is executed. The communication between hardware modules and software modules is done through an abstract communication protocol using a TLM (Transactional Level Model). One of the main advantages of SPACE is the ease with which modules can be moved from hardware to software (or vice versa) during architectural exploration. Except for the thread priorities, which may need to be modified, we only need to recompile and simulate.
The paper is organized as follows. Section 2 discusses related work and underlines our objectives. In Section 3, we describe SPACE and its methodology. Section 4 describes in detail the embedded software environment, while Section 5 describes the communication channel interface. In Section 6, we show an application using SPACE and present simulation results. Finally, Section 7 presents the conclusion and the future work needed to increase the functionality of SPACE.
2 Related Works and objectives
Currently, in the majority of industrial projects, the choice of which parts of the future SoC (System-on-Chip) will be software and which will be hardware is made after the specification phase, following ad hoc methods that are often based on the designer's experience. Then, the development of the hardware part and the software part of the SoC is performed in two disjoint design flows. This is problematic because errors appear very late in the design process and modifying the hardware/software partitioning requires a huge amount of work. The reason is the lack of tools during the partitioning phase.
Several efforts were made to ease partitioning by making specification and simulation possible at system level, then refining the specification iteratively towards the final implementation. SpecC [GZD+00] and SystemC [OSC02] support such an approach inside a unified specification language based on C and C++, respectively. However, for some classes of applications modeled with SystemC, it is not currently possible to completely model the software behavior of the targeted architecture. Indeed, for the simulation of software modules, the SystemC simulator does not offer all the necessary functionalities, such as preemption or priority-based scheduling, generally present in any RTOS: a joint refinement of the software and hardware parts is thus a tedious task in SystemC 2.0. A possible area for consideration in extending the SystemC core language is to provide better software support [GLMS02]. Unfortunately, the specification for this future release is not yet available. Consequently, with SystemC 2.0, there are tools able to synthesize hardware modules [HLF02], but this is done at the expense of the software part. There are also methods to refine the software modules [GLMS02], based on POSIX [Gal95] threads, recognized as a soft real-time operating system (rather than hard real-time).
An RTOS model for SpecC was recently developed [GYG03]. RTOS modeling consists of abstracting operating system features at a high abstraction level. Their results have shown that the simulation overhead introduced by the RTOS model is negligible while providing accuracy. During the refinement, the RTOS model is replaced by a custom RTOS. Interface and software synthesis for a wide range of commercial RTOS targets is currently under development, but still based on SpecC and its methodology.
Several research efforts thus concentrate on ways to simultaneously simulate software and hardware in a realistic manner with SystemC (or with C++), in order to perform the partitioning with a better understanding of the system. To consider the communication aspects between the software and hardware modules during simulation [SG00], an adapter can be inserted. At a high level, this adapter stands in for the processor that will be selected in the final architecture. This makes it possible to simulate various architectures at a high level. Nevertheless, since each processor has its own behavior, an adapter is required for each processor. Also, concerning the scheduling of the software modules, no OS (Operating System) implementation is supported: only foreground/background applications.
Another possibility is to use an adapter that provides RTOS properties [CBG+02]. This allows several software modules to be scheduled sequentially, as an RTOS would do. It also allows fast simulation. However, since the differences between various architectures are not taken into account, and the granularity of scheduling is coarse, the simulation results obtained may not be precise enough for designers to make a confident partition choice. Moreover, modules simulating the software must be modified specifically: it is thus not possible to easily transfer modules from the hardware part to the software part (or vice versa) in order to test various configurations.
[HPSV02] proposes using SystemC code for the system-level specification and, after H/S partitioning, for the embedded software generation. Modules of the software part are then written in SystemC and can be simulated at a high level. A SystemC/RTOS interface that makes RTOS-based module scheduling possible is also presented. But again, they do not propose any environment to facilitate architectural exploration (further explanations will be provided in Section 4).
A possible refinement of [CBG+02] is to simulate the software interaction with the hardware more accurately, using an ISS. The ISS is generally a hardware module simulated by SystemC that accepts binary code obtained by cross-compiling the software modules. Several research efforts have already been undertaken to integrate an ISS with SystemC [ORHK02, BBB+02]. The results show that this integration is possible and that, by using an ISS already written in C/C++ (like those provided by GNU [httc]), a functional system can be obtained quickly. The resulting simulation is reliable and realistic because it depends on the actual architecture. Also, unlike the use of an ISS in VHDL, with SystemC the memory access method is simplified. This greatly decreases the number of delta cycles needed and significantly accelerates the simulation. On the other hand, these solutions focus more on the simulation aspects than on the partitioning methodology: the use of an ISS seems to take place after the partitioning phase. Furthermore, no proposal suggests the use of an RTOS making it possible to schedule several software modules on the ISS, so the programs being executed remain of the foreground/background type, which is today a substantial limitation in SoC designs.
In summary, no method currently exists that makes it possible to easily simulate various hardware/software configurations at a high level in order to obtain results leading to the optimum partition of a system. Two main conditions are required to reach this goal:
• The possibility of moving modules between the software part and the hardware part without changing the modules' code.
• A simulation of the whole system giving realistic results to validate or invalidate a partition choice.
The following section proposes a way to meet these two requirements: the use of an architecture simulated in SystemC integrating an ISS with an RTOS, and various communication mechanisms and interfaces.
3 SPACE and its methodology
Our proposed methodology makes it possible to obtain the "optimum" partition of a system based on simulation results. First, the system is specified at the functional level (UTF/TF) in SystemC, but not partitioned. The construction of the modules representing the system must follow some rules, so that it is later easy to move them from a hardware partition to a software partition (and vice versa). The methodology forces each module to use only thread constructs (i.e. no SC_METHOD) and to have a single advanced input/output port. The coding style of the modules is thus close to the behavioral style and is not significantly restrictive. The first step is to simulate the system in a purely functional form with a SystemC transactional model named the UTF Channel, in order to check that it respects the functional aspect of the specification.
Once the functionality has been validated, the next step is the partitioning stage. The modules tested previously are reused, without modification, and are placed in a simulation architecture, in either the software or the hardware part of the system. As mentioned, we have chosen to use an ISS in our architecture to jointly simulate the software part with the hardware part. Similarly to [PPB02], we use the ISS models of the GDB debugger [htta]. As a first experiment, we have chosen the ARM processor [httb]. It has been encapsulated (wrapped) in a SystemC module with clock, reset, IRQ and input/output data ports. A SystemC wait() statement sensitive to the clock signal was added in the main loop of the ISS to synchronize SystemC with the ISS clock. The memory access functions of the ISS were redirected towards the input/output data port. Communication is achieved through this port using the read() and write() functions. These functions receive two arguments: address and data. This gives a higher-level model of the memory bus and allows memory-mapped I/O. As the memory (through its decoder) is the interface of this port, read and write calls complete in a single delta cycle (unless we insert wait() to simulate a non-zero access time), which makes a fast simulation possible.
Modules placed in the software part of the architecture are thus cross-compiled, and the binary code is placed in a memory module to be executed by the ARM ISS. As we want to have several modules in the software part, we also added the core of an RTOS, µC/OS-II, which schedules the various modules. Moreover, as the modules are written in SystemC, an interface (API) has been created so that they can communicate with the RTOS. This RTOS and the interface are also cross-compiled with the modules, since they belong to the code executed by the ISS. The details concerning the RTOS and the interface are given in Section 4. A number of modules are present in the architecture to ensure the correct operation of the ISS and the RTOS, such as the timer module and the interrupt controller module. Interrupts are necessary to ensure preemption between the software modules.
Modules placed in the hardware part of the architecture are connected to the TF Channel through wrappers so that they can communicate with each other. Software modules communicate through the API. Moreover, we inserted a decoder module between the ISS and the memory. This makes it possible to connect the ISS to the channel through the decoder and a wrapper. Therefore, as shown in Figure 1, the software and hardware parts can communicate with each other.
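The redirection of the ISS memory accesses towards read() and write() calls that take an address and a data word can be pictured with the following sketch. It is plain C++ rather than SystemC, and the names `Target`, `Ram`, `ChannelPort` and `Decoder`, as well as the address ranges used with them, are assumptions made for illustration; they are not taken from the SPACE sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical interface of anything that sits behind the decoder.
struct Target {
    virtual ~Target() = default;
    virtual std::uint32_t read(std::uint32_t offset) = 0;
    virtual void write(std::uint32_t offset, std::uint32_t data) = 0;
};

// Simple RAM model: one 32-bit word per 4-byte offset.
class Ram : public Target {
public:
    explicit Ram(std::size_t words) : mem_(words, 0) {}
    std::uint32_t read(std::uint32_t offset) override { return mem_.at(offset / 4); }
    void write(std::uint32_t offset, std::uint32_t data) override { mem_.at(offset / 4) = data; }
private:
    std::vector<std::uint32_t> mem_;
};

// Stand-in for a memory-mapped port towards the channel: it only
// remembers the last word written to it.
struct ChannelPort : Target {
    std::uint32_t last = 0;
    std::uint32_t read(std::uint32_t) override { return last; }
    void write(std::uint32_t, std::uint32_t data) override { last = data; }
};

// Decoder: routes an (address, data) access to the target whose
// address range [base, base + size) contains the address.
class Decoder {
public:
    void attach(std::uint32_t base, std::uint32_t size, Target& target) {
        regions_[base] = Region{size, &target};
    }
    std::uint32_t read(std::uint32_t address) {
        std::pair<Target*, std::uint32_t> hit = find(address);
        return hit.first->read(hit.second);
    }
    void write(std::uint32_t address, std::uint32_t data) {
        std::pair<Target*, std::uint32_t> hit = find(address);
        hit.first->write(hit.second, data);
    }
private:
    struct Region { std::uint32_t size; Target* target; };

    // Returns the target and the offset inside its range.
    std::pair<Target*, std::uint32_t> find(std::uint32_t address) {
        auto it = regions_.upper_bound(address);
        if (it == regions_.begin()) throw std::out_of_range("unmapped address");
        --it;
        std::uint32_t offset = address - it->first;
        if (offset >= it->second.size) throw std::out_of_range("unmapped address");
        return {it->second.target, offset};
    }
    std::map<std::uint32_t, Region> regions_;
};
```

In this sketch an access completes as an ordinary function call, which mirrors the idea that a read or write returns without consuming simulated time unless a delay is added explicitly.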
[Figure 1 blocks. Software side: user modules on top of the SystemC API and the RTOS, executed by the ARM ISS, which is attached to a decoder, the RAM and a wrapper. Hardware side: user modules, each attached through its own wrapper. The wrappers connect to the TF Channel.]
Figure 1: User view of the platform
Main services offered by an embedded software environment [Lab02] | SystemC | RTOS
Manages time (low-priority tasks do not change system responsiveness) | No | Yes
Allows multitasking (do more than one thing at the same time) | Yes | Yes
Services (delays, semaphores, communication, synchronization...) | Partially | Yes
Response to real-time events | No | Yes
Prioritize the work to be done | No | Yes
Table 1: The lack of SystemC 2.0 regarding embedded software environment
4 Embedded Software environment
As mentioned previously, the SystemC scheduler treats software simulation the same way as hardware simulation. Our goal is to provide the main services offered by the real-time kernel of an RTOS. Table 1 shows these services, which are only partially available in SystemC 2.0. Rather than integrating an RTOS model directly in SystemC (similarly to [GYG03] with SpecC), the proposed SystemC API allows an RTOS to schedule software SystemC modules. It also allows a simulation of the software part that is very close to reality, very early in the partitioning process. Another advantage is the possibility of replacing this operating system with another one in order to satisfy the system's specifications.
4.1 Software Organization
The objective is to obtain a binary file that will be executed by the processor (initially an ISS) of the platform. Three parts constitute the binary code (from left to right in Figure 2):
• The user application written in SystemC: SystemC software modules.
• The SystemC API: in charge of 1) the initialization process, 2) the communication between the RTOS and the SystemC application, and 3) the communication between software and hardware modules (through the communication manager).
• The RTOS: contains the scheduler of the software tasks and the HAL [YJ03]. The HAL is the Hardware Abstraction Layer (which is processor dependent) used to control the ARM processor and the hardware architecture.
[Figure 2 blocks: user modules; SystemC API (initialization process, communication manager); RTOS (µC/OS-II) with its scheduler and HAL (ISR, context switch, stack init, timer tick, exception vectors, processor init). Calls mapped by the SystemC API onto the RTOS:
SC_CTHREAD() -> OSTaskCreate()
sc_start() -> OSStart()
wait() -> OSTimeDly()
sc_mutex.lock() -> OSMutexPend()
sc_mutex.unlock() -> OSMutexPost()
...]
Figure 2: Software environment to connect software SystemC module to the RTOS
The first RTOS to be integrated in our platform is µC/OS-II [Lab02]. It offers all the advantages of a real-time operating system: a preemptive kernel, a priority-based task scheduler and an interrupt system. µC/OS-II has been selected for its low complexity, the availability of its source code and because it has successfully been ported to a vast range of processors. In the rest of this section, the SystemC API and the commercial RTOS blocks of Figure 2 are detailed.
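The call mapping of Figure 2 can be read as a thin forwarding layer, which the sketch below illustrates. The functions in `mock_rtos` are logging stand-ins that only borrow the names of the kernel services; their simplified signatures are assumptions made here (the real µC/OS-II calls take further arguments such as task stacks and error codes). The `sw_api` namespace and `register_thread` are likewise hypothetical names, not the SPACE API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Logging stand-ins for the kernel services named in Figure 2.
// Signatures are deliberately simplified (assumption).
namespace mock_rtos {
inline std::vector<std::string>& call_log() {
    static std::vector<std::string> log;
    return log;
}
inline void OSTaskCreate(void (*task)(), std::uint8_t priority) {
    (void)task;  // the mock does not run the task
    call_log().push_back("OSTaskCreate prio=" +
                         std::to_string(static_cast<unsigned>(priority)));
}
inline void OSStart() { call_log().push_back("OSStart"); }
inline void OSTimeDly(std::uint16_t ticks) {
    call_log().push_back("OSTimeDly " +
                         std::to_string(static_cast<unsigned>(ticks)));
}
}  // namespace mock_rtos

// Hypothetical software-side layer offering SystemC-like entry points.
namespace sw_api {
struct Process {
    void (*entry)();
    std::uint8_t priority;
};
inline std::vector<Process>& processes() {
    static std::vector<Process> list;
    return list;
}
// Counterpart of an SC_CTHREAD declaration: remember the thread and
// the priority that the software side additionally requires.
inline void register_thread(void (*entry)(), std::uint8_t priority) {
    processes().push_back({entry, priority});
}
// Counterpart of sc_start(): one kernel task per registered thread,
// then hand control to the kernel scheduler.
inline void sc_start() {
    for (const Process& p : processes()) {
        mock_rtos::OSTaskCreate(p.entry, p.priority);
    }
    mock_rtos::OSStart();
}
// Counterpart of wait(): forwarded to the kernel delay service.
// Treating the argument directly as kernel ticks is an assumption.
inline void wait(std::uint16_t ticks) { mock_rtos::OSTimeDly(ticks); }
}  // namespace sw_api
```

With such a layer, the module source keeps calling SystemC-style functions, and only the library it is compiled against decides whether the SystemC simulator or the RTOS services the call.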
4.2 SystemC API
First, the SystemC API is responsible for the initialization process. This is a very important phase because it builds the entire software environment on which the application will be executed. First, the API uses the SystemC initialization mechanism to obtain the process list. Then, the initialization of the RTOS (µC/OS-II) is called. Each process triggers the creation of a µC/OS-II task. Finally, the RTOS scheduler starts.
Next, the SystemC API provides the mapping between the RTOS functions and the system functions proposed in SystemC 2.0. These functions manage processes, modules, interfaces, channels and communication ports. The work described in [HPSV02] uses a similar interface including an RTOS. The difference is that we focus on the partitioning mechanism in order to migrate modules between hardware and software seamlessly. To that end, we use a specific communication manager included in our SystemC API. Software modules are connected to a specific communication port and use it to send data to or receive data from other modules.
The SystemC API also provides a communication manager. It establishes the connection between software modules and the platform (hardware part). It answers task requests to communicate with other software or hardware modules. The construction of the modules representing the system follows a "standard", so that it is later easy to move them from a hardware partition to a software partition (and vice versa). Any communication with a module has to use a specific channel with specific methods: read() and write(). The aim of the software communication manager is to model communication between connected modules. This manager provides the same communication model as the hardware channel. A module connected to this manager is registered as a software module. Through the manager, a module can access peripherals directly (memory, timer, etc.) or can send a message to another software module or to a hardware module (Figure 3). In the last case, the manager communicates with a specific hardware module on the platform and then transfers the message to it. Similarly, modules can receive messages from software modules or hardware modules. To receive a message from a hardware module, an interrupt is triggered and then a routine is executed to receive the message. The message is then passed to the communication manager so that the software module can get it.
Finally, note that SystemC provides several data types in addition to those of C++. These data types are mostly adapted to hardware specification. In order to support hardware/software partitioning, the SystemC modules have to be functional in the hardware part as well as in the software part of the platform. The data types specific to hardware are therefore implemented in the software SystemC API to allow the simulation of the software part. This implies that all SystemC keywords are overloaded and all hardware-specific data types are redefined. For example, bit vectors are not used in software designs, but to allow partitioning, these types are included in the SystemC API.
[Figure 3 blocks. Software part (binary code executed by the ARM ISS): user modules, communication manager, SystemC API / RTOS, HAL, ISR (hardware message received). Hardware part: address decoder, ISS Adapter, IRQ manager (IRQ line), connection to the TF Channel.]
Figure 3: Software communication
4.3 The RTOS
µC/OS-II has been ported to an ARM processor. Our port of µC/OS-II for the ARM has been developed using the port available on the µC/OS-II website and the version provided by the ARM tools, which uses the µHAL library for firmware development. To build and debug the ARM binary, cross-tools (compiler, linker and debugger) from GNU [httc] are used. In order to be as independent of the hardware as possible, the operating system uses a software layer called the HAL. This layer allows the operating system to disregard the hardware architecture on which it runs by offering standard interface functions to the hardware architecture. This includes:
• Register management: used when the operating system performs a context switch, schedules the highest-priority task, or initializes task stacks;
• Memory mapping: addresses of the code RAM, data RAM and other devices;
• IRQ manager: these functions allow software to control interrupts from the timer and hardware modules;
• Timer: these functions allow software to manage the platform timer so that it periodically generates an interrupt.
5 Hardware support
To ensure that communications are preserved whichever modules we decide to make hardware and whichever software, each module has a unique ID number, given by the system designer. Communications work as on a network: data are encapsulated in a packet with a structured header that contains the sender's ID, the target's ID and the size of the message, so that they can be routed correctly. If software and hardware modules want to communicate with each other, a special device called the ISS Adapter is used. Other useful devices such as RAM, an interrupt manager and a timer are also provided with the platform. There is a difference between what we call Modules and Devices. Table 2 defines these blocks.
Blocks | Access method | Definition
Modules | ID numbers | Application-specific blocks, created by the designer, that can initiate transactions towards other modules or devices.
Devices | Address ranges | Slave blocks, provided with the platform, that can only respond to module requests.
Table 2: Blocks that can be connected on the UTF & TF Channels
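The packet format described in this section (sender's ID, target's ID and message size in a header, followed by the data) can be sketched as follows. The field widths, the `Router` class and its one-inbox-per-ID delivery are assumptions chosen for illustration; the text specifies only which fields the header contains.

```cpp
#include <cstdint>
#include <cstring>
#include <deque>
#include <map>
#include <vector>

// Header fields named in the text; the widths are assumptions.
struct PacketHeader {
    std::uint8_t sender_id;
    std::uint8_t target_id;
    std::uint16_t size;  // payload size in bytes
};

struct Packet {
    PacketHeader header;
    std::vector<std::uint8_t> payload;
};

// Wrap raw bytes of any type into a packet addressed by module ID.
inline Packet make_packet(std::uint8_t sender, std::uint8_t target,
                          const void* data, std::uint16_t size) {
    Packet p;
    p.header = {sender, target, size};
    p.payload.resize(size);
    if (size != 0) std::memcpy(p.payload.data(), data, size);
    return p;
}

// Hypothetical channel core: packets are queued per target ID, so the
// sender never needs to know whether the target is hardware or software.
class Router {
public:
    void send(const Packet& p) { inbox_[p.header.target_id].push_back(p); }

    // Non-blocking receive: returns false when nothing is pending.
    bool receive(std::uint8_t id, Packet& out) {
        auto it = inbox_.find(id);
        if (it == inbox_.end() || it->second.empty()) return false;
        out = it->second.front();
        it->second.pop_front();
        return true;
    }
private:
    std::map<std::uint8_t, std::deque<Packet>> inbox_;
};
```

Because routing depends only on the ID carried in the header, moving a module to the other partition leaves its ID, and therefore the code of its communication partners, unchanged.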
5.1 Abstraction level
Since an ISS is used in our architecture, a cycle-accurate hardware channel for timed simulations is an interesting feature. Even though our timed channel (TF Channel) is synchronous and may contain an arbiter, it remains at a functional level, because the simulation must be fast and must abstract RTL details. We also support a faster means of communication, named the UTF Channel, that performs a complete simulation of an application, before the partitioning phase, where all modules and devices are linked together.
Before describing our communication mechanisms in detail, it is important to discuss their level of abstraction. Four abstraction layers are recognized [HLWW02]. Our UTF Channel corresponds to level L-3 (Message layer), while our TF Channel best matches level L-1 (Transfer layer). At the Transaction layer (L-2), communications can be timed but are not cycle-accurate. If we wanted to perform a simulation at this level, we could add wait(simulation delay) calls to the user modules connected to our UTF Channel. As L-2 is not yet implemented in SPACE, we omit this abstraction level from our methodology for the moment. However, an application developed and tested at a high level of abstraction (L-3) can easily be integrated on the platform (L-1) without being refined through the intermediate level L-2. Compared to L-1, the main advantage of L-2 is a simulation speedup.
5.2 UTF Channel
The UTF Channel is useful at the highest level in the design process. At this level, it does not matter whether the modules described in SystemC will be implemented in software or in hardware. The goal of the UTF Channel is to allow a quick verification of an application. To reach this goal, communication between modules must take a reasonable amount of time, so a way of interconnecting them without suffering from a slow ISS is necessary. The way to achieve this is to keep the communication model as simple as possible and to focus on untimed message passing. This enables us to validate the system without the whole platform, i.e. without a bus protocol, a microprocessor and a real-time operating system. The communication channel implements a standard interface for all the modules, which connect to it through their single port. Data of arbitrary size and type can be exchanged between the modules using this interface of the channel, which supports both blocking and non-blocking communication calls in the same simulation.
5.3 TF Channel
Once an application has been designed at a high level with the UTF Channel, we propose to replace the communication mechanism with another one: the TF Channel. The main refinement from the UTF Channel to the TF Channel is that data transfers consume clock cycles. Hardware/software partitioning can be obtained following the analysis made with simulations on the platform, at the TF level. To evaluate multiple suitable partitions, one of the main selected measures is the number of clock cycles needed for a complete execution of an application. For an application already designed and verified on the UTF Channel, the user is able to test various software and hardware configurations. For each configuration, the user starts the simulation and stops it at one precise moment, then notes the number of clock cycles used at this time (for instance by using the function sc_simulation_time()). The hardware/software configuration judged to be the best is, in general, the one that satisfies the timing constraints while minimizing hardware. The software modules consume clock cycles while being executed, because the processor takes a precise number of execution cycles for each assembly instruction. The hardware modules are considered to be powerful computing units and consume clock cycles when they communicate or when the designer explicitly uses a wait() statement. Finally, area estimation could be performed using commercial tools [HLF02].
In SPACE, different communication models could be evaluated. For the moment, only the bus and the crossbar models have been implemented. To emulate bus behavior, processes wait for an amount of time proportional to the size of the message to be transmitted. Moreover, the bus must be used by only one process at a time, because it is a shared resource. This characteristic emulates a single bus for all the modules. To simulate the serialization of TF Channel requests and answers, an arbiter is used. In this way, there cannot be more than one process transferring data on the TF Channel at the same time. The protocol delays have been modeled, but kept very simple: only the transfer time and the arbitration time are calculated. Arbitration takes 1 clock cycle and a transfer takes 1 cycle per 32-bit chunk.
Figure 4 gives an overview of the platform's architecture. We can see that there are adapters between user modules and the TF Channel. These components give the modules an independent communication interface, thus making it possible to use a different bus protocol if desired. Adapters are also essential in our proposed architecture because they are used as message buffers. They can retain read() calls so that only write() calls travel on the TF Channel. This enhances performance since half of the TF Channel traffic is eliminated.
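The bus delay model stated above (1 cycle of arbitration, then 1 cycle per 32-bit chunk, with one transfer at a time) can be written down as a small cost function. This is a sketch in plain C++, not the TF Channel code: `transfer_cycles` and `BusModel` are hypothetical names, and rounding a partial chunk up to a full cycle is an assumption.

```cpp
#include <cstdint>

// Cycles consumed by one message: 1 for arbitration plus 1 per 32-bit
// chunk. A trailing partial chunk is rounded up (assumption).
constexpr std::uint64_t transfer_cycles(std::uint64_t size_bytes) {
    return 1 + (size_bytes + 3) / 4;
}

// Serialized shared bus: a request that arrives while another transfer
// is in flight starts only when the bus becomes free.
class BusModel {
public:
    // Returns the cycle at which the transfer completes.
    std::uint64_t request(std::uint64_t arrival_cycle, std::uint64_t size_bytes) {
        std::uint64_t start = arrival_cycle > free_at_ ? arrival_cycle : free_at_;
        free_at_ = start + transfer_cycles(size_bytes);
        return free_at_;
    }
private:
    std::uint64_t free_at_ = 0;  // first cycle at which the bus is idle
};
```

For example, under these assumptions a 16-byte message costs 1 + 4 = 5 cycles, and a second requester arriving during that transfer is delayed until it ends.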
6 An example and its simulation results
An example was developed to illustrate the benefit of SPACE and its methodology. One could imagine such an application for audio or video data processing. Figure 5 is a functional diagram of the example and illustrates the five constituent modules and their communication dependencies. Integer data is first produced by the Producer module (1), filtered by the Filter module (2) and stored in a buffer (reserved memory range) by the Mux module (3). Periodically, the Controller module wakes up and tells the Mux to store data somewhere else in the memory (4). Then the Controller asks the Analyzer module (5) to use previously stored data to produce a result from a simple calculation (6). The result is needed to adjust filter coefficients (7). After that, the Controller waits until its next execution period.
The example was written following our methodology. The application was first coded in SystemC, and the modules were connected to our high-level, untimed, fast-simulating communication channel (i.e. the UTF Channel). With negligible effort, the modules were then taken as is and connected to the SPACE platform. In that way we had a 100% hardware version of the application. Finally, a partitioned approach was tested, with the Controller and Analyzer modules constituting the software part, and the Producer, Filter and Mux modules constituting the hardware part.
[Figure 4 blocks: clk, timer, arbiter, irq_manager, armiss, iss_adapter, decoder, ram_code, ram_data, user modules with their module_adapter, device_adapter, TF Channel. Signals: n_reset, n_irq, n_fiq, n_irq_timer, n_irq_iss_adapter.]
Figure 4: General hardware architecture
[Figure 5 blocks: Producer, Filter, Mux, Controller, Analyzer, RAM. Labeled links: 32-bit data from Producer to Filter (1), from Filter to Mux (2) and from Mux to RAM (3); new destination address for incoming data (4); ask for analysis (5); Analyzer analyzes a memory region (6) and returns the analysis result; new filter coefficients (7).]
Figure 5: Example with the SPACE platform
Simulation results are presented in Table 3. For every version, 10240 integers (32 bits) were produced by the Producer module. Results show that simulation is very fast using the UTF Channel. This allowed us to debug and verify the functionality of the whole system. Despite the fact that this version is untimed, we used a SystemC wait() statement in the Producer thread to model a periodic data generator. The same strategy was used in the Controller to ensure a periodic execution. This is why the simulated time (in clock cycles) is non-zero using the UTF Channel. With the next version, at the TF level, communication between modules consumes clock cycles and we see that the simulation time is affected. The simulation was slower (42 s against 1 s in Table 3) because we had introduced a global clock that synchronizes every thread. On the other hand, this version provided more accurate simulated time results. Finally, the last version shows that both hardware and software threads can execute in parallel to provide interesting simulation results. This version was produced by compiling modules with our SystemC API and the µC/OS-II library for the ARM ISS. By moving the Controller and the Analyzer modules to the software side and keeping the Producer, the Filter and the Mux modules in the hardware part, similar performance (less than 20% more cycles than the hardware version) can be obtained. This also results in a much more reasonable simulated time increase. Many other partitions could be tested, and one that satisfies the timing constraints of the specification could be chosen as the final system.
Example versions | Simulated time (cycles) | Simulation time (sec)
UTF Channel | 2.3245 E06 | 1
100% hardware (TF) | 4.6912 E06 | 42
Partitioned system | 5.6262 E06 | 66
Table 3: Simulation results for the example (Pentium III 600 MHz, 128 MB of RAM)
7
Conclusion and future work
In this work, we have shown how our architecture, built around an ISS and an RTOS, links two abstraction levels and provides an easy and practical way to partition an application based on simulation results. It allows high-level simulation of hardware/software applications coded in SystemC that respects the behavior of both parts, is sufficiently precise to validate a partitioning choice, and allows application modules to be moved between partitions. Simulation results could give the system designer more information if we integrated a software profiler and estimators for hardware area and power consumption.
The use of an ISS is essential to support full RTOS functionality and to provide precise results. However, similarly to [CBG+ 02], we are currently looking for a software layer that could provide high-level software emulation and faster simulation. This layer could be incorporated into our methodology as an earlier step in the application refinement process.
The platform architecture is very simple for the moment, but we intend to try other RTOSs, processors and channel protocol models [PPB02, HLWW02], and to extend the architecture to support multiple processors. In particular, we are currently comparing our approach with the Disydent project [ha], and we are considering its multiprocessor RTOS, Mutek, as a candidate to be ported to SPACE. Running multiple ISSs in the same simulation may be realistic if we use a distributed simulation mechanism.
Finally, it could be interesting to incorporate existing high-level models of other communication protocols. It is possible, for example, to create a model that is functionally equivalent to OCP (Open Core Protocol) [httd] without modeling all the structures normally needed by the protocol [PPB02]. Such a model would be fast to simulate and would still reflect reality in terms of clock cycles (cycle true).
Having several channel models to choose from could also allow exploration of the system interconnect.
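As a rough illustration of why interchangeable channel models help exploration, the plain C++ sketch below writes a producer once against an abstract channel and runs it unchanged on an untimed model and on a fixed-latency model. It is deliberately not SystemC and not the SPACE API: Channel, UntimedChannel, FixedLatencyChannel and run_producer are hypothetical names, and any latency or period passed to them is an arbitrary illustrative value.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical channel interface (not the SPACE API): modules only see this.
class Channel {
public:
    virtual ~Channel() = default;
    // Delivers one 32-bit word and returns the simulated cycles it cost.
    virtual std::uint64_t write(std::int32_t word) = 0;
    const std::vector<std::int32_t>& received() const { return received_; }
protected:
    std::vector<std::int32_t> received_;
};

// Untimed model: transfers are instantaneous in simulated time.
class UntimedChannel : public Channel {
public:
    std::uint64_t write(std::int32_t word) override {
        received_.push_back(word);
        return 0;
    }
};

// Timed model: every transfer costs a fixed number of cycles.
class FixedLatencyChannel : public Channel {
public:
    explicit FixedLatencyChannel(std::uint64_t cycles_per_word)
        : cycles_per_word_(cycles_per_word) {}
    std::uint64_t write(std::int32_t word) override {
        received_.push_back(word);
        return cycles_per_word_;
    }
private:
    std::uint64_t cycles_per_word_;
};

// A producer written once against the interface. It emits `count` words,
// idling `period` cycles after each one, and returns the total simulated cycles.
std::uint64_t run_producer(Channel& channel, int count, std::uint64_t period) {
    std::uint64_t cycles = 0;
    for (int i = 0; i < count; ++i) {
        cycles += channel.write(i);  // communication cost depends on the model
        cycles += period;            // periodic generation, in the spirit of wait()
    }
    return cycles;
}
```

With both models the producer delivers the same data; only the cycle count changes. The untimed model still reports a non-zero total because of the periodic idle time, which mirrors the non-zero simulated time observed with the UTF Channel.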
References
[BB03] M. Besana and M. Borgatti. Application Mapping to a Hardware Platform through Automated Code Generation Targeting a RTOS: A Design Case Study. Proceedings of the Design, Automation and Test in Europe Conference and Exhibition (DATE03), Design Forum, pages 41–44, March 2003.
[BBB+ 02] L. Benini, D. Bertozzi, D. Bruni, N. Drago, F. Fummi, and M. Poncino. Legacy SystemC Co-Simulation of Multi-Processor Systems-on-Chip. ICCD, 2002.
[CBG+ 02] W. Cesario, A. Baghdadi, L. Gauthier, D. Lyonnard, G. Nicolescu, Y. Paviot, S. Yoo, A. A. Jerraya, and M. Diaz-Nava. Component-Based Design Approach for Multicore SoCs. ASP-DAC Yokohama, 2002.
[Gal95] B. O. Gallmeister. Programming for the Real World, POSIX.4. O’Reilly, 1995.
[GLMS02] T. Grötker, S. Liao, G. Martin, and S. Swan. System Design with SystemC. Kluwer Academic Publishers, 2002.
[GYG03] A. Gerstlauer, H. Yu, and D. Gajski. RTOS Modeling for System Level Design. Proceedings of the Design, Automation and Test in Europe Conference and Exhibition (DATE03), pages 130–135, March 2003.
[GZD+ 00] D. D. Gajski, J. Zhu, R. Dömer, A. Gerstlauer, et al. SpecC: Specification Language and Methodology. Norwell, MA: Kluwer Academic Publishers, 2000.
[ha] http://www-asim.lip6.fr/disydent/.
[HLF02] S. Holloway, D. Long, and A. Fitch. From algorithm to SoC with SystemC and CoCentric System Studio. Synopsys Users Group, 2002.
[HLWW02] A. Haverinen, M. Leclercq, N. Weyrich, and D. Wingard. White Paper for SystemC based SoC Communication Modeling for the OCP Protocol. 2002.
[HPSV02] F. Herrera, H. Posadas, P. Sánchez, and E. Villar. Systematic Embedded Software Generation from SystemC. Proceedings of the Design, Automation and Test in Europe Conference and Exhibition (DATE03), pages 142–147, 2002.
[htta] http://sources.redhat.com/gdb/.
[httb] http://www.arm.com.
[httc] http://www.gnu.org.
[httd] http://www.ocpip.org.
[htte] http://www.qnx.com.
[httf] http://www.windriver.com.
[Lab02] Jean J. Labrosse. MicroC/OS-II, The Real-Time Kernel, Second Edition. CMP Books, 2002.
[ORHK02] I. Oussorov, W. Raab, U. Hachmann, and A. Kravtsov. Integration of Instruction Set Simulators into SystemC High Level Models. Euromicro Symposium DSD, 2002.
[OSC02] OSCI. SystemC Version 2.0.1 User’s Guide. http://www.systemc.org, 2002.
[PPB02] P. G. Paulin, C. Pilkington, and E. Bensoudane. StepNP: A System-Level Exploration Platform for Network Processors. Pages 17–26, Nov.-Dec. 2002.
[SG00] L. Semeria and A. Ghosh. Methodology for Hardware/Software Co-verification in C/C++. ASP-DAC Yokohama, 2000.
[YJ03] S. Yoo and A. A. Jerraya. Introduction to Hardware Abstraction Layers for SoC. System-Level Synthesis Group (DATE03), 2003.