2011 International Conference on Circuits, System and Simulation IPCSIT vol.7 (2011) © (2011) IACSIT Press, Singapore

Development of Heterogeneous Multi-core Embedded Platform for Automotive Applications

Ting-Ying Wei, Zhi-Liang Qiu, Chung-Ping Young+ and Da-Wei Chang

Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan

Abstract. Car electronics dominate the functionality of a modern automobile. To meet the requirements of low cost, high performance, compact size and versatile operation, multi-core system-on-chip (SoC) embedded systems play an important role in system architecture and development. Because embedded systems issue a variety of control and computation requests, a heterogeneous multi-core processor is well suited to serving different types of computational tasks. However, such an architecture makes software development more complex and challenging. Dual-kernel embedded software was developed for PAC Duo, which consists of one ARM processor and two PAC DSPs developed by the Industrial Technology Research Institute (ITRI), Taiwan, with Linux on the ARM processor and μC/OS-II on the PAC DSPs. An inter-processor communication (IPC) mechanism, which takes advantage of hardware features, was developed to interconnect the heterogeneous cores. Real-time process migration between the DSPs was realized to improve load balancing. The resulting heterogeneous multi-core system software is therefore suitable for automotive applications.

Keywords: Automotive, heterogeneous multi-kernel, system-on-chip, PAC

1. Introduction

A modern automobile is designed to be safer, more convenient and more comfortable than ever, and car electronics play an important role in automobile development. Thanks to advances in integrated circuit technology, an electronic control unit (ECU) is not only smaller and less power-hungry, but also provides more complicated functionality and greater data processing capability. To meet the increasing demand for telematics and infotainment in a vehicle, the processor of an ECU needs more computing power for computation-intensive applications. Moreover, all the ECUs in an automobile are connected through in-vehicle networks, such as the CAN, LIN, or FlexRay bus, to form a distributed system. Since as many as 70 ECUs are embedded in a vehicle [1], [2], this conventional approach implies higher cost: it uses more components, occupies more space, and requires robust interconnection among the ECUs.

Multi-core system-on-chip (SoC) is a popular solution for the development of advanced computers and embedded systems. It features parallel processing, compact chip size and lower power consumption. When a multi-core SoC is employed for automotive applications, the system is transformed from a federated architecture into an integrated architecture [3]. Performance improves, the number of processors is reduced, and the on-chip communication is more reliable.

Multi-core processors can be categorized into two types: homogeneous and heterogeneous. Symmetric multiprocessor (SMP) operating systems are usually implemented on homogeneous multi-core processors for high-performance cluster computing. Heterogeneous multi-core processors, on the other hand, consist of diverse cores dedicated to specific applications and are better suited to embedded systems. Texas Instruments OMAP [4] and DaVinci [5], and the Industrial Technology Research Institute (ITRI) Parallel Architecture Core (PAC) [6] are examples; each comprises an ARM-based general purpose processor (GPP) and at least one digital signal processor (DSP). However, the complexity of embedded software development grows because of the variety of hardware abstractions, system software, and inter-processor communication (IPC) involved.

A heterogeneous multi-core aware embedded software platform was customized and implemented on the ITRI PAC for a variety of automotive applications, with Linux on the ARM processor and μC/OS-II on the PAC DSPs. The IPC mechanism, which takes advantage of hardware features, was designed to interconnect the ARM and the DSPs [7]. Real-time process migration between the DSPs was realized to improve load balancing and fault tolerance.

+ Corresponding author. Tel.: +886-6-2757575 x62533; fax: +886-6-2747076. E-mail address: [email protected].

2. System Architecture

To develop the heterogeneous multi-core embedded software and applications, the ITRI PAC Duo platform, connected to other ECUs via a CAN bus, is used for the implementation.

2.1. PAC Duo SoC

PAC Duo is a chip-level heterogeneous multiprocessor SoC composed of an ARM926EJ-S and two PAC DSP cores of the same architecture [8]. The ARM926EJ-S serves as the GPP, while the two DSPs can be treated as special purpose processors (SPPs) that cooperate with the GPP. The PAC DSP is a five-way VLIW DSP core that includes a scalar unit, two load/store units, and two arithmetic units. It uses a distributed register file with low access latency and power consumption, and it uses variable-length operation encoding to increase code density [9], [10]. Each DSP core in PAC Duo has 64 KB of local memory and resides on the 32-bit AXI bus. Because the ARM processor resides on the 32-bit AHB bus, the DSPs communicate with it through an AXI-AHB bridge.

PAC Duo supports inter-processor communication at the hardware level through hardware mailboxes or shared memory [8]. The former is interrupt-driven and allows a processor to send an interrupt to another processor for event notification. The latter allows the processors to share data or state. There are four banks of shared memory on the platform. Two banks (128 KB SRAM and 128 MB DDR2 DRAM) reside on the AXI bus, while the other two (256 KB SRAM and 128 MB SDRAM) reside on the AHB bus [8].
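The division of labour between the two hardware IPC paths can be illustrated with a small model: the payload is placed in shared memory, and the mailbox carries only a notification that tells the receiver where to look. The sketch below is a Python simulation written for illustration, not the PAC Duo driver code; the class names, the `(offset, length)` mailbox word, and the buffer size are all assumptions.

```python
class SharedMemory:
    """A byte-addressable bank visible to every core (stands in for a shared SRAM/DRAM bank)."""

    def __init__(self, size):
        self.data = bytearray(size)

    def write(self, offset, payload):
        self.data[offset:offset + len(payload)] = payload

    def read(self, offset, length):
        return bytes(self.data[offset:offset + length])


class Mailbox:
    """Models an interrupt-driven mailbox: posting a word runs the receiver's handler."""

    def __init__(self):
        self.handlers = {}  # core name -> interrupt handler (hypothetical registry)

    def register(self, core, handler):
        self.handlers[core] = handler

    def post(self, target, word):
        # On real hardware this would raise an interrupt on the target core.
        self.handlers[target](word)


def send_message(mailbox, shm, target, offset, payload):
    """Place the payload in shared memory, then notify the target with (offset, length)."""
    shm.write(offset, payload)
    mailbox.post(target, (offset, len(payload)))


shm = SharedMemory(1024)
mailbox = Mailbox()
received = []

# DSP1's handler reads the payload that the mailbox word points to.
mailbox.register("DSP1", lambda word: received.append(shm.read(*word)))
send_message(mailbox, shm, "DSP1", 64, b"sensor frame")
print(received)
```

Keeping the mailbox word small and passing bulk data through shared memory is one plausible way to combine the two mechanisms; the actual message layout used on PAC Duo is not given in this paper.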

2.2. Software Implementation

Fig. 1 depicts the software architecture of the heterogeneous multi-kernel embedded software on the PAC Duo platform. A Linux kernel runs on the ARM core, which acts as the master processor of the system, and handles I/O control and system management. The real-time kernel μC/OS-II runs on DSP1, DSP2, or both DSPs, and is used mainly for real-time signal processing. This multi-kernel architecture allows an application to be programmed as a real-time task on μC/OS-II or as a non-real-time task on Linux. The kernels communicate through an IPC mechanism intended to support efficient communication and data sharing among the software running on these cores. The IPC implementation is split into two parts, one integrated into each kernel.

Fig. 1: Software architecture of the multi-kernel embedded system on the PAC multi-core platform.

After the three major modules (the Linux kernel, the μC/OS-II kernel and the IPC) are installed, several aspects need to be considered to enhance system performance and stability.

1) Real-time scheduling

μC/OS-II is a preemptive kernel, so the processor always executes the highest-priority task that is ready to run [11]. The tasks to be executed on μC/OS-II are compiled together with the kernel off-line, so each task's execution time, resources, dependencies and time constraints must be known before priorities are selected. However, this static task assignment may not work well for automotive applications, in which tasks are assigned dynamically as the driving environment changes. A dynamic global scheduling policy is therefore required, so that the dispatcher on the ARM can dispatch processes to the DSPs according to a predefined algorithm or a user's policy. This global multi-core scheduling selects the highest-priority task from the global ready queue and assigns it to a free processor.

The state diagram of global scheduling involves five states: NEW, READY, RUNNING, WAITING and TERMINATED. When a task is initiated, it enters the READY state and is placed in the ready queue. The scheduler picks a task from the ready queue and moves it to the RUNNING state, at which point the task is assigned to a processor. If a running task is interrupted, it is put back in the ready queue. Since the chip contains two identical DSP cores, a task can be freely dispatched to whichever DSP can run it more efficiently. The system software thus provides more flexible process management and achieves higher throughput by migrating a waiting process from an overloaded core to an idle core.

2) Load balancing

Load balancing is managed by the task manager on the ARM, which detects when a DSP core is overloaded and initiates a process migration. In the PAC Duo architecture, μC/OS-II runs on each PAC DSP, and the shared memory, both the DDR2 memory and the DSP local memory, can be accessed from either DSP. Several μC/OS-II system calls that support process migration between the DSPs were developed.
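The global dispatch policy and the overload probe described above can be modelled in a few lines. The following is a Python simulation under stated assumptions, not the dispatcher implemented on the ARM: the task names, the `threshold` parameter, and the choice to migrate the last waiting task are hypothetical. Lower numbers denote higher priority, following the μC/OS-II convention.

```python
import heapq


class GlobalScheduler:
    """One global ready queue feeding several cores; lower number = higher priority."""

    def __init__(self, cores):
        self.ready = []                                 # heap of (priority, task)
        self.running = {core: None for core in cores}   # core -> running task or None
        self.priority = {}                              # task -> priority
        self.state = {}                                 # task -> state name

    def admit(self, task, priority):
        self.priority[task] = priority
        self._make_ready(task)

    def _make_ready(self, task):
        self.state[task] = "READY"
        heapq.heappush(self.ready, (self.priority[task], task))

    def dispatch(self):
        """Assign the highest-priority ready tasks to free cores."""
        assigned = []
        for core in self.running:
            if self.running[core] is None and self.ready:
                _, task = heapq.heappop(self.ready)
                self.running[core] = task
                self.state[task] = "RUNNING"
                assigned.append((core, task))
        return assigned

    def interrupt(self, core):
        """An interrupted task returns to the ready queue."""
        task = self.running[core]
        self.running[core] = None
        self._make_ready(task)


def pick_migration(waiting, threshold=2):
    """Return (source, target, task) when one core is overloaded and another is idle.

    `waiting` maps a core to its list of waiting tasks; `threshold` is an assumed limit.
    """
    idle = [core for core, queue in waiting.items() if not queue]
    busy = [core for core, queue in waiting.items() if len(queue) > threshold]
    if idle and busy:
        source = max(busy, key=lambda core: len(waiting[core]))
        return source, idle[0], waiting[source][-1]
    return None


sched = GlobalScheduler(["DSP1", "DSP2"])
sched.admit("lane_detect", 3)
sched.admit("radar_fusion", 1)
sched.admit("logging", 9)
first = sched.dispatch()
print(first)
```

Because both DSPs are identical, the model treats them as interchangeable and fills free cores in a fixed order; a real dispatcher could also weigh per-core load or data locality.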
When the process migration system calls of μC/OS-II are invoked, the mailbox is used to transfer the migration information. If a migration is requested, the source core saves the migration information and then sends the data structure to the target core. After the migrating process is frozen, the source core sends the migration request to the target core. If the migration is allowed, the target core starts executing the migrated process immediately. If not, the target processor returns a fail message to the source core. Figure 5 presents the common code and data section in shared memory being accessed by both cores during process migration. The following procedure, shown in Fig. 2, describes the process migration.

A. The ARM core sends a migration request (migrate process A to DSP2) to DSP1.
B. After DSP1 receives the migration request, it suspends process A, packs the migration information and sends the migration packet to DSP2.
C. When DSP2 receives the migration information, it examines its own state and reports the result to DSP1. If the migration is permitted, DSP2 sends a success message to DSP1. Otherwise, DSP2 sends a fail message to DSP1.
D. DSP1 receives the migration result. If the migration failed, DSP1 resumes the suspended process and sends a fail message to the ARM. If the migration succeeded, DSP1 sends a success message to the ARM.

Fig. 2: The procedure of task migration.
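Steps A to D of the migration procedure can be traced with a short simulation. This is a Python sketch under stated assumptions rather than the μC/OS-II system calls themselves: the `capacity` admission check, the `pc` and `stack` context fields, and the message strings are hypothetical, and a plain function call stands in for the mailbox transfers.

```python
class Dsp:
    """Minimal stand-in for one DSP running μC/OS-II."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity  # assumed admission limit, not taken from the paper
        self.running = {}         # process name -> saved context
        self.suspended = {}       # processes frozen while a migration is pending

    def can_accept(self):
        return len(self.running) < self.capacity


def migrate(process, source, target):
    """Run steps B to D and return the message the source DSP reports to the ARM."""
    # Step B: the source suspends the process and packs its migration information.
    context = source.running.pop(process)
    source.suspended[process] = context
    packet = {"process": process, "context": context}

    # Step C: the target examines its state and replies to the source.
    if target.can_accept():
        target.running[packet["process"]] = packet["context"]
        reply = "success"
    else:
        reply = "fail"

    # Step D: the source handles the reply; on failure it resumes the process.
    saved = source.suspended.pop(process)
    if reply == "fail":
        source.running[process] = saved
    return reply


dsp1 = Dsp("DSP1", capacity=2)
dsp2 = Dsp("DSP2", capacity=1)
dsp1.running["A"] = {"pc": 0x100, "stack": [1, 2, 3]}

# Step A: the ARM asks DSP1 to migrate process A to DSP2.
result = migrate("A", dsp1, dsp2)
print(result, sorted(dsp2.running))
```

A second request that targets the now full DSP2 returns "fail" and leaves the process running on DSP1, which mirrors the failure branch of step D.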

3) Fault tolerance

For automotive applications, robustness is one of the most important design metrics. A single failed component can bring down the whole system. The DSP is a core component that processes all incoming digital signals, so dual DSPs not only increase the computation capability, but also provide redundancy and a fast recovery mechanism. A watchdog timer is started, and its service routine checks the hardware status and the software flow. When one DSP is diagnosed as failed or its program hangs, all processes assigned to that DSP are migrated to the other one. A task manager on the ARM side is responsible for task dispatch and process migration between the DSPs. This mechanism records the process states at each checkpoint and, before a process is migrated, restores the states to the designated checkpoint. This avoids a system reset and data loss, and shortens the recovery time.
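The watchdog-driven failover with checkpoint restore can be sketched as follows. This is a Python model for illustration only: time is passed in as a plain number instead of being read from a hardware timer, and the `timeout` value, the `frame` state field, and the rule of moving everything to the first healthy core are assumptions.

```python
class Core:
    """One DSP as seen by the task manager on the ARM."""

    def __init__(self, name):
        self.name = name
        self.processes = {}    # process name -> live state
        self.checkpoints = {}  # process name -> state at the last checkpoint
        self.last_kick = 0.0   # time of the most recent watchdog kick

    def checkpoint(self):
        # Copy each state so later progress does not alter the checkpoint.
        self.checkpoints = {name: dict(state) for name, state in self.processes.items()}

    def kick(self, now):
        self.last_kick = now


def watchdog_check(cores, now, timeout):
    """Move the processes of every timed-out core to a healthy core, restored from checkpoints."""
    healthy = [core for core in cores if now - core.last_kick <= timeout]
    failed = [core for core in cores if now - core.last_kick > timeout]
    moved = []
    for bad in failed:
        if not healthy:
            break  # no core left to take over
        target = healthy[0]
        for name, state in bad.checkpoints.items():
            target.processes[name] = dict(state)  # resume from the checkpointed state
            moved.append((name, bad.name, target.name))
        bad.processes.clear()
        bad.checkpoints.clear()
    return moved


dsp1, dsp2 = Core("DSP1"), Core("DSP2")
dsp1.processes["fusion"] = {"frame": 10}
dsp1.checkpoint()
dsp1.processes["fusion"]["frame"] = 13  # progress made after the checkpoint

dsp2.kick(now=5.0)  # DSP2 is alive; DSP1 last kicked at t = 0
moved = watchdog_check([dsp1, dsp2], now=5.0, timeout=2.0)
print(moved, dsp2.processes)
```

In this model the migrated process resumes from frame 10, so work done after the last checkpoint is repeated rather than lost, at the cost of re-executing those frames.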

3. Automotive Applications and System Implementation

Vehicle safety is the primary concern in automotive applications. Active safety takes precautions in advance to protect the driver, the passengers, and the vehicle from a possible accident. Collision avoidance is one of the vital features in some luxury cars. Infotainment, on the other hand, is not a safety-critical feature, but it provides convenience and comfort to the passengers and adds value to a vehicle. A collision avoidance function determines the safe approaching speed to the car in front by detecting the relative speed and distance through one or several types of sensors, such as video, laser, or ultrasound, and converting these parameters into a single collision avoidance index. Fig. 3 shows the block diagram of a multi-function telematics application developed on a PAC Duo platform. While the DSPs handle the multi-sensor data fusion, the ARM manages the user interface and communication. The Linux kernel on the ARM, which includes an Internet protocol stack and a 3G modem device driver, provides pervasive networking capability to applications. Fig. 3 also shows a telematics application that transmits dynamic vehicle driving information to a server in the service center through 3G communication and the Internet.

Fig. 3: A telematics application integrating sensors and 3G modem.
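The paper does not define how relative speed and distance are converted into the collision avoidance index. As one hedged illustration, the sketch below uses time-to-collision (TTC), a widely used measure, and maps it to a value between 0 and 1. The function names, the linear mapping, and the 4-second limit are assumptions made for this example, not the method used on PAC Duo.

```python
import math


def time_to_collision(distance_m, closing_speed_mps):
    """Seconds until contact at the current closing speed; infinite if the gap is not shrinking."""
    if closing_speed_mps <= 0:
        return math.inf
    return distance_m / closing_speed_mps


def collision_index(distance_m, closing_speed_mps, ttc_limit_s=4.0):
    """Map TTC to [0, 1]: 0 means no threat, 1 means contact is imminent.

    `ttc_limit_s` is an assumed tuning value, not taken from the paper.
    """
    ttc = time_to_collision(distance_m, closing_speed_mps)
    return max(0.0, min(1.0, 1.0 - ttc / ttc_limit_s))


# 20 m gap closing at 10 m/s gives a TTC of 2 s, halfway to the assumed limit.
print(collision_index(20.0, 10.0))
```

A fused index in practice would likely weigh several sensors and the vehicle's own speed; this example shows only how two measured parameters can be reduced to a single number.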

Fig. 4 demonstrates the system realization of PAC Duo platform connecting with a variety of sensors through CAN bus for automotive application.


Fig. 4: System realization of PAC Duo platform for automotive applications.

4. Conclusion

Heterogeneous multi-core SoCs have become popular for a variety of embedded systems, but multi-kernel embedded software development remains complicated. This work presents the system software implemented on a heterogeneous multi-core SoC platform, PAC Duo, and describes its relevant automotive applications. To improve performance and reliability, the system software is enhanced for load balancing, fault tolerance and power management. The PAC Duo platform should be suitable for the development of future automotive applications.

5. Acknowledgements

This work was supported in part by the National Science Council of Taiwan under Grants NSC-98-2220E-006-019 and NSC-99-2220-E-006-009.

6. References

[1] N. Navet and F. Simonot-Lion, Automotive Embedded Systems Handbook, Taylor & Francis, 2009.
[2] N. Navet, A. Monot, B. Bavoux, and F. Simonot-Lion, “Multi-source and multicore automotive ECUs - OS protection mechanisms and scheduling,” 2010.
[3] H. Kopetz, R. Obermaisser, C. El Salloum, and B. Huber, “Automotive Software Development for a Multi-Core System-on-a-Chip,” Fourth International Workshop on Software Engineering for Automotive Systems (SEAS'07), 2007.
[4] Texas Instruments, OMAP5910 Dual-Core Processor (Rev. D), August 2004.
[5] Spectrum Digital Inc., DaVinci EVM Reference Manual, 2007.
[6] Industrial Technology Research Institute, PACDSP3S0001-Processor Architecture, 2008.
[7] D.-W. Chang, et al., “Building Multi-Kernel Embedded System on PAC Multi-Core Platform,” Proc. WESQA, 2010.
[8] Industrial Technology Research Institute (ITRI), PAC DUO Programming’s Reference, 2009.
[9] T.-J. Lin, C.-N. Liu, S.-Y. Tseng, Y.-H. Chu and A.-Y. Wu, “Overview of ITRI PAC project - from VLIW DSP processor to multicore computing platform,” Proc. IEEE International Symposium on VLSI Design, Automation and Test (VLSI-DAT 2008), pp. 188-191, 2008.
[10] C.-W. Chang, I.-T. Liao, S.-Y. Tseng and C.-W. Jen, “PAC DSP Core and Its Applications,” Proc. 2006 IEEE Asian Solid-State Circuits Conference (ASSCC 2006), pp. 19-22, Nov. 2006.
[11] J. Labrosse, MicroC/OS-II: The Real-Time Kernel, 1999.
