USE OF MULTICORE PROCESSORS IN AVIONICS SYSTEMS AND ITS POTENTIAL IMPACT ON IMPLEMENTATION AND CERTIFICATION

Larry M. Kinnan, Wind River, Stow, Ohio
Abstract

With the wide availability of multiple core (multicore) processors, their reduced space, weight and power (SWaP) properties make them extremely attractive for use in avionics systems. To implement a solution on a multicore platform, however, the developer will confront numerous implementation and certification issues that are not present in unicore or discrete multiple processor implementations. These issues involve both the hardware and software aspects of certification and the interoperation of the two. This paper provides guidance to the developer on the issues that must be addressed, from both a hardware and a software perspective, in order to understand the potential and limitations of multicore solutions.
Current multiple uniprocessor implementation

To achieve system reliability requirements, most modern avionics systems employ fault tolerant designs. This is true for both military and commercial aircraft systems, although military systems typically do not obtain DO-178B/DO-254 (ED-12B/ED-80) certification for their aircraft [2][3]; military programs are typically compliant rather than actually certified. Fault tolerance is provided by a number of methods, including redundant computing elements, multiple communications paths and other techniques [4]. Avionics computer platforms now typically include multiple discrete processors in duplex, triplex or quad arrangements forming a redundant set, allowing redundant computation steps that are then compared either on a real time basis or at pre-defined synchronization points in the computational timeline. Each processor has its own memory and cache systems, completely isolated from the other processors contained in the module (typically these modules are single printed circuit cards containing the processors). This provides a significant fault isolation mechanism that prevents unintended interactions at the processor level.

These processors are interconnected by some type of hardware interconnect fabric, typically a Field Programmable Gate Array (FPGA), to allow their operation to be synchronized either in real time or at pre-defined synchronization points in the computation timeline. In these multiple processor systems the results are compared and, should a discrepancy occur, the majority of processors that agree are deemed correct. The processor in disagreement is voted out of the set and remedial action is taken, typically a reset or restart of that processor, to determine whether the error was a single incident (event) or a more fundamental failure. Should the failure prove not to be a single point event, the processor is effectively removed from system operation and the aircraft continues to operate safely. Pertinent information is also logged to non-volatile storage to allow for later maintenance once the aircraft is on the ground. One such notional system, implementing a dual processor arrangement, is shown in Figure 1 [5].

DO-254/ED-80 hardware certification relies upon what is referred to as service experience to augment design assurance when using commercial, off the shelf (COTS) components such as the processors discussed above [1][3]. This allows the developer and certification authority to acknowledge industry life cycle experience for parts such as processors, FPGAs, ASICs, etc. The developer may take credit for this service experience (history) so long as it is clearly identified up front in the Plan for Hardware Aspects of Certification (PHAC) and documented in the Hardware Accomplishment Summary (HAS), as mandated by DO-254/ED-80 [2]. The credit that service experience provides for these COTS components during certification will be critical to our discussion of multicore processors and their use in avionics systems.
Figure 1. Notional dual processor avionics compute element (two CPUs, each with private RAM, cross-connected through a comparison unit that enables each CPU's interface unit onto redundant buses A and B)
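To make the voting step concrete, the following is a minimal sketch, in C, of the compare-and-vote logic for a triplex set. In the notional system of Figure 1 this comparison is performed in hardware by the comparison unit; the code, its names and its structure are illustrative assumptions only, not any particular program's implementation.

```c
#include <stdint.h>
#include <stdbool.h>

/* One redundant computing lane at a synchronization point.
 * All names are hypothetical. */
typedef struct {
    uint32_t result;   /* computed output for this frame  */
    bool     healthy;  /* is this lane still in the set?  */
} lane_t;

/* Vote across three redundant lanes; returns true when a majority
 * agrees and writes the agreed value through *out.  A lane that
 * disagrees is flagged unhealthy so remedial action (reset or
 * restart) can be taken and the event logged to non-volatile
 * storage for later maintenance. */
static bool vote_triplex(lane_t lane[3], uint32_t *out)
{
    for (int i = 0; i < 3; i++) {
        int j = (i + 1) % 3;
        if (lane[i].healthy && lane[j].healthy &&
            lane[i].result == lane[j].result) {
            int k = (i + 2) % 3;          /* the remaining lane     */
            if (lane[k].result != lane[i].result)
                lane[k].healthy = false;  /* vote it out of the set */
            *out = lane[i].result;
            return true;
        }
    }
    return false;  /* no majority: escalate to system-level recovery */
}
```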
Multicore processors

The drive toward multicore solutions has been primarily due to the physical limits of semiconductor electronics, including heat dissipation and computational capacity; these limits place a ceiling on how far the operating frequency can be increased. Multicore processors provide a method of increasing computational scale without the inherent limitations of multiple unicore processing elements, particularly in size, heat and power consumption (also known as SWaP – Space, Weight and Power). By using multiple, proven processing cores on a single underlying die, along with reduced lead runs, significant gains can be achieved in many areas of performance. The close proximity of the individual cores also allows the cache coherency circuitry to operate at much higher clock rates than is physically possible with discrete uniprocessors. While these advantages offer a compelling case for multicore solutions, there are some drawbacks: multicore processors are typically more expensive to manufacture and have lower production yields, which tend to drive costs upward. While these can be detriments to their use, the advantages to developers are clear, and the industry is continuing in this direction in order to avoid obsolescence.

To take advantage of multicore processors, design changes need to be made to the software running on them. Simply adding cores does not geometrically improve the performance of the software. To gain significant performance, the software must make use of thread level parallelism to use the multiple cores efficiently, as sketched below. Software can also benefit from multicore processors by running a virtual machine on each core, since the cores can run independently and execute in parallel. This virtual machine concept also allows for partitioning of functions, which can help achieve safety through isolation of those functions.
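As a minimal illustration of thread level parallelism, the sketch below splits a data-parallel loop across cores using POSIX threads. A deployed avionics application would use its RTOS tasking API rather than raw pthreads, and all names here (worker, process_parallel, NUM_CORES) are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_CORES 4
#define N         4096

static double in[N], out[N];

typedef struct { size_t first, last; } slice_t;

/* Worker: process one contiguous slice of the data set. */
static void *worker(void *arg)
{
    slice_t *s = arg;
    for (size_t i = s->first; i < s->last; i++)
        out[i] = in[i] * 2.0;   /* stand-in for real per-element work */
    return NULL;
}

/* Split the loop across NUM_CORES threads so the scheduler can run
 * one per core; purely sequential code gains nothing from the
 * additional cores. */
void process_parallel(void)
{
    pthread_t tid[NUM_CORES];
    slice_t   s[NUM_CORES];

    for (int c = 0; c < NUM_CORES; c++) {
        s[c].first = (size_t)c * (N / NUM_CORES);
        s[c].last  = s[c].first + (N / NUM_CORES);
        pthread_create(&tid[c], NULL, worker, &s[c]);
    }
    for (int c = 0; c < NUM_CORES; c++)
        pthread_join(tid[c], NULL);
}
```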
Multicore hardware certification issues

While most multicore processor designs use existing core designs arranged in two, four, eight, sixteen or more core arrangements, the ability to claim service experience for these processors is reduced considerably, since the implementation is new and has limited service history, especially in the area of avionics computational elements requiring certification. This, coupled with the conservative approach used by certification authorities, can lead to significant hurdles to hardware certification under DO-254/ED-80 guidelines and processes [1][3]. In addition, most semiconductor manufacturers view the aerospace market as ancillary to their primary markets of networking and consumer devices, since the aerospace volume for multicore chips is quite small by comparison. This has led to underlying design fabric choices in multicore chips that may make achieving hardware certification extremely difficult, or in some cases impossible. The following examples illustrate some, but not all, of these issues.
Shared cache certification issues

A significant number of multicore chip designs provide separate L1 instruction and data caches per core but share a common L2 cache between cores, as shown in the notional multicore chip layout of Figure 2. Use of the L2 cache is extremely important to performance, especially in the time and space partitioned operating systems used in Integrated Modular Avionics (IMA) core modules. Having both cores share a common L2 cache can create a number of undesirable effects and interactions. The first is the possibility that one core could block the other core's access to the L2 cache, which could lead to significant processing delays and hence non-determinism in the operation of the overall system. This is especially true if both cores are running the same software in a synchronized fashion. To avoid such issues, some multicore chips permit the L2 cache to be segregated so that each core has exclusive access to its own private segment of L2 cache. While this may avoid the contention issue, as well as the intermixing of data within the cache that leads to cache thrashing, it effectively reduces the L2 cache available to each core, thereby reducing software performance. Segregating the L2 cache also typically prevents use of the built-in hardware invalidate of the L2 cache, since all of the L2 cache, not just one core's segment, would be invalidated. This has significant performance impacts in a time and space partitioned operating system, since the caches must be flushed on each partition switch in order to maintain the low jitter margins required for determinism and hence certification; a sketch of this trade-off follows below. One positive note is that new multicore parts are being developed, and will be available shortly, that have completely separate and isolated L2 cache memory and control on a per core basis, which should eliminate this particular issue.
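The following sketch illustrates the partition-switch trade-off just described. The cache-control primitives l2_global_invalidate() and l2_flush_range() are hypothetical stand-ins for chip-specific board support package routines, not a real API.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical cache-control primitives; real names and semantics
 * are specific to each chip's L2 controller and BSP. */
extern void l2_global_invalidate(void);               /* whole L2   */
extern void l2_flush_range(uintptr_t base, size_t n); /* by address */

/* Sketch of the cache maintenance performed on each ARINC 653
 * partition switch.  Once a shared L2 is segregated per core, the
 * single hardware global invalidate can no longer be used, because
 * it would also destroy the other core's segment; the kernel must
 * instead flush only the outgoing partition's memory range, which
 * takes longer and adds jitter to the partition switch. */
void partition_switch_cache_maintenance(uintptr_t part_base,
                                        size_t    part_size,
                                        int       l2_is_private)
{
    if (l2_is_private)
        l2_global_invalidate();               /* fast, one operation */
    else
        l2_flush_range(part_base, part_size); /* slower, per range   */
}
```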
Shared peripherals coherency fabric certification issues

Most multicore chips, as well as a number of uniprocessor chips, are commonly referred to as System on Chip (SoC) devices, meaning that they contain not only computational core(s) but also additional specialized peripherals, such as Ethernet, serial and other I/O, as well as possibly other specialized elements available for use by any of the cores in the chip. To provide orderly access to these shared peripherals, such chips implement what is called a coherency fabric to arbitrate accesses. To assign access in a hierarchical fashion, the coherency fabric implements a priority scheme, typically set by default by the chip manufacturer. In most uses, such as networking or consumer devices, this pre-assigned priority is of little concern. But in a safety critical system destined for certification, such configurations cannot be ignored, since they may prevent one or more cores from gaining time critical access to a peripheral. In at least one major program, this led to access to a PCI-Express bus being blocked, a problem discovered during certification testing. In a number of cases, how to configure these priorities beyond the default settings is not clearly documented in the manufacturer's data sheets; a hedged sketch of such configuration appears after Figure 2. Some of the hesitancy about disclosing the internal registers of this coherency fabric relates to exposing details of the manufacturer's underlying intellectual property. This runs counter to the transparency required to achieve certification and places the certification authority and the manufacturer in conflict, since hardware certification is based on service history and direct access to the underlying implementation of the fabric used to communicate between the cores and the external world. It is clear that, to resolve this issue, semiconductor manufacturers will need to provide transparency in their designs sufficient to allow certification authorities the same level of confidence they now have with discrete,
multiprocessor designs using FPGA and ASIC components.

Figure 2. Notional dual core layout with shared L2 cache (two cores, each with private L1 instruction and data caches, sharing a common L2 cache through a coherency module)
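As a purely illustrative sketch of the configuration problem described above, the fragment below writes arbitration priorities into a hypothetical memory-mapped coherency-fabric register block. Every address, register name and priority encoding here is invented for illustration; real fabrics are chip specific and, as noted, often not publicly documented.

```c
#include <stdint.h>

/* Entirely hypothetical register layout -- actual coherency-fabric
 * priority registers are chip specific and frequently absent from
 * public data sheets. */
#define FABRIC_BASE       0xFFE00000u
#define FABRIC_PRIO(port) \
    (*(volatile uint32_t *)(FABRIC_BASE + 0x10u + 4u * (port)))

enum { PORT_CORE0 = 0, PORT_CORE1 = 1, PORT_ETH = 2, PORT_PCIE = 3 };

/* Sketch: raise the PCI-Express port's arbitration priority so a
 * time-critical transfer cannot be starved by the other masters.
 * In a certified system every such setting must be documented,
 * reviewed and verified -- silent reliance on manufacturer defaults
 * is exactly the problem described above. */
void fabric_set_priorities(void)
{
    FABRIC_PRIO(PORT_PCIE)  = 3;  /* highest */
    FABRIC_PRIO(PORT_CORE0) = 2;
    FABRIC_PRIO(PORT_CORE1) = 2;
    FABRIC_PRIO(PORT_ETH)   = 1;  /* lowest  */
}
```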
Shared memory controller certification issues

Similar to the shared cache issue, multicore processors typically use a single memory controller for RAM, which is then partitioned among the processor cores for run time operation. In many cases, access by one core may be blocked depending on the operations of another. The problem is aggravated when the software running on each core differs significantly in functionality, such as a dual core processor with one core running an ARINC 653 partitioned operating system and the second core running a simple operating system, with a linear address space, that implements an I/O offload engine. Even when running identical software loads, interaction at the memory controller level may prevent the processors from being synchronized as is typically done today. Without a hardware method of synchronizing the cores and their operation, synchronization is left to software, which can introduce schedule skid and jitter, resulting in non-deterministic behavior; a minimal sketch of such a software rendezvous follows. Such behavior would not be certifiable unless large jitter margins were provided for in the requirements. Figure 3 shows a notional implementation of such a multicore processor using two cores.
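The rendezvous mentioned above might look like the following minimal sketch: a shared-memory barrier built from C11 atomics. This is a generic textbook pattern, not any vendor's synchronization mechanism; the point is that the spin time each core experiences depends on the other core's progress and on contention at the shared memory controller, which is precisely the source of the skid and jitter at issue.

```c
#include <stdatomic.h>

#define NUM_CORES 2

/* Shared-memory rendezvous barrier across cores.  Each core calls
 * core_sync_barrier() at its synchronization point and spins until
 * every core has arrived. */
static atomic_uint arrived;
static atomic_uint generation;

void core_sync_barrier(void)
{
    unsigned gen = atomic_load(&generation);

    if (atomic_fetch_add(&arrived, 1) == NUM_CORES - 1) {
        /* Last core in: reset the count, then release the others. */
        atomic_store(&arrived, 0);
        atomic_fetch_add(&generation, 1);
    } else {
        /* Spin until the last core advances the generation. */
        while (atomic_load(&generation) == gen)
            ;  /* busy-wait; bounded only by the slowest core */
    }
}
```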
Additional factors with multicore processors

As discussed previously, the use of multiple processors provides redundancy and reliability in avionics systems. A subtle factor that may not be apparent with multicore processors is the loss of redundancy in clock signal feeds and power distribution. With multiple, discrete processors it is possible to have separate and isolated power feeds to each processor in the system, eliminating any potential for a single point failure causing catastrophic failure modes in the module. The same is true of the clock source being fed from multiple, synchronized sources, preventing single point failures. The same cannot be said of multicore designs, which typically do not have such redundant feeds for power and clocks. This can lead to single point failures disabling an entire module even though the failure should only have affected a single core. This again is an issue of transparency in the internal design and layout of the multiple cores and the underlying substrate from which they derive their clock and power feeds.

A final factor to consider is inter-processor (or, in this case, inter-core) communications. While shared memory is easily provided in the hardware design of most multicore chips, implementing a safe and robust communications mechanism over such shared memory is not as clear. This is especially true in the case of a time and
space partitioned system that uses the ARINC 653 specification [6] and hence ARINC ports for communications [5]. Care must be taken to ensure that the underlying shared memory implementation does not compromise the port communications in such a way as to invalidate their use; a sketch of the port-based approach follows Figure 3. Such a compromise can easily occur when the shared memory is not protected, so that each core has equal write access permissions to the memory space.

Figure 3. Notional dual core processor with shared memory controller (two cores, each with private L1 caches, sharing an L2 cache and, via the coherency module and system bus, a single SDRAM controller)
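For concreteness, the sketch below shows a partition publishing data through an ARINC 653 sampling port, the port mechanism cited above [5][6]. The APEX services CREATE_SAMPLING_PORT and WRITE_SAMPLING_MESSAGE are defined by ARINC 653 Part 1, but the header name, port name and message size here are assumptions that vary by RTOS vendor.

```c
/* Header name and exact type spellings vary by RTOS vendor; the
 * services themselves are defined by ARINC 653 Part 1 [6]. */
#include "apex.h"

#define REFRESH_PERIOD 20000000LL  /* 20 ms refresh, in nanoseconds */

static SAMPLING_PORT_ID_TYPE out_port;

/* Create a source sampling port during partition initialization,
 * i.e. while the partition is still in COLD_START/WARM_START mode.
 * The port name is hypothetical. */
void ports_init(void)
{
    RETURN_CODE_TYPE rc;

    CREATE_SAMPLING_PORT("AHRS_ATTITUDE_OUT",
                         32,              /* max message bytes */
                         SOURCE,
                         REFRESH_PERIOD,
                         &out_port,
                         &rc);
    /* A real partition would check rc and raise a health-monitor
     * event on failure. */
}

/* Publish one message.  The kernel owns the shared-memory buffer
 * behind the port, so a partition on the other core can obtain the
 * data only through READ_SAMPLING_MESSAGE -- it never receives raw
 * write access to the channel, which is the protection the
 * preceding paragraph calls for. */
void publish_attitude(const char msg[32])
{
    RETURN_CODE_TYPE rc;

    WRITE_SAMPLING_MESSAGE(out_port, (MESSAGE_ADDR_TYPE)msg, 32, &rc);
}
```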
Summary

While multicore processors offer a compelling attraction to system designers, especially in their reduced space, weight and power (SWaP) properties, it is clear that there are numerous issues related to the hardware design of the processors themselves, as well as to how they interact with software, that could complicate eventual certification of both the hardware and the software employing these devices. It is clear that semiconductor manufacturers must take safety into account in the implementation of these devices in order to allow their usage, and eventual certification, in avionics systems, even though the aerospace industry comprises only a small segment of their market for multicore chips.

References

[1] Vance Hilderman and Tony Baghai, 2007. "Avionics Certification", Avionics Communication, Inc., Leesburg, VA, USA.

[2] RTCA DO-178B, "Software Considerations in Airborne Systems and Equipment Certification", www.rtca.org.

[3] RTCA DO-254, "Design Assurance Guidelines for Airborne Electronic Hardware", www.rtca.org.

[4] Albert Helfrick, 2004. "Principles of Avionics", Avionics Communication, Inc., Leesburg, VA, USA.

[5] Cary R. Spitzer, 2007. "Avionics Development and Implementation", CRC Press, Taylor and Francis Group, Boca Raton, FL, USA.

[6] "Avionics Application Software Standard Interface, Part 1 – Required Services" (ARINC Specification 653-2), 2006. Aeronautical Radio, Inc., Annapolis, MD, USA.
Email Address
[email protected]
Conference Identification
28th Digital Avionics Systems Conference
October 25-29, 2009