Government Open Binary Format of the Army Bulk Aviation Condition-Based Maintenance Files

David Noever [1], Tien Vu [1], Dennis Dunaway [1], Michelle Moorefield [1], and Gail Cruce [2]

[1] PeopleTec, Inc., 4901-I Corporate Drive NW, Huntsville, AL 35805, [email protected];
[2] US Army Aviation Engineering Directorate, Test & Integration/CBM Manager, Aviation Engineering Directorate (AED), RDMR-AE, Bldg. 4488, Redstone Arsenal, AL 35898-5000,
[email protected]

A Government open binary file format has been developed and tested for the generation, storage and translation of Army Bulk Condition-Based Maintenance (CBM) data. The format combines the best aspects of databases in a single compressed file. The processing benefits of binary file standards derive mainly from their inherent compactness and access speed. The file size reduction from text to binary is five- to ten-fold or more. For fetching a single table value, binary access speeds improve by 97 percent.

Introduction

Most Army data subscribers would prefer less verbose files that are easier to generate, parse and manipulate in code. At the 2012 DOD Sustainment Summit, 66% of those surveyed said vendors need to adopt open architecture and standards [1]. Most subscribers also want to maintain and extend existing software tools. Forty percent of those surveyed wanted license-free analysis tools [1]. The central challenge of a new government open binary format is how to translate an abstract data standard into the best file type for the Army’s particular needs, including optimized size, speed, and portability. Aerospace industry standards for binary data exist and are widely used, including many open- and closed-source formats. For example, the IRIG Chapter 10 format is an open aviation standard with large vendor and government support. The Army Bulk Condition-Based Maintenance (ABCD) file format is derived from the NASA Common Data Format (CDF), and conversion software is available for translating text or database formats into CDF [2].
Figure 1 Classification Sets for Common Aviation Binary/Text Formats

A subset of CDF was chosen for encoding parametric bus data from major vendors (Honeywell, Goodrich) and for major aircraft platforms (MH47G, AH64D, UH60M/A/L). The driving force behind creating this open binary format is illustrated by the comparison table of file sizes and access speeds.
Figure 2 Typical File Size and Access Speed Comparison for Binary/Text Formats

Common Data Format Features

CDF combines some of the best features of single files with a database-like structure and random value access [3, 4]. For example, entire columns can be accessed quickly without reading the file row by row. A CDF can be compressed using standard zip protocols, either as an entire file or for selected variables. CDF datasets can be exported, edited in place for single cell values, and merged into a single table from multiple flights. Statistics for the number of entries, header types and file integrity (MD5 checksum) can be accessed quickly without reading the entire file. The advantages of a portable and open standard further include wider data sharing. A CDF can be post-processed in popular statistics applications or with custom API software. Existing CDF tools can import CDF files into common mathematical programs such as MathWorks MATLAB, Interactive Data Language (IDL), Application Visualization System (AVS), FlexPro and Mathematica.

Custom Software

For writing and reading the government open binary format, converters for existing vendor file types, along with executables to convert CSV text tables, were written. The Honeywell bus data is encoded in binary format as *.XM or *.XMX files. The Goodrich parametric data is encoded in binary format as *.RDF or *.MUD files. Application Programming Interfaces (APIs) exist for reading, writing and manipulating CDF files in all major programming languages, including C/C++, C#, Java, Perl, Python, and Fortran. To convert between vendor file formats and CDF, both Java and Perl interfaces were written. An early development need was to convert the MH47G Honeywell VAR files to open binary and post-process them into Structural Usage Monitoring (SUMS) files.
This task was prioritized because the Army needed faster processes for handling Regime Recognition (RRA) and Damage Fraction Calculations (DFC) than were possible with large text CSV files. Software releases for next-generation Honeywell VMEP ground stations were tested with a pull-down menu option to export either text or government open binary files, and the outputs were compared to the expected MH47G regime and damage fraction.

Graphical User Interface (GUI) Demonstrations

A series of tests was run on sample bus and parametric data from both the Honeywell VMEP and Goodrich RDF file types. Across many aircraft and vendor platforms, the CDF files were 2 to 3 times smaller than the original binary and 5 times smaller than the text comma-separated-value (CSV) file.
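The size gap between text and binary encodings can be illustrated with a small Python sketch. It is not the CDF encoder and it uses synthetic data; the function names (`make_rows`, `as_csv_bytes`, `as_binary_bytes`), the two-column layout and the 32-bit float width are assumptions chosen for illustration, so the sizes it prints will not match the measured ratios reported here.

```python
import csv
import io
import math
import struct
import zlib


def make_rows(n=1000):
    """Synthetic (time, value) samples; hypothetical, not real flight data."""
    return [(i * 0.125, math.sin(i / 50.0) * 100.0) for i in range(n)]


def as_csv_bytes(rows):
    """Encode the rows as a CSV text table with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["time_s", "value"])
    writer.writerows(rows)
    return buf.getvalue().encode("ascii")


def as_binary_bytes(rows):
    """Encode each row as two little-endian 32-bit floats (8 bytes per row)."""
    return b"".join(struct.pack("<ff", t, v) for t, v in rows)


rows = make_rows()
text = as_csv_bytes(rows)
binary = as_binary_bytes(rows)
zipped = zlib.compress(binary)  # zlib stands in for "standard zip protocols"

print("CSV text bytes:     ", len(text))
print("raw binary bytes:   ", len(binary))
print("zipped binary bytes:", len(zipped))
```

The fixed-width binary row costs 8 bytes regardless of the values, whereas the text row grows with the number of printed digits and delimiters. Storing 32-bit floats also discards precision that the text form keeps, which is a design trade-off rather than a free saving.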
Figure 3 Typical File Size Comparison for Goodrich UH60M Binary/Text Formats

Custom data mining tools were written to dump entire CDFs or selected columns to text CSV files for comparison. Although column selection is a common analyst’s task, it can be cumbersome in CSV files.
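Column selection becomes simpler when each variable is stored contiguously. The sketch below is a minimal Python illustration of that idea, not the CDF on-disk layout or the export tools described here; the index structure, the function names and the variable names (`time_s`, `rotor_rpm`, `torque_pct`) are hypothetical.

```python
import io
import struct


def pack_columns(columns):
    """Store each column contiguously as little-endian float64 values.

    Returns (index, blob), where index maps a column name to its
    (byte offset, value count) inside blob.
    """
    index, parts, offset = {}, [], 0
    for name, values in columns.items():
        data = struct.pack(f"<{len(values)}d", *values)
        index[name] = (offset, len(values))
        parts.append(data)
        offset += len(data)
    return index, b"".join(parts)


def read_column(stream, index, name):
    """Seek straight to one column and decode only its bytes."""
    offset, count = index[name]
    stream.seek(offset)
    return list(struct.unpack(f"<{count}d", stream.read(8 * count)))


def export_csv(stream, index, names):
    """Write only the selected columns as CSV text."""
    cols = [read_column(stream, index, name) for name in names]
    lines = [",".join(names)]
    for row in zip(*cols):
        lines.append(",".join(repr(v) for v in row))
    return "\n".join(lines) + "\n"


columns = {
    "time_s": [0.0, 0.5, 1.0],
    "rotor_rpm": [258.0, 257.5, 258.2],
    "torque_pct": [61.0, 62.5, 60.8],
}
index, blob = pack_columns(columns)
print(export_csv(io.BytesIO(blob), index, ["time_s", "torque_pct"]), end="")
```

Because the index records where each column starts, the unselected `rotor_rpm` bytes are never decoded. A row-oriented CSV reader would have to parse every field of every row to extract the same two columns.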
Figure 4 Example CSV Export of Selected Variables from CDF

Custom strip charts were written to display user-selected variables in a graphical plot against time. The input file for the charts is a binary CDF, either compressed or uncompressed. When the CDF is loaded, the software automatically validates the file integrity by displaying the checksum embedded in each file. Multiple charts can be displayed in the same window. Each chart is individually zoomable, auto-scaled, and scrollable. When the user moves the mouse over graphical points, a pop-up tool tip window shows the underlying data values.
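An embedded-checksum integrity check on load might look like the following Python sketch. It assumes a hypothetical layout in which a 16-byte MD5 digest is appended to the payload; this is not the CDF file structure, and the function names `seal` and `verify` are illustrative only.

```python
import hashlib

DIGEST_LEN = 16  # size of an MD5 digest in bytes


def seal(payload: bytes) -> bytes:
    """Append the MD5 digest of the payload (hypothetical trailer layout)."""
    return payload + hashlib.md5(payload).digest()


def verify(blob: bytes):
    """Recompute the digest and compare it with the embedded one.

    Returns (is_valid, embedded_digest_hex) so that a viewer can both
    check the file and display the stored checksum.
    """
    if len(blob) < DIGEST_LEN:
        raise ValueError("blob is too short to contain a digest")
    payload, stored = blob[:-DIGEST_LEN], blob[-DIGEST_LEN:]
    computed = hashlib.md5(payload).digest()
    return computed == stored, stored.hex()


blob = seal(b"example parametric data")
ok, digest_hex = verify(blob)
print("valid:", ok, "md5:", digest_hex)

# Changing a single payload byte makes the check fail.
tampered = b"X" + blob[1:]
print("valid after tampering:", verify(tampered)[0])
```

MD5 is used here only because it is the checksum named in the text; it detects accidental corruption during storage or transmission but is not a safeguard against deliberate modification.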
Figure 5 Strip Charts with Graphical Interface and CDF Input Files

Common Data Format (CDF) is a promising file format and tool set for broad use within the aviation community. Its integration with both existing and new tools makes CDF worth further examination and development. This paper has outlined and demonstrated an ABCD-compliant format, primarily for encoding aircraft bus data from various vendors. Custom readers were created to batch-process large numbers of aviation RDF and VAR files as inputs and convert the parametric data to both text CSV and binary CDF files for comparison. Custom CDF writers were created to select, merge, edit and validate 1) individual cell values, 2) single and multiple columns and 3) single and multiple CDF files. After binary encoding, the data were displayed as strip charts and repurposed for health maintenance applications, including structural usage monitoring and built-in data integrity validation. An interface control document (ICD) [5] and a Government Open Binary Format Draft Guide [6] have been authored to describe the requirements for the aviation community to employ this powerful software format and to deliver more robust data handling capabilities.
References and Notes

1. 2011 Survey during DOD Maintenance and Sustainment Summit, Penn St Applied Research Laboratory brief to the 2012 DOD Maintenance and Sustainment Summit, 27 Feb 2012, “Review and Recommendations to Address the Primary Barriers and Challenges Facing DoD-Wide Implementation of CBM+”.
2. “Common Data Format Specification”, NASA Goddard Space Flight Center, http://nssdc.gsfc.nasa.gov/cdf/html/FAQ.html
3. Typical aviation text files offer a human-readable format, such as with comma-separated values in rows and columns. Such tables require a parser and a common delimiter. They provide no standard for determining the correct end-of-file during transmission or storage. They are highly compressible when zipped into a binary format and thus include redundant information that slows down transmission. As described in Honeywell Document 1837-0002-ICD, Interface Control Document For the Exceedance Monitor File Converter, Rev 2, 4/13/2011: “The format for the header, and the frame formats ... allows a wide variety of export formats to be supported. Data would typically be exported in a binary format, as it greatly decreases file size and processing time.”
4. Since typical CBM data involve different sampling rates, missing data are a key characteristic in a sparse matrix or table where time stamps represent the first column. The CDF file format treats missing data with configurable ‘pad values’.
5. “Interface Control Document (ICD) for Conversion of Exceedance Monitor (XM) File to Government Open Binary Format (OBF) and Structural Usage Monitoring (SUMS) Inputs”, PeopleTec/U.S. Army Aviation Engineering Directorate, Redstone Arsenal, AL 35898-5260, AED-OBF, Rev 1.0, 3 Feb 2012.
6. “Draft Guide for the Open Binary Format of the Army Bulk Aviation Condition-Based Maintenance Files”, PeopleTec/U.S. Army Aviation Engineering Directorate, Redstone Arsenal, AL 35898-5260, AVN-OBF, Rev 1.0, 21 Oct 2011.