Resources for Developing Algorithms and Processing Level 1A and Geolocation Information

Albert J. Fleig
PITA, NASA Goddard Space Flight Center, Code 922, Greenbelt, MD 20771
301-614-5498 / Fax 301-614-5269 / afleig@ltpmail.gsfc.nasa.gov

Jeffery J. Blanchette
General Sciences Corporation, 7501 Forbes Blvd. Suite 103, Seabrook, MD 20706
301-352-2112 / Fax 301-352-0143 / [email protected]

James A. Kuyper
General Sciences Corporation, 7501 Forbes Blvd. Suite 103, Seabrook, MD 20706
301-352-2150 / Fax 301-352-0143 / [email protected]

John M. Seaton
General Sciences Corporation, 7501 Forbes Blvd. Suite 103, Seabrook, MD 20706
301-352-2140 / Fax 301-352-0143 / seaton@ltpmail.gsfc.nasa.gov

Robert E. Wolfe
Raytheon ITSS, NASA Goddard Space Flight Center, Code 922, Greenbelt, MD 20771
301-614-5508 / Fax 301-614-5269 / [email protected]

Edward J. Masuoka
NASA Goddard Space Flight Center, Code 922, Greenbelt, MD 20771
301-614-5515 / Fax 301-614-5269 / [email protected]

0-7803-6359-0/00/$10.00 © 2000 IEEE

Abstract - Every satellite data system starts with some form of Level 1 and geolocation processing. This paper provides information to help future missions better estimate the personnel, schedule, and computing requirements for their data processing and algorithm development. The level of development effort and the computational resources for Level 1A and Geolocation processing for the MODIS instrument are described. This information is related to the requirements for the Level 1A and Geolocation tasks. Estimates of how both development and processing resource requirements scale to these functional requirements are provided.

INTRODUCTION
The MODIS Science Data Support Team (SDST) has developed algorithms to convert the raw input data (Level 0) into a form convenient for operational processing (Level 1A) and to precision geolocate the data. The Level 1A processing starts with two-hour blocks of data delivered by the EOS Project's data acquisition system in the same format that left the instrument. The Level 1A processing checks the data, reformats it for processing convenience, and segments the data into five-minute granules. Geolocation processing combines information about the instrument with spacecraft attitude and ephemeris data and other ancillary data, and calculates the ground location of each MODIS observation. The geolocation task also includes developing a system to validate and quality control the geolocation by intercomparison of MODIS images with ground control points [1, 2].

LEVEL 1A PROCESSING
MODIS orbits the earth at an altitude of 705 km in a near polar orbit with an inclination of 98.2°, a mean period of 98.9 minutes, and a 16 day repeat cycle [3]. MODIS has a field of view of 110° and senses the entire equator every two days in 36 spectral bands: 29 with 1 km (at nadir) pixel dimensions, five with 500 m pixels and two with 250 m pixels. Full global coverage occurs daily above approximately 30° latitude. The MODIS instrument generates 10.6 Mbps in day mode and 3.2 Mbps in night mode and runs approximately half the time in each mode. This produces an instrument output of almost 75 gigabytes per day. In addition to radiances collected while viewing the earth, the data stream also includes calibration packets from a solar diffuser, a black body, deep space, and a Spectroradiometric Calibration Assembly (SRCA). There are also over 750 discrete engineering data items stored in 60 sets with information about the instrument and ephemeris and attitude information about the spacecraft.

Level 1A processing starts from CCSDS packets delivered in two-hour data sets. It includes extensive testing of the input data to detect instrument operations mode, the time period covered, and missing and incomplete data. The radiance data is reformatted from packed 12-bit data into two-byte (16-bit) values and the data is written out in HDF-EOS format in five-minute granules. There are three sets of metadata associated with the L1A product: ECS Granule Inventory Metadata (Coremetadata.0 for ECS insertion at the GDAAC), L1A Specific Granule Metadata (one set of metadata per output product, 14 pieces of data), and Scan-Level Metadata (data stored for each scan of the L1A output product, 14 pieces of data per scan). There are 288 granules per day, each containing 203, or occasionally 204, scans of data. Each
scan contains information for 10 rows of nominal 1 km pixels with missing and incorrect data flagged and filled.

GEOLOCATION PROCESSING
Geolocation processing combines information about the MODIS instrument obtained from the Level 1A data with external inputs: spacecraft attitude and ephemeris; ancillary files with information about polar motion, leap seconds, and land/sea boundaries; and digital elevation model inputs. It generates geodetic latitude and longitude, terrain height, satellite zenith angle, satellite azimuth, range to the satellite, solar zenith angle, and solar azimuth for each 1 km ground pixel (almost 10,000 times per second) [4]. The geolocation output also includes extensive metadata to facilitate automated data selection for subsequent processing, provide quality assurance information, and support archival and retrieval of data from the Earth Observing System's Distributed Active Archive Centers.
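The rates quoted in the two sections above can be cross-checked with simple arithmetic. The following Python sketch is illustrative only and is not part of the MODIS software. The function names are hypothetical, and the figure of 1354 pixels per row for the 1 km bands is an assumption that is not stated in this paper. All other constants are taken from the text.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_gigabytes(day_mbps=10.6, night_mbps=3.2, day_fraction=0.5):
    """Average instrument output per day, in gigabytes (10**9 bytes)."""
    mean_mbps = day_fraction * day_mbps + (1.0 - day_fraction) * night_mbps
    bits_per_day = mean_mbps * 1e6 * SECONDS_PER_DAY
    return bits_per_day / 8 / 1e9


def granules_per_day(granule_minutes=5):
    """Number of fixed-length granules in one day."""
    return (24 * 60) // granule_minutes


def implied_scan_period_s(scans_per_granule=203, granule_minutes=5):
    """Scan period implied by the number of scans in one granule."""
    return granule_minutes * 60 / scans_per_granule


def pixels_per_second(rows_per_scan=10, pixels_per_row=1354,
                      scans_per_granule=203, granule_minutes=5):
    """1 km pixels geolocated per second of observing time.

    pixels_per_row=1354 is an assumed swath width, not a value from the text.
    """
    period = implied_scan_period_s(scans_per_granule, granule_minutes)
    return rows_per_scan * pixels_per_row / period


if __name__ == "__main__":
    print(f"daily volume : {daily_volume_gigabytes():.2f} GB")
    print(f"granules/day : {granules_per_day()}")
    print(f"scan period  : {implied_scan_period_s():.3f} s")
    print(f"pixel rate   : {pixels_per_second():.0f} per second")
```

With the defaults, the daily volume comes out at about 74.5 gigabytes, consistent with "almost 75 gigabytes per day", and the pixel rate at roughly 9,000 per second, consistent with "almost 10,000 times per second".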
DEVELOPMENT STAFF SIZE
The MODIS Level 1A and Geolocation software development started in December of 1995 and is still underway. Over this 5.5 year time period there has been an average of 2.25 person years per year (12.25 person years total to date) on the Level 1A software, 2.25 person years per year (12.25 person years total) on the geolocation software, and for the last three years there has been an additional 2.75 person years per year (8.25 person years total) on the quality assurance, island and ground control point development. This does not include the effort on these programs by ancillary groups responsible for configuration control, software integration and test, and publications, people borrowed for code reviews and walkthroughs, or the effort to develop synthetic test data sets.

In preparing this paper we have had several discussions of what features of the job consumed the most time. Although we did not keep extensive time records by function, there was substantial and surprising consensus on this. Design and coding of the basic algorithms, with one exception discussed below, did not take a majority of the time. Understanding the metadata requirements, both those imposed by the Earth Observing System Data Model and those needed to support production, quality assurance and data set retrieval, is estimated to have taken over 25% of the time. The EOS Project is attempting to provide the science community with a substantial leap in ease of finding the data they need for science use. In addition to the inherent complexity of providing the metadata to support this, there were several factors that made the metadata task more difficult. These included developing the software in parallel with developing
the metadata model, an excessively rigid set of metadata formatting requirements, and lack of any way to test the metadata until late in the program. MODIS and the other instruments launched on the EOS Terra spacecraft, working with the EOS Project, the DAACs and the contractor developing the processing system, have worked out many of the kinks in this system, and some of these problems will not have as large an effect on subsequent users of the EOS Data and Information System.

There are a number of interfaces involved in this software. Information about the exact nature of the interfaces was generally available but not always correct or clearly documented. Extensive unit testing was required to assure that subtle timing and indexing errors were detected and removed.

A substantial portion of the design effort went into anticipating exception cases and developing a processing algorithm that would be robust in the face of any possible problem with the input data. There were two primary concerns here. The first was to assure that we did not lose any valid data. If there is any good data in the scan, even if the rest is garbage, L1A is designed and built to put the good data in the output product and discard the bad data. The second goal was to write this program so that it could run reliably in an automatic production system. This meant that the program had to proceed correctly no matter what was wrong with the input data and never fail and return an error code. (This means that if there are data problems the program identifies them, generates appropriate messages, and continues processing.) Once the Level 1A code was nominally complete and had passed all of its unit tests, almost a complete year was spent devising what-if scenarios, building special test data sets and trying to find additional exception cases. This was time well spent, as the code has run with great success ever since launch.
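The two design goals above (keep every valid sample, and never abort on bad input) can be illustrated with a small sketch. This is a hedged Python illustration, not the Level 1A code, which is written in C. The packet layout (a dictionary with `index` and `value` keys), the fill value, and the flag names are all hypothetical. The only detail taken from the text is that radiance samples are 12-bit values.

```python
FILL_VALUE = -1          # hypothetical fill value for missing or bad samples
MAX_12_BIT = 4095        # largest value a 12-bit sample can hold
GOOD, BAD, MISSING = "good", "bad", "missing"


def process_scan(packets, expected_count, log):
    """Build one scan's output without ever raising on bad input.

    Every sample slot starts out flagged as missing and filled. Valid
    samples overwrite the fill; problems are logged and processing continues.
    """
    values = [FILL_VALUE] * expected_count
    flags = [MISSING] * expected_count

    for packet in packets:
        try:
            index = packet["index"]
            sample = packet["value"]
        except (KeyError, TypeError):
            log.append("unreadable packet skipped")
            continue

        if not isinstance(index, int) or not 0 <= index < expected_count:
            log.append(f"packet with invalid index {index!r} skipped")
            continue

        if not isinstance(sample, int) or not 0 <= sample <= MAX_12_BIT:
            log.append(f"sample {index} out of range: {sample!r}")
            if flags[index] != GOOD:
                flags[index] = BAD   # slot stays filled, but is marked bad
            continue

        values[index] = sample
        flags[index] = GOOD

    return values, flags


if __name__ == "__main__":
    messages = []
    scan = [{"index": 0, "value": 100}, None, {"index": 2, "value": 5000}]
    print(process_scan(scan, 4, messages))
    print(messages)
```

The design choice worth noting is that the output arrays are filled and flagged before any input is read, so a scan that is entirely garbage still yields a well-formed, fully flagged product.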
However, despite this effort we found that there were still unexpected problems after launch. The spacecraft system inserts random bit flips in a non-error-corrected portion of the telemetry stream. This meant that the data as received was not always correctly time ordered, and the packet size information was not always correct. The ground data acquisition group had a different interpretation of the boundary between two-hour data sets than we did (our fault), and the production rules as implemented in the processing facility did not deliver all of the advertised functionality. Altogether we have spent a person year validating systems operation and correcting these things since actual on-orbit data became available.

The requirement for precision geolocation necessitated a precision correction for the terrain height of each ground pixel. For off-nadir processing this included checking to see if any neighboring pixel was enough higher than the nominal ground point to intercept the nominal line of sight. Getting this algorithm to work correctly and optimizing the programming to reduce the time spent in this step took a
substantial effort. We spent approximately a half person year supporting the development of the EOS digital elevation model and developing the ground control point chips used for the geolocation correction.
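The off-nadir line-of-sight check described above can be illustrated in one dimension. The following Python sketch is a simplification under stated assumptions and is not the MODIS algorithm: it assumes a flat reference surface, a single profile of non-negative terrain heights sampled at uniform spacing along the viewing direction, and a constant view zenith angle below 90 degrees. The function and parameter names are hypothetical.

```python
import math


def terrain_intercept(heights_m, spacing_m, nominal_index, view_zenith_deg):
    """Return the index of the terrain cell the line of sight hits first.

    heights_m       terrain heights along the viewing direction, ordered so
                    that increasing index moves toward the satellite
    nominal_index   cell where the line of sight meets the zero-height
                    reference surface
    view_zenith_deg angle between the line of sight and the local vertical
    """
    tan_zenith = math.tan(math.radians(view_zenith_deg))
    if tan_zenith <= 0.0:
        return nominal_index  # nadir view: terrain causes no displacement

    # The ray is higher than any terrain beyond this horizontal offset,
    # so cells farther away cannot intercept it.
    max_offset = math.ceil(max(heights_m) * tan_zenith / spacing_m)
    last = min(len(heights_m) - 1, nominal_index + max_offset)

    # Walk from the satellite side toward the nominal point. The first cell
    # whose terrain reaches the ray is what the sensor actually sees.
    for index in range(last, nominal_index, -1):
        ray_height = (index - nominal_index) * spacing_m / tan_zenith
        if heights_m[index] >= ray_height:
            return index
    return nominal_index


if __name__ == "__main__":
    profile = [0, 0, 0, 0, 0, 3500, 0, 0]
    print(terrain_intercept(profile, 1000.0, 2, 45.0))
```

Bounding the search by the maximum terrain height is one simple way to limit the number of neighboring cells examined per pixel, which matters when the check is repeated for every observation.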
Over the course of the development effort we made six separate deliveries to the processing center, including a preliminary version, a launch version, and four incremental improvement versions (there were also several minor software patches). Approximately three months of effort (1.5 work years total) was associated with packaging and functional testing for each of these major deliveries.

This software effort was conducted with a relatively high level of attention to formal software development processes. The paperwork, reviews and documentation associated with this took a substantial amount of time. For instance, opening a single module of code and making a simple change typically took two weeks of effort by the time all of the process requirements were satisfied. While this initially seems excessive, we believe that the adherence to process was part of the reason that we have had robust, well documented code, delivered on time, and functioning well right from the time of launch.

CODE SIZE AND PROCESSING RESOURCES
The Level 1A program contains 96 routines and has 5075 lines of executable C code and 12,875 comment lines. The Geolocation program has 58 routines, 5629 lines of executable C code and 10,989 comment lines. The control point routine has 55 routines and 5750 lines of C code (not including code reused from the geolocation program) and 13,188 lines of comments. We have stringent requirements for in-line documentation. Both Level 1A and Geolocation run in approximately real time in the Goddard DAAC (i.e. they each take 5 minutes to do 5 minutes worth of data when running on an SGI 195 MHz processor).

SCALING THIS TO ANOTHER INSTRUMENT
How would this information scale to another instrument? We expect the Level 1A processing rate to scale linearly with input data rate. There is nothing atypical for MODIS that should make the Level 1A processing take unusually longer or shorter than another instrument, as long as the meaning of Level 1A is consistent across both instruments. However, the staffing effort to write the Level 1A program is almost independent of the data rate and is instead driven by the need to anticipate, check for, and correct all possible errors. One clear lesson of this development has been that, no matter what the Project says will be tested or guaranteed by predecessor actions on the instrument, spacecraft, or telemetry system, it is necessary to check everything!

The situation is less clear for Geolocation. There are three things that drive the computing requirement for MODIS. The first is the geolocation accuracy requirement which, for MODIS, is to locate each pixel with an error less than 0.1 of the nominal 1 km pixel dimension. This accuracy requirement drives the complexity of the physical model that must be constructed for geolocation. Use of a point specific digital elevation model consumes about 20% of the MODIS geolocation time. For a nadir instrument or one with geolocation accuracy requirements of two or more kilometers this could be eliminated. This would also reduce the development time by perhaps a year for geolocation and might eliminate the need for a control point comparison program. A second consideration is the complexity of the instrument. Scanning instruments with moving parts introduce more complexity into the processing algorithm than fixed line of sight instruments. However, this effect would not be very large in the overall effort. The third factor is the number of ground elements that must be located. As noted above, MODIS locates almost 10,000 pixels per second. This introduced a requirement for substantial optimization of the processing code to keep computing resources within limits and probably added about 0.5 person years to our effort. Geolocation processing requirements should scale almost linearly with the number of pixels for another instrument.

The overall processing system that the algorithms run in can also affect programming effort. The MODIS software was developed to run in a very large, complex, automated production system. The system it was to run in was being developed at the same time, and the interfaces between processing system and processing algorithm were being defined in parallel with the algorithm development effort. Both the size and complexity of the system and the parallel development added to the effort required. We have only a general feeling for the impact of these two factors and would guess that the size and complexity effect was on the order of 15% and the parallel development effect was perhaps 20%.

REFERENCES
[1] R. E. Wolfe, M. Nishihama, A. J. Fleig, J. M. Unger, and D. P. Roy, “MODIS Operational Geolocation Error Analysis and Reduction Early Results”, IEEE International Geosci. and Remote Sensing Symposium (IGARSS 2000) (this publication).
[2] M. Nishihama, R. Wolfe, A. Fleig, J. Blanchette, J. Kuyper, “MODIS Geolocation Algorithm and Error Analysis Tools”, IEEE IGARSS 2000.
[3] V. Salomonson, W. Barnes, P. Maymon, H. Montgomery, H. Ostrow, “MODIS: Advanced Facility Instrument for Studies of the Earth as a System”, IEEE Trans. Geosci. and Remote Sensing, 27:145-153, 1989.
[4] M. Nishihama, et al., MODIS Level 1A Earth Location: Algorithm Theoretical Basis Document, Lab. Terrestrial Phys., NASA Goddard Space Flight Center, Greenbelt, MD, SDST-092, Aug. 26, 1997.