Applying Methods of Soft Computing to Space Link Quality Prediction
Bastian Preindl, Lars Mehnen, Frank Rattay, and Jens Dalsgaard Nielsen
Abstract
The development of nano- and picosatellites for educational and scientific purposes is becoming increasingly popular. As these satellites are very small, highly integrated devices and are therefore not equipped with high-gain antennas, data transmission between ground and satellite is vulnerable to several disturbing influences in both directions. Another handicap is the low earth orbit in which the satellites are usually located, as it keeps the communication time frame very short. To counter these disadvantages, ground station networks have been established. One input parameter for the optimal scheduling of time frames for the communication between a ground station and a satellite is the predicted quality of the satellite link. This paper introduces a satellite link quality prediction approach based on machine learning.
Bastian Preindl, Lars Mehnen, Frank Rattay: Institute of Analysis and Scientific Computing, Vienna Technical University, Austria
Jens Dalsgaard Nielsen: Department of Electronic Systems, Aalborg University, Denmark

1 Background
Within the last decade, educational and academic activities in space science have made huge steps forward. Driven by the development of small satellites that carry scientific or educational payloads of any kind into space, universities all over the world have started to design, develop and launch small satellite projects based on the CubeSat standard [1] [2]. Small satellites often operate in low earth orbit (LEO), which implies a short orbital period and only brief passes over any given ground station. As a consequence, the communication time frame between a satellite and its affiliated ground station tends to be about 30 minutes a day, whereas the ground station is idle for the remaining time [3]. As the time available for communication between satellite and mission control can be crucial for a mission, investigations are needed to optimize the usage of ground stations and to significantly extend the time during which a satellite can communicate with the mission control center. A sophisticated approach to a world-wide interconnection of independent ground stations [4] is the Global Educational Network for Satellite Operations (GENSO) [5]. Its aim is to share ground station capacity among different mission controls by creating a hybrid, supervised peer-to-peer network over the Internet. Interconnecting a very large number of satellite ground stations will form the base for novel scientific approaches in the domain of link quality determination and prediction, the overall optimization of space up- and downlinks, hardware utilization, and the influence of environmental conditions on space communication. This will be a major step forward for research as well as for educational purposes.
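The short contact time follows from how fast a LEO satellite circles the earth. As a rough illustration, the following sketch computes the period of a circular orbit from Kepler's third law. The altitude of 700 km is an assumed example value and not taken from GENSO data, and the function name is hypothetical.

```python
import math

MU_EARTH_KM3_S2 = 398600.4418  # gravitational parameter of the earth
EARTH_RADIUS_KM = 6371.0       # mean earth radius


def orbital_period_minutes(altitude_km: float) -> float:
    """Period of a circular orbit at the given altitude (Kepler's third law)."""
    semi_major_axis_km = EARTH_RADIUS_KM + altitude_km
    period_s = 2.0 * math.pi * math.sqrt(semi_major_axis_km ** 3 / MU_EARTH_KM3_S2)
    return period_s / 60.0


# 700 km is an assumed example altitude for a small LEO satellite.
period_min = orbital_period_minutes(700.0)
orbits_per_day = 24.0 * 60.0 / period_min
print(f"period: {period_min:.1f} min, orbits per day: {orbits_per_day:.1f}")
```

Only some of these orbits cross the horizon of a given ground station, and each visible pass lasts only a few minutes, which is consistent with the short daily contact time reported in [3].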
2 Specific Optimization Aims
The Global Educational Network for Satellite Operations constitutes the scientific base for a multitude of research fields and investigations. The research aims at the identification and utilization of link quality information. For the first time, it becomes possible to gain focused information about the quality of LEO satellite downlinks from a large number of independent ground stations throughout the world. The gained information can be processed and applied in various ways, of which the optimization of the network itself is only one. The research relies heavily on the GENSO project, as it delivers the raw data needed for the majority of the investigations and offers the possibility of applying the newly discovered models and algorithms in a productive and mature system. It also provides an unprecedented availability of ground stations, which makes such sophisticated investigations possible in the first place. The majority of the research results and derived models will flow directly back into the project as implementations that form cornerstones of GENSO. In the case of soft-computing approaches, a continuous learning, refinement and adaptation process takes place, as the resulting model is dynamic and self-optimizing. Each of these specific aims constitutes a novel approach in its field and can support space operations in the coming decade and probably far beyond. All research topics focus on reducing resource and energy consumption in order to substantially enhance the outcome of space missions.
2.1 Space Up- and Downlink Quality Identification and Determination
As a base for all further scientific approaches in this direction, an overall metric has to be identified for measuring and comparing the quality of satellite links. This is necessary because of the use of different protocols, modulations and frequency bands on the one hand and the operation in a heterogeneous hardware environment on the other hand. Figure 1 illustrates the complexity of communication between ground and space.
Fig. 1 The communication signal path between a spacecraft and a ground station
As the network itself is neither able nor permitted to be aware of the content of the data transferred between a mission control center and a spacecraft, a method has to be identified to measure the quality in a completely passive way. The investigations include a possible design for the determination of uplink quality for future application.
2.2 Identification of Correlations Between Environmental Conditions and Space Link Quality
The collected information about satellite link quality can be related to environmental data in order to identify correlations between environmental conditions and their impact on link quality. The ground station network can therefore be utilized as a very large distributed sensor cluster. The environmental variables that will be taken into account cover:
• Space weather: Solar bursts, ion storms and similar space weather conditions have a heavy impact on space communication links. Large ground-based sensor clusters and satellite missions for measuring space weather provide detailed information about the current situation in space.
• Earth weather: Rain, humidity, snow and other well-known earth weather conditions also affect radio communication. Not only the weather on the ground but also the conditions within the stratosphere, mesosphere and ionosphere have to be taken into account.
• Atmospheric effects: Atmospheric gases in different layers can have an impact on radio links, with higher frequencies being more susceptible. Effects like ice crystals in higher atmospheric layers are taken into account.
• Geographical circumstances: The positions of sun and moon play an important role, amongst other geographical and temporal circumstances, in the condition of radio links. Therefore their relative positions have to be taken into account.
Although they are not environmental variables, communication parameters like carrier frequency, bandwidth, applied modulation and encodings, filters and many others have perhaps the largest influence on link quality. The resulting calculation model establishes the base for further investigations of environmental impacts on the one hand and quality predictions on the other hand.
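A first check for such a relation could be a simple correlation coefficient between one environmental variable and the measured link quality. The sketch below computes the Pearson correlation in plain Python. The humidity and quality values are made up for illustration only, and a linear coefficient is merely an assumed starting point, not the model proposed here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


# Made-up example values: ground humidity per pass and a normalized
# link quality score between 0 (unusable) and 1 (perfect).
humidity_percent = [30.0, 45.0, 60.0, 75.0, 90.0]
link_quality = [0.92, 0.88, 0.81, 0.70, 0.62]

r = pearson(humidity_percent, link_quality)
print(f"correlation between humidity and link quality: {r:.3f}")
```

A coefficient close to -1 or +1 would suggest that the variable deserves a place in the feature vector, while non-linear dependencies would need other measures.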
2.3 The Long-term Impact of Climate Changes on Short-range Satellite Communication
Based on the derived model of dependencies between environmental conditions and the quality of satellite links, the focus is set on the impact of specific environmental quantities that may play a role in global climate change. The different impacts on the different radio frequency bands, modulations and encodings are investigated, and a long-term prediction of the impact on satellite communication is developed from current prognoses. The result is expected to be a significant factor in future spacecraft and ground segment design, as the investigations may identify communication variables that are susceptible to global climate change. This also has to be taken into consideration when designing commercial satellites that stay operational in orbit for a decade or more.
2.4 The Rapid Determination of Spacecraft Orbital Elements
When new spacecraft are released into their designated orbit by a launch vehicle, their exact position and even their orbit are not known. It can take up to several days until institutions like the North American Aerospace Defense Command (NORAD) have clearly identified the new spacecraft in orbit and provide the exact orbital elements (the Keplerian elements) that define the orbit and position of an object in space. In the worst case, no communication with the spacecraft can be established during that time. The first hours and days in orbit are the most important ones, since most problems occur during that time [6]. A significant number of space mission failures have been a consequence of missing communication between ground station and spacecraft, causing enormous losses of invested time and resources.
Taking into consideration these circumstances and the fact that institutions like NORAD are under governmental control, there is a need for an independent, reliable process for rapid orbit determination. An algorithm for automated orbit determination shall be proposed, modeled, simulated and validated, utilizing the ground stations participating in GENSO and the model derived from the preceding quality considerations. The algorithm is intended to reduce the time for obtaining a precise orbit from several days to a couple of hours and therefore to significantly support future space missions.
2.5 The Short-term Prediction of Space Communication Link Quality
Based on the aggregated information about satellite link quality and current environmental conditions, a short-term prediction model will be designed. It is intended not only to optimize the booking of ground stations but also to raise the probability of successful space links considerably. This significantly increases the amount of satellite data that can be retrieved during nominal operation and can even play a major role in the success of a whole mission in critical situations in which time is the most important factor.
2.6 Automated Identification of Imprecise Ground Stations
By reversing the model for short-term prediction of the link quality and examining the differences between the predicted and the actual link quality after the pass of a spacecraft has taken place, ground stations with receiving and transmitting capabilities below the predicted level can be identified. The ground station operators are informed about problems with their communication hardware, which provides a low-cost calibration facility for non-commercial ground stations. In parallel, the affected ground stations are downgraded within the network to avoid the use of broken or imprecise ground stations in critical situations.
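The comparison of predicted and actual link quality can be sketched as a per-station residual analysis. The following example is only an illustration under assumptions: the quality scale from 0 to 1, the tolerance, the minimum number of passes and the station identifiers are all hypothetical and not part of GENSO.

```python
from collections import defaultdict
from statistics import mean


def find_underperforming_stations(pass_reports, tolerance=0.1, min_passes=3):
    """Return stations whose measured quality is, on average, more than
    `tolerance` below the predicted quality.

    pass_reports: iterable of (station_id, predicted_quality, measured_quality)
    """
    residuals = defaultdict(list)
    for station_id, predicted, measured in pass_reports:
        residuals[station_id].append(measured - predicted)

    flagged = {}
    for station_id, values in residuals.items():
        if len(values) < min_passes:
            continue  # too few passes for a meaningful statement
        mean_residual = mean(values)
        if mean_residual < -tolerance:
            flagged[station_id] = mean_residual
    return flagged


# Made-up pass reports on an assumed quality scale from 0 to 1.
reports = [
    ("GS-A", 0.80, 0.79), ("GS-A", 0.70, 0.72), ("GS-A", 0.90, 0.88),
    ("GS-B", 0.80, 0.55), ("GS-B", 0.75, 0.50), ("GS-B", 0.85, 0.61),
]

flagged = find_underperforming_stations(reports)
for station_id, residual in flagged.items():
    print(f"{station_id}: mean residual {residual:+.3f}")
```

Averaging over several passes keeps a single bad pass, for example one caused by local interference, from downgrading an otherwise healthy station.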
2.7 Automated Determination of Spacecraft Health and Orbit Changes
Based on the model for the prediction of satellite link quality, anomalies in a spacecraft’s behavior and its health status can be identified indirectly and countered. Not only signal weaknesses and anomalies can be identified but also orbit deviations. The orbit information can be re-adjusted automatically by applying a prediction algorithm, as in the rapid orbit determination after a spacecraft launch.
3 Data Mining Architecture
The machine learning pipeline has to consist of various subsystems in order to form a comprehensive environment for the selection and application of modern AI classifiers and to achieve all of the projected aims. On the one hand, the subsystems obtain and pre-process all possible variables that could influence communication links. On the other hand, they provide the prediction results as input to various decision-making problems and for error recognition. Figure 2 visualizes the order and interdependencies of the specific subsystems.
Fig. 2 The components of the machine learning pipeline
The purposes of the specific subsystems are:
• GENSO database: Information about the involved hardware (spacecraft and ground station) is a prerequisite for precise data normalization. The network provides this information to participating ground stations and mission controls for scheduling and communication purposes. The feature vector for each classification is populated with hardware and communication details through the provided interfaces to the GENSO database.
• Satellite pass quality information: The quality of a satellite pass is measured at the communicating ground station, normalized and delivered to the core server, where it is added to the feature vector for learning. These measurements constitute one of the cornerstones of the prediction model derivation. As the satellite link quality is the feature to be classified, it is only provided for compiling training sets and test sets and for model calibration.
• Environmental influences, weather, and space radiation information: Current data about the environmental conditions during a measured pass has to be collected using web mining technologies. The collected environmental data forms another cornerstone of the prediction model derivation.
• Orbiting network control satellite feedback: A novel approach for non-commercial satellites, with a potentially high impact on current and future space missions, is the first orbital space link measurement instrument: a satellite whose main purpose is to send and receive predefined test data in order to actively measure a space link’s quality. The resulting information about the bit error rate (BER), the most significant quality metric of a digital communication link, in both the sending and the receiving direction is highly accurate and far exceeds the accuracy of any passive link determination. As soon as the link quality information from the satellite(s) is available to the GENSO network, it will complement and partially substitute the passively collected quality information from the ground stations.
• Central data aggregation and normalization: The data collected from several sources needs to be aggregated and normalized. Heterogeneous hardware environments have to be taken into account, as well as satellite orbits and their position relative to the measuring ground stations, amongst several other particularities such as time of day and solar and lunar eclipses. The outcome of this process is a large amount of data, properly preprocessed for the application of different model derivation methods. It therefore forms the base for training and testing different classifiers.
• Rapid orbit (TLE) determination: Based on the normalized data retrieved primarily from local ground station measurements, an algorithm can be identified and validated by simulation to optimize the determination of spacecraft orbits.
• Model derivation based on machine learning methods: With the aggregated and normalized data as foundation, classifiers are trained on the collected features to derive a model of the attribute interdependencies. The derived model is continuously calibrated during the operation of the network.
• Ground station calibration, spacecraft pass quality prediction, and spacecraft scintillation and power status determination: Conclusions on these subjects can be drawn from the measured pass quality after a pass has taken place by selecting features other than the pass quality as the classification attribute.
The primary value to be predicted, and therefore the classification value, is the link quality itself, as it is returned from the ground station network as satellite pass quality information. The definition of link quality is much more complex than expected, and the authors have undertaken further investigations in [7] to identify a measure of satellite link quality that is both comparable and expressive. The satellite pass quality information as it is derived from the network delivers raw signal strength meter readings from the ground station radio hardware as a function of time, in which noise and signal strength are not separated. The reason for not choosing a more explicit value for representing the link quality in the first step is that, for example, the bit error rate (BER) requires advance knowledge of the transmitted bit sequence for a bit-for-bit comparison if no sophisticated forward error correction (FEC) is implemented in the communication protocol, as it is for example in DVB-S [8]. In the most common protocol in non-commercial satellite communication, AX.25 [9], only a simple cyclic redundancy check (the 16-bit CRC-CCITT) is applied, and therefore no detailed bit error rate information can be obtained. In addition, the payload transmitted and received by ground stations is random to a certain extent (except for protocol headers), which makes a bit-for-bit comparison impossible.
To gain a numerical value as input for the classification algorithm, data aggregation and normalization have to be applied to the sequence of data representing the signal strength over time. The desired numerical value is an estimate of the BER, which has the big advantage of being comparable with the data provided by the orbiting network control satellite [10]. Together with the orbital elements, which precisely describe an object’s position in space, the location of the ground station is derived from the network’s core database, amongst several other network and ground station specific parameters. These parameters can be used to relate the measured link quality to the satellite’s elevation and therefore to its distance from the ground station. Yang and Tseng [11] have shown that the bit error rate is directly related to the satellite elevation. As a consequence, the derived BER has to be normalized with respect to the distance of the spacecraft. The myriad of values that can be collected about ground weather, space weather, environmental conditions and many other continuously changing influences on space links have to be collected, normalized and related to the transformed satellite link quality and the network information in order to build the foundation for the application of machine learning classifiers.
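The relation between elevation and distance mentioned above can be made concrete with a little geometry. The sketch below computes the slant range to a satellite on a circular orbit from its elevation angle and refers a signal strength reading in dB to the zenith distance. It assumes a spherical earth and free-space path loss only, which is a deliberate simplification; the function names and the example values are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean earth radius, spherical earth assumed


def slant_range_km(elevation_deg: float, altitude_km: float) -> float:
    """Distance between ground station and satellite for a given elevation."""
    if not 0.0 <= elevation_deg <= 90.0:
        raise ValueError("elevation must be between 0 and 90 degrees")
    elevation = math.radians(elevation_deg)
    orbit_radius_km = EARTH_RADIUS_KM + altitude_km
    return (
        math.sqrt(orbit_radius_km ** 2 - (EARTH_RADIUS_KM * math.cos(elevation)) ** 2)
        - EARTH_RADIUS_KM * math.sin(elevation)
    )


def normalize_signal_db(measured_db: float, elevation_deg: float,
                        altitude_km: float) -> float:
    """Refer a reading to the zenith distance, assuming free-space loss only
    (received power falls off with the square of the distance)."""
    distance_km = slant_range_km(elevation_deg, altitude_km)
    return measured_db + 20.0 * math.log10(distance_km / altitude_km)


# Assumed example: satellite at 700 km altitude, reading taken at 10 degrees.
print(f"slant range at 10 deg: {slant_range_km(10.0, 700.0):.0f} km")
print(f"normalized reading: {normalize_signal_db(-110.0, 10.0, 700.0):.1f} dB")
```

After this step, readings taken at different elevations share a common reference distance, so the remaining differences can be attributed to hardware and environmental influences rather than to geometry.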
4 Prediction Model Development
Inspecting the data types of the input parameters, their number and dimensions, and the data type of the classification results allows a preliminary estimate of which kind of learning algorithm will deliver the best quality on the test sets. As classified training data is provided, supervised learning is applied. Supervised learning generates a global model that maps the input feature vector to classifications. After parameter adjustment and learning, the performance of the algorithm will be measured on a real data test set that is separate from the training set, and later by integrating the classification model into GENSO.
The feature vector consists mainly of numerical values, for example equivalent isotropically radiated power (EIRP), current spacecraft tumble rate, antenna gain, frequency band, baud rate, angle between the antenna pointing direction and sun, moon and horizon, distance to the spacecraft, solar wind activity, humidity and temperature on the ground, atmospheric ion gas concentration, longitude and latitude, air pressure, and several more. The prime classification attribute, the link quality, is also numerical, but will at first be divided into quality classes. When more precise BER information is available, most likely from the measurement satellite, the quality will be predicted as a continuous value using regression. This requires a machine learning algorithm capable of dealing with numerical target values, i.e. regression. The authors expect Support Vector Machines and Support Vector Regression, respectively, to deliver very good prediction performance [12]. For training and testing, the machine learning algorithm collection and workbench RapidMiner (formerly YALE) [13] will be utilized. The performance of the different classifiers will be evaluated using 10-fold cross-validation.
The size of the provided training and validation sets depends on the number of ground stations that participate in GENSO and are able to evaluate the link quality. Every satellite pass constitutes one attribute vector. A LEO satellite passes a ground station’s horizon 6-8 times a day [3], which results in 6-8 pass reports per ground station per satellite per day. In its public beta phase, GENSO is expected to interconnect approximately 30 ground stations and track 15 satellites, which results in roughly 2700 to 3600 pass reports, i.e. about 3000 data sets, collected each day. Hence the network will provide a large number of satellite pass reports, and a more than adequate amount for training and validation, within the first 24 hours of operation.
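The expected data volume and the 10-fold split can be illustrated in a few lines. The sketch below is a plain-Python stand-in and not the RapidMiner workflow itself; the function names are hypothetical, and the fold assignment by shuffled index is only one common way to build the folds.

```python
import random


def daily_pass_reports(ground_stations: int, satellites: int,
                       passes_per_day: int) -> int:
    """One pass report per ground station, satellite and pass."""
    return ground_stations * satellites * passes_per_day


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Figures from the text: 30 ground stations, 15 satellites, 6-8 passes a day.
low = daily_pass_reports(30, 15, 6)
high = daily_pass_reports(30, 15, 8)
print(f"expected pass reports per day: {low} to {high}")

# Split one day of reports (lower bound) into ten folds.
splits = list(k_fold_indices(low, k=10))
print(f"folds: {len(splits)}, test size per fold: {len(splits[0][1])}")
```

Each pass report serves as test data exactly once and as training data in the remaining nine rounds, so the averaged performance uses all collected passes.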
5 Conclusion
Applying methods of artificial intelligence and non-linear optimization techniques to the scheduling input parameters of a highly dynamic distributed cluster of satellite ground stations can lead to a significant increase in mission return from all spacecraft in non-geostationary orbit. Student space projects, non-profit communities like the radio amateurs and probably even commercial space missions will benefit strongly from this work in various respects: higher reliability, improved resource utilization and the establishment of quality assurance for satellite space links.
References
1. A. Toorian, K. Diaz, and S. Lee. The cubesat approach to space access. In Aerospace Conference, 2008 IEEE. NASA Jet Propulsion Lab., Pasadena, CA, 2008.
2. Bryan Klofas, Jason Anderson, and Kyle Leveque. A Survey of CubeSat Communication Systems. Technical report, California Polytechnic State University and SRI International, 2008.
3. Shkelzen Cakaj, Werner Keim, and Krešimir Malarić. Communications Duration with Low Earth Orbiting Satellites. In Proceedings of the 4th IASTED International Conference on Antennas, Radar and Wave Propagation, 2007.
4. Tarun S. Tuli, Nathan G. Orr, and Robert E. Zee. Low Cost Ground Station Design for Nanosatellite Missions. In AMSAT Symposium 2006, San Francisco, 2006.
5. Bastian Preindl, Helen Page, and Viktor Nikolaidis. GENSO: The Global Educational Network for Satellite Operations. In Proceedings of the 59th International Astronautical Congress, Glasgow, UK. International Astronautical Federation, 2008.
6. S. Chouraqui, M. Bekhti, and C.I. Underwood. Satellite orbit determination and power subsystem design. In Proceedings of 2003 IEEE International Geoscience and Remote Sensing Symposium, IGARSS ’03, volume 7, pages 4590–4592, 2003.
7. Bastian Preindl, Lars Mehnen, and Jens Dalsgaard Nielsen. Measuring satellite link quality in ground station networks with heterogenous hardware environments. Technical report, Vienna Technical University, Austria and Aalborg University, Denmark, 2008.
8. Radu Arsinte. Effective Methods to Analyze Satellite Link Quality Using the Build-in Features of the DVB-S Card. Acta Technica Napocensis - Electronics and Telecommunications, 47(1):33–36, 2006.
9. R.R. Parry. AX.25 [Data Link Layer Protocol For Packet Radio Networks]. Potentials, IEEE, 16(3):14–16, 1997.
10. Bastian Preindl, Lars Mehnen, and Jens Dalsgaard Nielsen. Design of a Small Satellite for Controlling a Ground Station Network. Technical report, Vienna Technical University, Austria and Aalborg University, Denmark, 2008.
11. Cheng-Ying Yang and Kuo-Hsiung Tseng. Error rate prediction of the low Earth orbit (LEO) satellite channel. Communications, ICC 2000. IEEE International Conference on, 1:465–469, 2000.
12. S. Hussain and V. Khamisani. Using Support Vector Machines for Numerical Prediction. Multitopic Conference, 2007. INMIC 2007. IEEE International, pages 1–5, Dec. 2007.
13. Ingo Mierswa, Michael Wurst, Ralf Klinkenberg, Martin Scholz, and Timm Euler. YALE: Rapid Prototyping for Complex Data Mining Tasks. In Lyle Ungar, Mark Craven, Dimitrios Gunopulos, and Tina Eliassi-Rad, editors, KDD ’06: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 935–940, New York, NY, USA, August 2006. ACM.