IEEE TRANSACTIONS ON PLASMA SCIENCE, VOL. 28, NO. 6, DECEMBER 2000
System Identification, Modeling, and Prediction for Space Weather Environments

Dimitris Vassiliadis
Abstract—By now nonlinear dynamical models and neural networks have been used to predict and model a wide variety of space weather environments. This review starts with the physical basis for and a brief description of the system approach. Following that, several examples illustrate practical issues in temporal and spatiotemporal prediction and modeling. The concluding remarks discuss the future developments in this research direction.

Index Terms—Extraterrestrial exploration, extraterrestrial phenomena, identification, modeling, neural network applications, nonlinear filters, space vehicle reliability.
I. INTRODUCTION
THE EFFECTS of the space environment on humans and machines have been studied from the inception of space exploration. As human presence in space is in an explosive phase, it is expected that the impact of these effects will be quite significant in this and the next few solar cycles. At the same time, the modeling effort, the size of space databases, and the availability of solar and interplanetary monitors have reached the level necessary to make accurate forecasting possible. These developments have prompted the establishment of national space weather programs in the U.S. [21], [22] and other countries. This paper discusses the capabilities of system analysis in space weather modeling and forecasting by reviewing past research and suggesting points of development for the future.

Space weather disturbances are divided into two main categories according to their physical causes. First, there are disturbances related to the high-latitude electrodynamic circuit which affect technological systems on the earth's surface. Among these current systems, the most relevant ones to space weather are the auroral electrojets flowing in the ionosphere; other current systems that complete the electric circuit, such as the field-aligned and polar currents, are also important. The variation of the net magnetic field produced by these sources drives secondary "geomagnetically induced" currents in large-scale conductors such as power grid elements and pipelines [Kappenman, this issue]. The second category of disturbances is caused by energetic electrons which are injected into the inner magnetosphere, especially during magnetic storms, and subsequently trapped in the radiation belts at altitudes of 3–5 Earth radii [2]. The intense electron flux produces malfunctions and failures in satellites. System analysis methods have been applied to both categories of space weather effects.

Manuscript received January 19, 2000; revised May 8, 2000. This work was supported by NASA and NSF grants. The author is with the Universities Space Research Association, NASA/Goddard Space Flight Center, Greenbelt, MD 20771 USA. Publisher Item Identifier S 0093-3813(00)11639-X.
Before discussing the various methods, it is useful to start with some definitions. The majority of space environment models are empirical since they are derived from sets of observational data. Some of these will eventually be phased out and replaced by physical models which are based on first principles; the complexity of the space environment, however, essentially guarantees the predominance of the empirical component. In practice, space weather applications combine the best available components of either approach. A prediction is the output of such a space environment model for a time interval of activity. More specifically, a forecast is a prediction for a time in the future (which has not occurred yet at the time the prediction is made). As a limiting case, a model can be used to nowcast, or describe the present, rather than the future, state of the space environment. In addition, a significant part of model development and testing is done through retrospective analyses.

A forecast's most important properties are its timeliness and accuracy. A minimum lead time is crucial for operators to respond to space weather emergencies. For a typical prediction based on solar wind measurements made at the L1 libration point upstream of the earth, the lead time is basically the solar wind propagation time to reach earth, or 0.5–1 h depending on wind speed. The lead time can be extended either by predicting the solar wind variations (the increase is up to 1–2 h), still largely an unexplored area, or, in some cases, by using information directly from the solar surface (several days). Once issued, the forecast is verified against observations, and it is evaluated through a comparison with a reference model [7]. During model development, it is standard practice to divide the observations into a data set for training, or system estimation, and one or more test sets used to measure prediction accuracy.

A. Physical Basis for the Systems Approach to Space Weather

Many empirical models for space disturbances follow a systems approach: they represent the observed activity by a number of variables (usually derived from the observational data) whose coupling is described by (nonlinear) ordinary differential equations. It is important to address the question of why such models, including the systems approach, are successful. First we discuss the physical basis for the systems approach, and then give its basic features.

1) It is an observational fact for many large-scale space plasma environments that their activity can be reproduced by a small number of variables. (Here, by large scale, we mean scales which are much larger than particle scales such as the Debye length or the gyroradius, and significantly larger than most fluid scales.) Most of the available degrees of freedom in those environments synchronize so that only a small number
of collective motions is excited. Typically, as the activity level increases, characteristic modes become more apparent and dominant. Several space weather systems show this strong coherence between widely separated regions. (For the coherent dynamics of the inner magnetosphere, see [15].) Evidence for this low effective dimension is given in correlation analyses (e.g., for correlation dimension analysis of high-latitude geomagnetic activity, see [25]). The self-organization may be spontaneous, due to the existence of a conserved quantity such as the magnetic helicity that organizes the plasma medium [11]. Alternatively, it may be induced by a strong external forcing, such as the magnetospheric organization produced as a response to the solar wind mass, momentum, and energy input [17]. In either scenario, it is this self-organization that makes it possible to model these plasma environments as low-dimensional dynamical systems. For a system that meets the above condition, it is possible to organize the data in a state space. A multitude of modeling and prediction methods have been developed for linear [30] and nonlinear systems [5], [1]. These approaches are combined with the physics of the environment in question and the needs of the user community.

2) The time evolution of many of these space environments is often deterministic and nonchaotic: starting from approximately the same initial conditions and with the same time-dependent inputs from adjacent plasmas, two incarnations of a space plasma system will initially produce the same response. Unless the system is chaotic, the responses will diverge slowly (linearly), or not at all (i.e., different initial conditions will converge to the same time evolution). Here we consider initial conditions that are identical within the level of fluctuations in the system. Also, we compare the time evolution of the two systems over a time scale related to the typical large-scale dynamics of the system. For example, the geomagnetic response associated with the symmetric ring current has an average amplitude, conventionally represented by the geomagnetic index Dst, which is well predicted with a simple dynamical system given the sequence of solar wind input parameters [4] (see Section II below). The relevant time scale is many hours to several days. In a second case, the amplitude and orientation of the interplanetary magnetic field determine the size and orientation of the large-scale ionospheric electric field pattern. The characteristic archetype patterns are classified and predicted by neural networks [28]. These and many other large-scale space environments show a deterministic, nonchaotic response, i.e., they effectively show little sensitivity to initial conditions in the long-term behavior. Local and short-term details, however, may and often do vary "chaotically." We will not discuss this sensitivity to initial conditions further. For examples of measuring the sensitivity of models to initial conditions, a rather important aspect in model testing, see [37]–[39] for high-latitude and mid-latitude models, respectively; for a discussion of the magnetospheric physics involved in this sensitivity, see [27].

B. State Space Construction

Since the system modeling and predictions are derived from data, it is necessary, especially for the purpose of prediction,
Fig. 1. Schematic representation of a state space constructed from space weather data (e.g., solar wind and geomagnetic time series). Observed time series become flow lines (blue lines). By selecting a point in the state space as the initial condition one can predict the future activity (red line) by interpolating the flows starting from its nearest neighbor data points (enclosed by a sphere).
to have a database of the environment which contains sufficiently many and varied activity intervals to represent the range of dynamical behaviors. This dynamics is modeled by a small number of variables (of order 10), which are functions of the observed data and are summarily written as a state vector. If the data are in time series form, usual choices are lags (delays) of the form x(t − kτ), where τ is a characteristic time scale of the system, or derivatives. A more general choice is principal component analysis (see singular value decomposition in [26], and references therein). The physical interpretation of the observed variables is used as a guide for the choice of the model parameters and an initial estimate of their range of values. Multivariate data should be normalized (each time series separately), usually to unit standard deviation, unless there is a physical scaling that can be used for normalization. Depending on the environment, scales of the proper dynamic range should be chosen for each variable; e.g., modeling of inner-magnetosphere particle populations is best done in terms of a slowly varying function of the flux such as the logarithm. The state space, shown schematically in Fig. 1, is populated with points constructed from the training set (database), and recurrent features in the time series appear as well-defined structures of the phase space trajectories. Given an initial condition for the space environment, one can use a local model (based on the points closest to the initial condition in the state space) or a global approximation (e.g., a neural network) for prediction. In order to reach the best choice of model parameters, a usual tactic is to optimize prediction, i.e., minimize forecast error for several test sets.

II. PREDICTIONS OF THE DISTURBANCE AMPLITUDE: INDICES

The simplest space environment amenable to both systems analysis and physics is the geomagnetic effect of the ring current. The geomagnetic index Dst has been designed to measure the geomagnetic effect of the symmetric part of the current [16]. The current arises from the drift of low-energy particles trapped in the radiation belts and it increases during magnetic
Fig. 2. (a) Two-dimensional phase space of the Dst index. The four quadrants are identified with the four phases of the magnetic storm. (b) Extension to a state space which includes the solar wind input, here the electric field component E = VBs. (c) Impulse responses for the four storm phases in the order they occur (from top to bottom). The heavy line, superposed on all graphs, is the average for all phases [linear prediction filter, (2.2)].
storms. Therefore, Dst is a rough measure of the mid-latitude, and even global, magnetospheric activity level. The index is compiled from measurements of ground magnetometers at 1-h resolution (although to monitor dynamic changes within the current a much higher resolution is necessary). The relevant physical variables are the east–west component of the solar wind electric field, VB_s, characterizing the large-scale rate of magnetic-field input into the magnetosphere, and the rate dDst/dt, often referred to as the "ring current injection rate." In fact, much of the large-scale dynamics of the Dst amplitude can be reproduced by a first-order equation [4]

dDst/dt = Q·VB_s − Dst/τ    (2.1)

where τ is the decay time of the ring current (∼7 h) and Q is the effective coupling to the solar wind. In the state space constructed for the ring current dynamics from these key variables (Fig. 2), several regions are directly identifiable with the storm
Fig. 3. Prediction of the AL index (top, gray line) from a nonlinear ARMA model [similar to (3.1)] driven with the solar wind electric field (bottom). The correlation between the observed AL (top, black line) and the prediction is 87%. The average absolute error is significantly less than the AL standard deviation [37].
phases: storm commencement (counted here as two phases), main phase, and recovery.

A first approach is to consider the Dst index as a linear response of the magnetosphere to a solar wind disturbance. One can fit moving-average (MA) filters (also known as finite impulse response or "linear prediction" filters) to time series data of VB_s (input) and Dst (output) to estimate the ring current response H:

Dst(t) = Σ_{τ=0}^{T} H(τ) VB_s(t − τ)    (2.2)

This is the solution of (2.1) when the system starts from quiet conditions (zero state). The impulse response function H represents the effects of many magnetospheric processes (convection, particle injection, etc.) that determine the ring current amplitude once energy is made available to these processes by the solar wind. The length T of the impulse response is typically 8–10 h for the ring current. Numerical methods for inverting the convolution (2.2) are described in references on systems theory [30], numerical analysis [26], and geomagnetic studies [6]. Linear prediction filtering of Dst, originally introduced by [14], was extended by [32] to double-input, single-output filtering.

Linear prediction filtering methods have been applied to several space environments in addition to ring current modeling. Also relevant to inner-magnetospheric space weather, but at much higher energies than the ring current, [33] related the magnetic field at geosynchronous orbit to the next-day fluxes of relativistic electrons. In the case of high-latitude current systems, [3] studied the AL index, which measures the maximum geomagnetic disturbance associated with the dynamic westward electrojet current [16]. Bargatze et al. [3] obtained the impulse response function for several levels of activity and found it to be a bimodal low-pass filter. The first mode, or peak, represents the so-called "directly driven" response with a 20-min delay. For a range of activity levels, a second mode appears at a 1-h delay as a result of the "loading–unloading" (substorm) response. The study showed that the response of the electrojet to the interplanetary input depends on the activity level and must be modeled by a nonlinear model. Nonlinear moving-average models are conceptually similar to nonrecurrent neural networks (see Section IV).

The MA filters do not represent any dynamic response to the initial state of the system. This response is modeled by autoregressive moving-average (ARMA) filters. Such a filter for the case of Dst has the form

Dst(t + Δt) = Σ_{i=0}^{l} a_i Dst(t − iΔt) + Σ_{j=0}^{m} b_j VB_s(t − jΔt)    (2.3)
which, for a single autoregressive and a single moving-average term, is a discretized version of (2.1). The coefficients of (2.3) can be obtained directly from the time series data [4] and used for prediction as a linear ARMA model [23]. The impulse response of (2.1) can be calculated analytically [17]. A pole analysis gives the characteristic frequencies and growth/decay times corresponding to the large-scale dynamics of the ring current. ARMA filters in general have much higher forecast accuracies than MA filters.

By letting the coefficients of (2.3) vary with activity, one can construct a nonlinear ARMA model. The advantage of this is that one can associate a specific physical process with an activity level or a state space region (Fig. 2) and obtain the impulse responses locally. Thus, in order to identify characteristics of the individual processes contributing to the ring current, it is useful to examine the impulse response as a function of the phase space region [Fig. 2(c)]. In most storm phases, especially during the long intervals of the recovery phase, the response decays exponentially due to the ring current loss processes. During storm commencement, however, there is an additional fast oscillation as a response to interplanetary pressure pulses, and in the early main phase a slower oscillation represents localized increases of the ring current (particle injections into the current) [38]. The geoeffectiveness of the solar wind input also varies with the state space region. In the limiting case of letting the ARMA coefficients adapt to the state and input at every time step, one can construct Kalman-like filters. As the waveform and activity level of the response vary at every time step, the local flow in the state space is modeled by a continually changing set of data points ("nearest neighbor" points in Fig. 1). In the case of AL and AU, nonlinear ARMA filters have a further increase in prediction accuracy over linear ones [37], while for Dst the increase is marginal [34], [38].

It should be noted that there are physics-based models whose equations are derived from first principles, such as space-averaged MHD equations with a thermodynamic closure [13]. Their coefficients have also been estimated by fitting the output of the models to geomagnetic index time series. Typically, their prediction efficiency is lower than that of the data-based models, probably because the prescribed structure of the equations is a limiting factor. On the other hand, these models may give further insight into the physical causes of the observed variability.
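To make the connection between (2.1) and (2.3) concrete, the following minimal Python sketch (written for this review's illustration only, not part of the cited studies) integrates the first-order equation with assumed values of Q and τ and a synthetic VB_s driver, and then recovers the ARMA coefficients by least squares, as one would when estimating the filter from data.

```python
# Minimal, illustrative sketch (not the author's code): discretize the first-order
# ring-current equation (2.1) into the ARMA form (2.3) and recover its coefficients
# from synthetic data by ordinary least squares.  Q, tau, and the driver are assumed
# values chosen only for demonstration.
import numpy as np

rng = np.random.default_rng(0)
dt  = 1.0          # time step [h], matching the 1-h resolution of Dst
tau = 7.0          # ring-current decay time [h]
Q   = -4.0         # effective solar wind coupling, assumed value
N   = 2000

# Synthetic solar wind electric field input VB_s (smoothed positive bursts)
vbs = np.convolve(rng.exponential(1.0, N), np.ones(6) / 6, mode="same")

# Integrate dDst/dt = Q*VBs - Dst/tau with a simple Euler scheme
dst = np.zeros(N)
for t in range(N - 1):
    dst[t + 1] = dst[t] + dt * (Q * vbs[t] - dst[t] / tau)

# Fit the linear ARMA model (2.3): Dst(t+dt) = a1*Dst(t) + b0*VBs(t)
X = np.column_stack([dst[:-1], vbs[:-1]])
a1, b0 = np.linalg.lstsq(X, dst[1:], rcond=None)[0]

print(f"recovered tau ~ {dt / (1 - a1):.2f} h (true {tau})")
print(f"recovered Q   ~ {b0 / dt:.2f}      (true {Q})")
```

On noise-free synthetic data the least-squares fit recovers the assumed decay time and coupling; with real Dst and solar wind data the same procedure yields the empirical coefficients discussed above.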
III. LOCALIZING THE DISTURBANCE: SPATIOTEMPORAL MODELS

While activity indices represent the averaged or peak amplitude of a space weather disturbance, in many applications it is also desirable to measure the disturbed region's position and extent. Below we discuss two types of space weather disturbances that have been modeled. Mid-latitude geomagnetic modeling is usually one-dimensional in configuration (physical) space, while high-latitude geomagnetic and ionospheric modeling is at least two-dimensional. Radiation belt modeling, with a three-dimensional configuration space and many space weather applications, has not been addressed with dynamical methods yet.

A. Mid-Latitude Geomagnetic Modeling
As a first example, we will again use mid-latitude geomagnetic disturbances, which are related to magnetic storms and the ring current. This is a one-dimensional problem in configuration space because the ring current is generally much more asymmetric in the longitudinal direction than in the latitudinal one (typically the timescales of latitudinal motion of ring current particles are much faster than the longitudinal ones and can be neglected). The disturbance is measured by a small number of magnetometers distributed in longitude (typically 4, but in case studies as many as 20). The north–south component of the ground magnetic field at the i-th magnetometer, B_i, corresponds to the westward drift of the ring current ions. The longitudinally averaged disturbance is the signature of the symmetric ring current, and at a 1-h resolution it is the Dst index discussed above. For the corrected field, we assume that the ring current is driven by the magnetospheric electric field, which we set proportional to the external, solar wind electric field. Then we can write an ARMA model for the field B_i measured at each magnetometer [38], [39]

B_i(t + Δt) = Σ_{k=0}^{l} a_k^(i) B_i(t − kΔt) + Σ_{k=0}^{m} b_k^(i) VB_s(t − kΔt)    (2.4)

where the state vector has a spatial (i) and a temporal (t) component. For space weather forecasting it is necessary to keep at least zeroth-order lags in B_i, which gives a decaying or growing exponential for the disturbance. First-order lags represent oscillations of the ring current (particle injections). As in Section II, the coefficients are determined by a comparison of the present state and previous input to states represented in the database, and a representation of the local flow field (Fig. 1) is constructed. The average over all station predictions gives a prediction which is more accurate in predicting the observed disturbance than that of the scalar models of Section II [35], [36]. The improvement comes about because the spatiotemporal state vector specifies the geomagnetic conditions more precisely. If the ring current varies slowly in space compared to the average spacing between the stations, we can interpolate the predicted activity to obtain a continuous longitudinal profile. In this way, one can measure the asymmetric component of the current and determine the amplitude and location of the relative effects of slow and bursty enhancements of the ring current (convection versus substorm injections).

B. High-Latitude Geomagnetic Modeling

While mid-latitude disturbances are useful measures of the global energy dissipation during large storms lasting up to several days, the electrodynamic system of the high-latitude ionosphere can undergo violent changes during substorms, which are typically over after 3–4 h. Meridional magnetometer arrays are useful in modeling the two-dimensional disturbance; coverage in the longitudinal direction is made possible by the magnetometer rotation with the Earth under the ionospheric current system. Magnetometer data from a historical database are typically binned in longitude in 1-h-wide local time bins. For each
Fig. 4. Multi-output ARMA model prediction of mid-latitude (ring current-related) geomagnetic disturbances. (a) Interplanetary electric field and pressure. (b) Predicted and observed magnetic disturbance at four magnetometer stations (nonlinear prediction: red line; linear: blue). (c) The average of the magnetometer predictions is a prediction for the Dst index (after [35], [36]).
bin, we can write spatial ARMA models for magnetometers at all latitudes
B_i(t + Δt) = Σ_k a_k B_i(t − kΔt) + Σ_{j≠i} c_j B_j(t) + Σ_k b_k I(t − kΔt)    (2.5)

The first sum represents the growth/decay of the geomagnetic disturbance (due to a local enhancement of the electrojet and field-aligned currents) and its coefficients determine the relevant time scales; the second sum represents the expansion or other latitudinal motion of the electrojet; while the third sum contains the solar wind-magnetosphere coupling terms, again with a propagation delay. The input I in (2.5) is either the solar wind electric field, which is used for forecasting with a lead time of 1 h, or the polar cap (PC) index, whose lead time is short (10 min) compared to the timescales of interest, so that in practice it is useful in nowcasting. If the autoregressive terms on the right-hand side of (2.5) are omitted, the model becomes a regression of the magnetometer response against the solar wind driver. For a linear regression, we obtain the geomagnetic part of the ionospheric IZMEM model [24], whereas if the regression is nonlinear and we include the autoregressive terms we get the model of [36]. After the geomagnetic activity is predicted at the magnetometer positions, it can be interpolated to a regular grid. The ionospheric current distribution can be obtained by using Ampère's and Ohm's laws with a model of the ionospheric Hall and Pedersen conductivities [29]. Such geomagnetic dynamical models have been placed on-line and driven with the solar wind electric field measured in near-real time by the ACE monitoring spacecraft; they give forecasts of up to 1 1/2 h in advance based on the propagation delay of the solar wind from the measurement point to earth. The output of such a nonlinear model (Fig. 5) is the geomagnetic disturbance due to the currents of the ionospheric circuit. The size and position of the disturbed regions [Fig. 5(d) and (e)] can be measured, together with the amplitude of the disturbance [indices, Fig. 5(c)]. Finally, the geomagnetic pattern can be separated into mutually orthogonal (uncorrelated) modes by means of principal component analysis (see singular value decomposition in [26])

B(x, t) = Σ_k A_k(t) P_k(x)    (2.6)

The modes, denoted by P_k in (2.6), correspond to physically different current systems driven by large-scale processes such as convection and substorms [31]. The decomposition gives the approximate position of the three major electric current modes (Fig. 6).

There are many unresolved issues of magnetospheric/ionospheric response under various solar wind regimes and transitions from one dynamical state to another. At the next stage of the modeling sequence, the predictions of the magnetic field and/or the ionospheric electric field are coupled to geophysical models to obtain the solid Earth response and the geomagnetically induced currents in the earth [40]. The time derivative of the net magnetic field determines, through Faraday's law, the induced electric field which affects various elements of the power grid system [Kappenman, this volume].
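As a concrete illustration of the decomposition (2.6), the following Python sketch (with synthetic, assumed spatial patterns and time variations rather than magnetometer data) separates a spatiotemporal field into orthogonal modes with the singular value decomposition; applied to gridded geomagnetic data, the leading modes play the role of the P_k above.

```python
# Illustrative sketch only (not the author's code): separate a synthetic
# spatiotemporal disturbance B(x, t) into orthogonal modes as in (2.6)
# using the singular value decomposition.  The two "current systems" and
# their time variations are assumed test signals.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 24.0, 48)      # e.g., local-time coordinate [h]
t = np.arange(500)                  # time steps

# Two assumed spatial patterns (e.g., westward/eastward electrojet proxies)
P1 = np.exp(-((x - 4.0) / 3.0) ** 2)
P2 = -np.exp(-((x - 18.0) / 3.0) ** 2)

# Assumed time variations plus noise
A1 = 100.0 * np.abs(np.sin(2 * np.pi * t / 97.0))
A2 = 40.0 * rng.standard_normal(t.size).cumsum() / 10.0

B = np.outer(A1, P1) + np.outer(A2, P2) + rng.normal(0.0, 1.0, (t.size, x.size))

# Principal components via SVD: B = U S Vt; the modes P_k are rows of Vt and
# the amplitudes A_k(t) are the corresponding columns of U scaled by S.
U, S, Vt = np.linalg.svd(B, full_matrices=False)
variance_captured = S[:3] ** 2 / np.sum(S ** 2)
print("fraction of variance in first three modes:", np.round(variance_captured, 3))
```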
IV. NEURAL NETWORKS

Feed-forward neural networks are state space models with a distinct methodology [12]: they operate on a state vector, some part of which is the observable output of the system; in that sense, they are similar to nonlinear dynamical systems or prediction filters. Instead of one set, or "layer," of coefficients, they may have an arbitrary number of "hidden" layers, the lowest of which operates on the state vector data, and each intermediate one operates on the immediately lower one. The connecting functions are arbitrary nonlinear functions; typical choices are sigmoid functions (tanh) or radial basis functions. Given a set of pairs of input-output patterns [or, in the case of prediction, a set of pairs of successive states], the network coefficients, or "weights," adapt so as to map the one set of patterns to the other. This period of adaptation is called training, or learning. Networks are classified according to the type of learning [12]. Two hidden layers are sufficient to reproduce any smooth nonlinear function; the accuracy depends on the quality of the training data set and the number of weights in each layer. After the training period, the weights are fixed and the neural net model undergoes validation and testing.

One of the first applications of neural networks was a space weather study of the inner magnetosphere environment: [19] related the flux of relativistic electrons at geosynchronous orbit to the overall magnetospheric activity expressed as a daily-summed geomagnetic index. The study used linear prediction filters as a reference model, against which the networks gave significantly better results. In the development of the Magnetospheric Specification Model, [9] used neural networks to predict geomagnetic indices as well as the location of the midnight equatorward boundary of the polar cap, an indicator of the energy stored in the magnetosphere. Gleisner et al. [10] showed that many features of several intervals of the high-latitude electrojet index can be predicted to 76% of the training set's variance. Nonlinear-function networks were more accurate, in terms of the rms error, by 7% than those with linear functions, which formed the reference models in the network evaluation phase. Recently, neural networks have been applied to predictions of the foF2 ionospheric index [8], [41].

While feed-forward neural networks approximate a response to an input in a way similar to nonlinear prediction filters, recurrent neural networks iterate both the output and the input in a manner analogous to autoregressive moving-average filters. Such a type of model (an Elman recurrent network) was used to predict the Dst index from several interplanetary parameters [44]. This and similar studies showed that the best solar wind precursor functions are not the physics-based electric field or power inputs, but rather combinations of the solar wind key parameters. This is important for the optimal choice of precursors
Fig. 5. Output of a nonlinear dynamical spatiotemporal system for the high-latitude geomagnetic disturbances. (a) Latitude and local-time coordinates (noon is at the top) of the north–south component of the magnetic field. (b) The input to the model is the polar cap (PC) geomagnetic index. (c) The AL/AU indices from a reference model (red line) and from two magnetometer arrays of this spatiotemporal model (white and yellow lines). (d) The geographic latitude of the minimum (negative) disturbance (dashes) and the maximum (positive) disturbance (crosses). (e) The local time of the minimum and maximum disturbances.
for forecasting. Physical information may be obtained from the neural networks by constructing Volterra kernels (higher order response functions) from them [43].
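To fix ideas, the sketch below (Python, written for this review's illustration only; the cited studies used their own network implementations) trains a one-hidden-layer feed-forward network by gradient descent to map a delay-embedded synthetic driver to the next value of a synthetic activity index. The data, network size, and learning settings are all assumed.

```python
# Minimal sketch (not from the original work): a small feed-forward network with one
# tanh hidden layer, trained by batch gradient descent to map a short input history
# (a delay-embedded driver, standing in for solar wind data) to the next value of a
# synthetic activity index.
import numpy as np

rng = np.random.default_rng(2)
N, lags, hidden = 3000, 6, 8

# Synthetic driver and index: the "index" decays and responds to the driver (cf. Sec. II)
u = np.convolve(rng.exponential(1.0, N), np.ones(5) / 5, mode="same")
y = np.zeros(N)
for t in range(N - 1):
    y[t + 1] = 0.85 * y[t] - 0.5 * u[t]

# Delay-embedded inputs X (driver history) and targets T (next index value), normalized
X = np.column_stack([u[lags - 1 - k : N - 1 - k] for k in range(lags)])
T = y[lags:]
X = (X - X.mean(0)) / X.std(0)
T = (T - T.mean()) / T.std()

W1 = 0.1 * rng.standard_normal((lags, hidden))   # input -> hidden weights
W2 = 0.1 * rng.standard_normal(hidden)           # hidden -> output weights
lr = 0.01

for epoch in range(200):                         # simple batch gradient descent
    H = np.tanh(X @ W1)                          # hidden-layer activations
    err = H @ W2 - T                             # prediction error
    grad_W2 = H.T @ err / len(T)
    grad_W1 = X.T @ ((err[:, None] * W2) * (1 - H ** 2)) / len(T)
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

pred = np.tanh(X @ W1) @ W2
print("correlation(prediction, target) =", round(np.corrcoef(pred, T)[0, 1], 3))
```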
V. FORECAST METRICS

The accuracy of a forecast is measured by a statistical quantity, or "metric." Although several metrics are widely
Fig. 6. Separation of the geomagnetic pattern of the previous figure into principal components, each one related to a current system. (a) The original geomagnetic pattern. (b) The absolute value of the geomagnetic disturbance. (c) The first mode corresponds to the westward electrojet. (d) The second mode corresponds to the eastward electrojet. (e) The third mode is strongly localized and its peak indicates the substorm current wedge. (f) The input to the model is the PC index.
known, they are mentioned here in brief for completeness and in an effort to stress the importance of robust statistics in
error analysis. Several approaches to the evaluation of space and tropospheric weather models are described in [7]. Most authors
are familiar with the root-mean-square (rms) error to measure forecast accuracy

e² = (1/(N σ²)) Σ_{i=1}^{N} (x̂_i − x_i)²    (2.7)

with x̂_i the forecast, x_i the observation, and N the number of prediction-observation data pairs; the normalization constant σ is the standard deviation of the training set data. This normalized error is also called the average relative variance. Accurate predictions are characterized by e² ≪ 1.

The rms error is an optimal statistic when the distribution of prediction errors is Gaussian; this can be shown by the derivation of the least-squares fit. In many realistic cases, however, the distributions of forecasts and/or observations, such as particle fluxes, have finite higher-order statistical moments (for example, they may be skewed or contain long tails). In the more general case, therefore, it is preferable to use the mean absolute error [26]

e_abs = (1/N) Σ_{i=1}^{N} |x̂_i − x_i|    (2.8)

which is more robust, or less sensitive to outliers. More about the significance and different classes of robust statistics can be found in [26]. Both (2.7) and (2.8) require smaller numbers of measurements than the correlation coefficient or the correlation function to be statistically significant. Therefore, the use of error estimates generally provides more detailed information about forecast accuracy than the correlation statistics. In a model validation study, error estimates can be parameterized by prediction time, activity level of the test interval, or physical parameters.

While the previous metrics are suitable for forecasts of events or time series, they can be extended to measure forecast accuracy for fields. Such a metric is the anomaly correlation

AC = [Σ_i (x̂(r_i, t) − ⟨x̂⟩)(x(r_i, t) − ⟨x⟩)] / (N σ_x̂ σ_x)    (2.9)

Here x(r_i, t) is the observation at point r_i at time t, and ⟨x⟩ is an average over the spatial coordinates. Similarly, x̂ and ⟨x̂⟩ stand for the model and its mean, respectively. The correlation is normalized by the standard deviations σ_x and σ_x̂ of the observations and predictions.

All of the above metrics measure the accuracy of a forecast in an absolute way. To measure the value of a forecast, the first step is to compare it against a standard, which is usually a reference model or the space environment's climatology. A standard metric for the relative accuracy of a forecast is the skill score

SS = 1 − e²_model / e²_ref    (2.10)

where "ref" stands for a reference model (e.g., climatology) [7].

Fig. 7. Prediction of solar wind speed variations based on Wilcox Solar Observatory magnetograms using a radial basis neural network (after [42]).
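The following short Python sketch (illustration only; the series, the persistence reference, and σ are assumed test data) shows how the normalized error (2.7), the mean absolute error (2.8), and the skill score (2.10) would be computed for a forecast against observations.

```python
# Illustrative sketch (not from the original work): compute the normalized error (2.7),
# the mean absolute error (2.8), and the skill score (2.10) relative to a persistence
# reference model.  The observation and forecast series are assumed test data.
import numpy as np

rng = np.random.default_rng(3)
obs = np.cumsum(rng.normal(0, 1, 500))          # assumed observed index
fcst = obs + rng.normal(0, 0.5, 500)            # assumed model forecast
ref = np.roll(obs, 1); ref[0] = obs[0]          # persistence reference (previous value)

sigma = obs.std()                               # stands in for the training-set spread

def normalized_error(prediction, observation, sigma):
    """Average relative variance, eq. (2.7)."""
    return np.mean((prediction - observation) ** 2) / sigma ** 2

def mean_absolute_error(prediction, observation):
    """Robust error measure, eq. (2.8)."""
    return np.mean(np.abs(prediction - observation))

arv_model = normalized_error(fcst, obs, sigma)
arv_ref = normalized_error(ref, obs, sigma)
skill = 1.0 - arv_model / arv_ref               # skill score, eq. (2.10)

print(f"ARV(model) = {arv_model:.3f}, MAE = {mean_absolute_error(fcst, obs):.3f}, "
      f"skill vs. persistence = {skill:.3f}")
```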
VI. THE ROAD AHEAD: HARNESSING THE FULL POWER OF THE SYSTEMS APPROACH

In this paper, we described a framework for solving space weather modeling and prediction problems from the perspective of systems analysis. We described ways to classify observational data in a state space and use them to model the time evolution of a space disturbance. The derived model can be used for prediction and real-time forecasting based on a solar or interplanetary input. Such models have been developed for geomagnetic, ionospheric, and other indices. A more complex spatiotemporal modeling is necessary for "localizing" the disturbance and measuring its extent in physical space. The concepts are very similar to the scalar case, however. We gave examples in 1-D and 2-D geometries. We did not discuss the concept of stability (or sensitivity) of the model, but gave references about it (Section I-A).

In the future, the increase in the observational data volume provided by the regional and international space programs presents a challenge to modelers. The large databases that will be made available will have to be sifted automatically, initially with neural networks and other recent methods of data mining. However, the complexity of the environments is such that, in a second stage, analytical methods such as nonlinear dynamics and filters will be necessary. As can be seen from the applications cited above, there is a limit to the information and quality of forecasts that can be reached with empirical models. A third stage, then, involves the integration of empirical and physical models. An early example of that is SWIFT (Space Weather Ionospheric Forecasting Technologies) [20], a high-latitude ionospheric electrodynamic model which aims to combine several models of parameters of that space environment. Most of the models within SWIFT are empirical, but as future observations allow it, they may eventually be phased out. Other hybrid models are currently being developed by various groups for the inner magnetosphere and ring current regions.

Finally, a significant enhancement of the power of the systems approach will be realized in the stage of data assimilation, in which data streams (in some cases archival, but usually near-real-time) are used to drive models. Issues of stability and uniqueness of solutions arise there. Certainly the results obtained by various groups in driving models with solar wind time series are very encouraging. An already existing assimilative method is the Assimilative Mapping of Ionospheric Electrodynamics (AMIE) procedure [29]. AMIE was originally derived as a regression model, but there are efforts currently to transform it into a dynamic, nonlinear model.
This and other empirical methods will be used very soon to provide archival and real-time outputs into the numerical framework of a physical model. Data assimilation within hybrid models has been crucial for model development and forecasting in meteorology and oceanography. Most probably the same approach will benefit space weather studies.

ACKNOWLEDGMENT

The author wishes to thank D. Baker, A. Klimas, H. Lundstedt, R. McPherron, and S. Sharma for discussions on this subject.

REFERENCES

[1] H. D. I. Abarbanel, R. Brown, J. J. Sidorowich, and L. S. Tsimring, "The analysis of observed chaotic data in physical systems," Rev. Mod. Phys., vol. 65, no. 4, pp. 1331–1392, 1993.
[2] D. N. Baker, T. I. Pulkkinen, X. Li, S. G. Kanekal, K. W. Ogilvie, R. P. Lepping, J. B. Blake, L. B. Callis, G. Rostoker, H. J. Singer, and G. D. Reeves, "A strong CME-related magnetic cloud interaction with the Earth's magnetosphere: ISTP observations of rapid relativistic electron acceleration on May 15, 1997," Geophys. Res. Lett., vol. 25, no. 15, pp. 2975–2978, 1998.
[3] L. F. Bargatze, D. N. Baker, R. L. McPherron, and E. W. Hones, Jr., "Magnetospheric impulse response for many levels of geomagnetic activity," J. Geophys. Res., vol. 90, no. A7, pp. 6387–6394, 1985.
[4] R. K. Burton, R. L. McPherron, and C. T. Russell, "An empirical relationship between interplanetary conditions and Dst," J. Geophys. Res., vol. 80, no. 31, pp. 4204–4214, 1975.
[5] M. Casdagli and S. Eubank, Nonlinear Modeling and Forecasting. Reading, MA: Addison-Wesley, 1992.
[6] C. R. Clauer, "The technique of linear prediction filters applied to studies of solar wind-magnetosphere coupling," in Solar Wind-Magnetosphere Coupling, Y. Kamide and J. A. Slavin, Eds. Tokyo: Terra Scientific (Astrophysics and Space Science Library 126), 1986, pp. 39–57.
[7] K. A. Doggett, The Evaluation of Space Weather Forecasts, Proc. Workshop, Boulder, CO, June 19–21, 1996, NOAA/SEC.
[8] N. M. Francis, P. S. Cannon, A. G. Brown, and D. S. Broomhead, "Nonlinear prediction of the ionospheric parameter foF2 on hourly, daily, and monthly timescales," J. Geophys. Res., vol. 105, no. A6, pp. 12 839–12 849, 2000.
[9] J. W. Freeman, A. Nagai, P. Reiff, W. Denig, S. Gussenhoven-Shea, M. Heinemann, F. Rich, and M. Hairston, "The use of neural networks to predict magnetospheric parameters for input to a magnetospheric forecast model," in Artificial Intelligence Applications in Solar Terrestrial Physics, J. Joselyn, H. Lundstedt, and J. Trollinger, Eds. Boulder, CO: NOAA, 1994.
[10] H. Gleisner and H. Lundstedt, "The response of the auroral electrojets to the solar wind modeled with neural networks," J. Geophys. Res., vol. 102, no. 7, pp. 14 269–14 278, 1997.
[11] A. Hasegawa, "Self-organization processes in continuous media," Adv. Phys., vol. 34, no. 1, pp. 1–42, 1985.
[12] J. Hertz, A. Krogh, and R. G. Palmer, Introduction to the Theory of Neural Computation. Reading, MA: Addison-Wesley, 1991.
[13] W. Horton and I. Doxas, "A low-dimensional dynamical model for the solar wind-driven geotail-ionosphere system," J. Geophys. Res., vol. 103, pp. 4561–4572, 1998.
[14] T. Iyemori, H. Maeda, and T. Kamei, "Impulse response of geomagnetic indices to interplanetary magnetic field," J. Geomagn. Geoelectr., vol. 31, pp. 1–9, 1979.
[15] S. G. Kanekal, D. N. Baker, J. B. Blake, B. Klecker, R. A. Mewaldt, and G. M. Mason, "Magnetospheric response to magnetic cloud (coronal mass ejection) events: Relativistic electron observations from SAMPEX and Polar," J. Geophys. Res., vol. 104, no. A11, pp. 24 885–24 894, 1999.
[16] M. Kivelson and C. T. Russell, Introduction to Space Physics. Cambridge, U.K.: Cambridge Univ. Press, 1995.
[17] A. J. Klimas, D. Vassiliadis, D. N. Baker, and D. A. Roberts, "The organized nonlinear dynamics of the magnetosphere," J. Geophys. Res., vol. 101, no. A6, pp. 13 089–13 113, 1996.
[18] A. J. Klimas, D. Vassiliadis, and D. N. Baker, "Dst index prediction using data-derived analogues of the magnetospheric dynamics," J. Geophys. Res., vol. 103, pp. 20 435–20 447, 1998.
[19] H. C. Koons and D. J. Gorney, "A neural network model of the relativistic electron flux at geosynchronous orbit," J. Geophys. Res., vol. 96, pp. 5549–5556, 1991.
[20] N. C. Maynard, D. Vassiliadis, J. W. Freeman, D. N. Baker, and G. L. Siscoe, "Prediction of magnetic activity induced hazards to ground and space assets," NSF SBIR proposal Phase I final report, DMI-9461857, 1995.
[21] "The national space weather program: The strategic plan," Office of the Federal Coordinator for Meteorological Services and Supporting Research, Washington, DC, FCM-P30-1995, 1995.
[22] "The national space weather program: The implementation plan," Office of the Federal Coordinator for Meteorological Services and Supporting Research, Washington, DC, FCM-P31-1997, 1997.
[23] T. P. O'Brien and R. L. McPherron, "An empirical phase space analysis of ring current dynamics: Solar wind control of injection and decay," J. Geophys. Res., vol. 105, no. A4, pp. 7707–7719, 2000.
[24] V. Papitashvili, B. A. Belov, D. S. Faermark, Y. I. Feldstein, S. A. Golyshev, L. I. Gromova, and A. E. Levitin, "Electric potential patterns in the northern and southern polar regions parameterized by the interplanetary magnetic field," J. Geophys. Res., vol. 99, no. A7, pp. 13 251–13 262, 1994.
[25] G. P. Pavlos, M. A. Athanasiu, D. Kugiumtzis, N. Hatzigeorgiou, A. G. Rigas, and E. T. Sarris, "Nonlinear analysis of magnetospheric data—Part I: Geometric characteristics of the AE index time series and comparison with nonlinear surrogate data," Nonlin. Proc. Geophys., vol. 6, no. 1, pp. 51–65, 1999.
[26] W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes: The Art of Scientific Computing. Cambridge, U.K.: Cambridge Univ. Press, 1993.
[27] T. I. Pulkkinen, D. N. Baker, L. A. Frank, J. B. Sigwarth, H. J. Opgenoorth, R. Greenwald, E. Friis-Christensen, T. Mukai, R. Nakamura, H. Singer, G. D. Reeves, and M. Lester, "Two substorm intensifications compared: Onset, expansion, and global consequences," J. Geophys. Res., vol. 103, no. A1, pp. 15–27, 1998.
[28] F. J. Rich and N. C. Maynard, "Consequences of using simple analytical functions for the high-latitude convection electric field," J. Geophys. Res., vol. 94, no. 4, pp. 3687–3701, 1989.
[29] A. D. Richmond, G. Lu, B. A. Emery, and D. J. Knipp, "The AMIE procedure: Prospects for space weather specification and prediction," Adv. Space Res., vol. 22, no. 1, pp. 103–112, 1998.
[30] W. J. Rugh, Linear Systems Theory. Englewood Cliffs, NJ: Prentice-Hall, 1993.
[31] W. Sun, W.-Y. Xu, and S.-I. Akasofu, "Mathematical separation of directly driven and unloading components in the ionospheric equivalent currents during substorms," J. Geophys. Res., vol. 103, no. A6, pp. 11 695–11 700, 1998.
[32] K. J. Trattner and H. O. Rucker, "Linear prediction theory in studies of solar wind-magnetosphere coupling," Ann. Geophys., vol. 8, no. 11, pp. 733–738, 1990.
[33] A. Tsutai, C. Mitsui, and T. Nagai, "Prediction of a geosynchronous electron environment with in situ magnetic field measurements," Earth Plan. Space, vol. 51, no. 3, pp. 219–223, 1999.
[34] J. A. Valdivia, A. S. Sharma, and K. Papadopoulos, "Prediction of magnetic storms by nonlinear models," Geophys. Res. Lett., vol. 23, no. 21, pp. 2899–2902, 1996.
[35] J. A. Valdivia, D. Vassiliadis, A. J. Klimas, and A. S. Sharma, "The spatio-temporal activity of magnetic storms," J. Geophys. Res., vol. 104, no. A6, pp. 12 239–12 250, 1999.
[36] J. A. Valdivia, D. Vassiliadis, A. J. Klimas, and A. S. Sharma, "Modeling the spatial structure of the high-latitude magnetic perturbations and the related current systems," Phys. Plasmas, vol. 6, no. 11, pp. 4185–4194, 1999.
[37] D. Vassiliadis, A. J. Klimas, D. N. Baker, and D. A. Roberts, "A description of solar wind-magnetosphere coupling based on nonlinear filters," J. Geophys. Res., vol. 100, no. A3, pp. 3495–3512, 1995.
[38] D. Vassiliadis, A. J. Klimas, J. A. Valdivia, and D. N. Baker, "The Dst geomagnetic response as a function of storm phase and amplitude and the solar wind electric field," J. Geophys. Res., vol. 104, no. A11, pp. 24 957–24 976, 1999.
[39] D. Vassiliadis, D. N. Baker, H. Lundstedt, and R. C. Davidson, "Special topic: Nonlinear methods in space plasma physics: Papers presented at the 1998 fall meeting of the American Geophysical Union—Foreword," Phys. Plasmas, vol. 6, no. 11, pp. iii–iv, 1999.
[40] A. Viljanen, O. Amm, and R. Pirjola, "Modeling geomagnetically induced currents during different ionospheric conditions," J. Geophys. Res., vol. 104, no. A12, pp. 28 059–28 071, 1999.
[41] P. Wintoft and L. R. Cander, "Twenty-four hour predictions of foF2 using time delay neural networks," Radio Sci., vol. 35, no. 2, pp. 395–408, 2000.
[42] P. Wintoft and H. Lundstedt, "A neural network study of the mapping from solar magnetic fields to the daily average solar wind velocity," J. Geophys. Res., vol. 104, no. A4, pp. 6729–6736, 1999.
[43] J. Wray and G. G. R. Green, "Calculation of the Volterra kernels of nonlinear dynamic systems using an artificial neural network," Biological Cybernetics, vol. 71, no. 3, pp. 187–195, 1994.
[44] J.-G. Wu and H. Lundstedt, "Geomagnetic storm predictions from solar wind data with the use of dynamic neural networks," J. Geophys. Res., vol. 102, no. A7, pp. 14 255–14 268, 1997.
Dimitrios Vassiliadis graduated from the University of Thessaloniki, Greece and received the Master’s (1989) and Ph.D. (1992) degrees in physics from the University of Maryland, College Park. He has since worked at NASA’s Goddard Space Flight Center in Greenbelt, MD, originally as a National Research Council Associate and then with Universities Space Research Association. He is active in space physics and in particular modeling and forecasting of the near-Earth space environment (solar wind-magnetosphere coupling, high-latitude current systems, ring current energization). Additionally he has worked on nonlinear dynamics, control theory, and statistical physics.