Eigenskies: A Method of Visualizing Weather Prediction Data

Björn Olsson, Anders Ynnerman and Reiner Lenz
ITN, Linköping University, 601 74 Norrköping, Sweden.

ABSTRACT
Visualizing a weather prediction data set by actually synthesizing an image of the sky is a difficult problem. In this paper we present a method for synthesizing realistic sky images from weather prediction and climate prediction data. Images of the sky are combined with a number of weather parameters (such as pressure and temperature) to train an artificial neural network (ANN) to predict the appearance of the sky from a given set of weather parameters. Hourly measurements from a period of eight months are used. Principal component analysis (PCA) is used to decompose images of the sky into their eigencomponents, the eigenskies. In this way the image information is compressed into a small number of coefficients while the main information in the image is preserved, although the fine details of the cloud cover cannot be synthesized with this method. The PCA coefficients, together with the weather parameters measured at the same time, form the data points used to train the ANN. The results show that the method gives adequate results: although some discrepancies exist, the main appearance is correct, and it is possible to distinguish between different types of weather. A rainy day looks rainy and a sunny day looks sunny.

Keywords: Artificial neural networks, Principal Component Analysis, Weather Prediction Data, Image Synthesis

1. INTRODUCTION

Weather prediction simulations result in massive amounts of data. There is a strong need to analyze these data, and over the past decades a large number of papers have addressed visual methods and systems for weather visualization. Depending on the intended use of the visualization, two main approaches to weather data visualization can be identified. The first is constituted by the more traditional tools aimed at the creation of 2D and 3D maps for weather forecasting. Examples of such tools are Vis5D [1], a standard 3D tool for visualizing meteorological data, and Trivis [2], which is used in the creation of maps for broadcast weather forecasts. The second approach is to generate realistic images of an environment under given weather conditions. The generation of realistic-looking clouds is an example of this approach and has attracted a lot of attention in the past decade. Animated clouds have been the interest of, e.g., Dobashi [3, 4] and others [5-7]. The generation of synthetic skies taking into account the scattering of light in the atmosphere has been developed by Klassen [8]. Recently, methods for synthesizing clouds using real-time volume rendering that take subsurface scattering effects into account have been developed [9].

The approach presented in this paper uses methods inspired by image analysis and synthesis. It is a novel approach based on the analysis of real photographs of skies and the creation of synthetic images from the decomposed real images. Developing this into a tool for meteorologists and other users is an important task. The results from this work can also be of use in other fields, such as the game industry and the creation of virtual worlds. The goal of this work is to develop a general method for viewing weather prediction files by synthesizing a view of the actual scene. So far we have concentrated on the appearance of the sky. Efforts have been made to solve parts of this problem before.
Examples are extreme weather conditions for the movie industry and the generation of fractal clouds for use in weather reports [10, 11], but until now, to our knowledge, no work has been done to synthesize a general view of a scene from a weather prediction file.

Further author information: (Send correspondence to B.O.) B.O.: E-mail: [email protected], Telephone: +46 11 363196

Visualization and Data Analysis 2003, Robert F. Erbacher, Philip C. Chen, Jonathan C. Roberts, Matti T. Gröhn, Katy Börner, Editors, Proceedings of SPIE-IS&T Electronic Imaging, SPIE Vol. 5009 (2003) © 2003 SPIE-IS&T · 0277-786X/03/$15.00

Downloaded from SPIE Digital Library on 30 Sep 2010 to 130.236.136.33. Terms of Use: http://spiedl.org/terms


[Figure 1 diagram: a single-input neuron with input p, weight w, bias b, summer output n, activation function f, and output a = f(wp + b).]

Figure 1. Single-Input Neuron. The result a is computed by applying an activation function f on the sum of the input p multiplied with the weight w and the bias b.

2. METHOD

The main idea behind the chosen method is to construct a model that is trained with examples to synthesize an image under general conditions, instead of explicitly designing the appearance of the sky under different weather descriptions. The method uses principal component analysis (PCA) to compress the data and extract the most important features of the images; it was inspired by the eigenface method [12, 13]. Artificial neural networks (ANNs) are used as a statistical tool for function approximation. In this case we use a combination of PCA and neural networks with supervised training. Another solution would be to send the images directly to an ANN and combine them with the weather parameters using unsupervised training. In that case, at least in theory, the network would find similar eigenskies, but connecting the eigenskies to the weather parameters would be much more difficult, since the number of parameters would be much larger. Alternatives to PCA include canonical correlation analysis (CCA), which computes the components with maximum auto-correlation under the constraint that they are mutually uncorrelated, and independent component analysis (ICA). PCA was our first choice, and it succeeded very well in separating sharp components.

2.1. Principal Component Analysis

The images are decomposed using principal component analysis. The pixels in an image are rearranged into a vector $\bar{x} = [x_1, x_2, \ldots, x_n]$. The transpose is denoted $\bar{x}^T$, and $E[\cdot]$ denotes the expectation value of a stochastic variable. The mean vector is defined as $m_x = E(\bar{x})$, and the covariance matrix as $C_{xx} = E((\bar{x} - m_x)(\bar{x} - m_x)^T)$. We also define the eigenvalue problem $C_{xx} V = V D$, where the columns of $V$ are the eigenvectors and $D$ is a diagonal matrix with the eigenvalues along the main diagonal.

The size of the covariance matrix becomes impractically large for images of the resolution needed in this project. A way of decreasing the matrix size is to use dimensionality reduction [14]. Let $Y$ be a rectangular matrix with each row consisting of a (mean-subtracted) image vector; the height of the matrix equals the number of vectors, $N$. The eigenvalue problem $C_{xx} V = V D$ can then be written

$$C_{xx} = \frac{1}{N^2} Y^T Y \;\Rightarrow\; \frac{1}{N^2} Y^T Y V = V D. \tag{1}$$

By multiplying with $Y$ from the left and changing variable to $W = Y V$, the size of the eigenvalue problem is reduced from the vector length squared to the number of vectors squared:

$$\frac{1}{N^2} Y Y^T Y V = Y V D \;\Rightarrow\; \frac{1}{N^2} Y Y^T W = W D. \tag{2}$$

For a more detailed description of PCA see Jolliffe [15].
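The dimensionality-reduction trick of Eqs. (1) and (2) is straightforward to sketch in code. The following is an illustrative NumPy implementation (the paper's own software was written in MATLAB, so all function names here are hypothetical): the small N-by-N problem is solved and its eigenvectors are mapped back to image space to obtain the eigenskies.

```python
import numpy as np

def eigenskies(images, k=10):
    """Compute the k leading eigenskies from a stack of flattened sky
    images using the small-covariance trick of Eq. (2).

    images: (N, n) array, one flattened image per row; the trick pays
    off when the pixel count n is much larger than N.
    """
    N = images.shape[0]
    mean = images.mean(axis=0)
    Y = images - mean                       # mean-subtracted rows
    # Solve the small N x N problem instead of the n x n covariance.
    small = Y @ Y.T / N**2
    vals, W = np.linalg.eigh(small)         # eigh returns ascending order
    order = np.argsort(vals)[::-1][:k]      # keep the k largest
    vals, W = vals[order], W[:, order]
    # Map back to image space: each column Y.T @ w_i is an eigensky.
    V = Y.T @ W
    V /= np.linalg.norm(V, axis=0)          # normalize each eigensky
    return mean, V, vals

def project(image, mean, V):
    """Compress an image to its PCA coefficients."""
    return V.T @ (image - mean)

def reconstruct(coeffs, mean, V):
    """Resynthesize an image from its coefficients."""
    return mean + V @ coeffs
```

With k equal to the rank of the mean-subtracted data the reconstruction is exact; with k = 10, as in the paper, only the dominant appearance survives and fine cloud detail is lost.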

2.2. Artificial Neural Networks

Artificial neural network (ANN) is a collective name for a number of related, but in many cases very different, methods. Neural networks originated in neuroscience as a method of simulating the neurons of the brain.


This technology has found applications in areas as diverse as pattern recognition and signal processing. For a more thorough description of the subject we refer to Hagan [16] or Haykin [14]. In this paper we use ANN methods for function approximation.

In its simplest form an ANN consists of a single neuron (see Figure 1). The scalar input $p$ is multiplied by the scalar weight $w$, and the bias $b$ is added. The summer output $n$, the net input, is passed to the activation function $f$, so the neuron output is calculated as $a = f(wp + b)$. Both $w$ and $b$ are adjustable parameters of the neuron and are adjusted by some learning rule. The activation function $f$ may be linear or nonlinear. Examples include the hard-limit activation function (also called the step function: $f(x) = 0$ for $x < 0$, $f(x) = 1$ for $x \geq 0$), the linear activation function, and the log-sigmoid activation function, $f(x) = 1/(1 + e^{-x})$.

A single-layer network consists of several neurons, all connected to the same input signals. The mathematical formulation is identical to the single-neuron case, $\bar{a} = f(W\bar{p} + \bar{b})$, but now $W$ is a matrix and $\bar{a}$, $\bar{p}$ and $\bar{b}$ are vectors. Each element of the input vector $\bar{p}$ is connected to each neuron through the weight matrix $W$. A multiple-layer network is built by connecting several single-layer networks. In this paper we use two layers. The two-layer network has the form

$$\bar{a} = f_2(W_2 f_1(W_1 \bar{p} + \bar{b}_1) + \bar{b}_2). \tag{3}$$
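The forward pass of Eq. (3) can be sketched in a few lines of NumPy. This is an illustrative snippet, not the authors' code; the default activations (linear hidden layer, log-sigmoid output) match the configuration reported later in Section 3.2.

```python
import numpy as np

def logsig(x):
    """Log-sigmoid activation f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def linear(x):
    """Linear (identity) activation."""
    return x

def two_layer_forward(p, W1, b1, W2, b2, f1=linear, f2=logsig):
    """Forward pass of Eq. (3): a = f2(W2 f1(W1 p + b1) + b2)."""
    hidden = f1(W1 @ p + b1)
    return f2(W2 @ hidden + b2)
```

The output dimension is fixed by the number of rows of W2; in the paper it would equal the number of PCA coefficients (ten).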

The first layer is called the hidden layer and its units are called the hidden units; the second layer is called the output layer. In the two-layer case the matrices $W_1$ and $W_2$ and the vectors $\bar{b}_1$ and $\bar{b}_2$ are the adjustable parameters. Many different training methods exist, but most are variations of the backpropagation algorithm (see Le Cun [17] or Rumelhart [18]). The algorithm is provided with a set of examples of proper network behaviour,

$$(p_1, t_1), (p_2, t_2), \ldots, (p_x, t_x), \ldots, (p_n, t_n), \tag{4}$$

where $p_x$ is an input to the network and $t_x$ is the corresponding target output. As each input is applied to the network, the network output is compared to the target, and the algorithm adjusts the network parameters in order to minimize the mean square error. In the backpropagation algorithm these sensitivities are propagated backward through the network. The Levenberg-Marquardt algorithm [19] for training neural networks is a variation of Newton's method.

Data from three different sources were used: photographs capturing the same scene every hour; weather prediction data calculated at the same time interval as the captured images; and direct measurements of weather parameters, taken at the same times as the images were captured. The image database and a database of weather parameters were combined to train a mathematical model to synthesize an accurate view of the sky for a general set of weather parameters.

When training the networks (see the lower part of Figure 2), the method consists of the following steps. First we compute the eigenskies using all images in the database; a large number of images under different weather conditions, spanning a long time interval, is needed in order to compute well-defined eigenskies. Every image is then transformed into a PCA coefficient vector by projecting the image onto the most important eigenskies and saving the coefficients. In this way the image data is compressed to a small number of coefficients. Each coefficient vector is combined with a weather parameter vector to form a data pair used to train the model: the first part of a data pair is an example input vector, in this case the actual weather parameters at a certain time; the second part is the expected output for this input, in this case the corresponding coefficient vector. Twenty-four neural networks form the model, one for each hour of the day, but the eigenskies are the same for all the networks.
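The pairing and per-hour grouping of the training data described above can be sketched as follows. This is an assumed data layout for illustration, not the paper's actual code: one weather parameter vector and one PCA coefficient vector per captured image, tagged with its hour of day.

```python
import numpy as np

def build_training_sets(weather, coeffs, hours):
    """Group (weather-parameter, PCA-coefficient) data pairs by hour of
    day, producing one training set per network.

    weather: (N, d) weather parameter vectors (network inputs)
    coeffs:  (N, k) PCA coefficient vectors of the matching images
             (network targets)
    hours:   (N,) integer hour of day (0-23) of each sample
    """
    sets = {}
    for h in range(24):
        mask = hours == h
        sets[h] = (weather[mask], coeffs[mask])
    return sets
```

Each of the twenty-four networks is then trained only on its own hour's pairs, which keeps the individual regression problems small.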
The database is divided into twenty-four parts, and each neural network is trained with data from one specified hour of the day. All networks are optimized using the available data, which is divided into two parts: most of it is used as a training set to train the neural networks, while a number of data points are held out as a test set to verify the output.

When the networks are used to synthesize images (see the upper part of Figure 2), the process begins by sending a weather parameter vector to the model; either predicted or measured data can be used. One of the neural networks, selected according to the time of day the data point belongs to, is used to predict the PCA coefficient vector. This vector is then combined with the eigenskies to compute the resulting image, which represents a view of the weather at this specific data point.
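The synthesis path (select a network by hour, predict coefficients, reconstruct an image from the eigenskies) reduces to one small function. The sketch below assumes a hypothetical interface in which each trained network is a callable mapping a parameter vector to a coefficient vector; the names are illustrative, not from the paper.

```python
import numpy as np

def synthesize_sky(params, hour, networks, mean, V):
    """Synthesize a sky image from a weather parameter vector.

    params:   weather parameter vector for the desired time
    hour:     hour of day (0-23), used to select the network
    networks: dict mapping hour of day to a trained callable that
              predicts a PCA coefficient vector from the parameters
    mean, V:  mean sky and eigensky matrix (one eigensky per column)
    """
    coeffs = networks[hour](params)
    # Reconstruct the image from its coefficients and the eigenskies.
    return mean + V @ coeffs
```

Either measured or predicted weather parameters can be fed in, which is what lets the same module visualize both weather station data and HIRLAM forecast files.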


[Figure 2 diagram: in both the synthesizing (upper) and training (lower) parts, the weather parameters T, p, date, dT/dt, dp/dt and hour feed an input layer, a hidden layer and an output layer producing the PCA coefficients c0-c9, which are linked to an image through the eigenskies.]

Figure 2. The upper part of the figure visualizes the synthesizing process: a number of weather parameters, in this case temperature, pressure, date, hour, and the time derivatives of pressure and temperature, are sent to the neural networks, which approximate the most important PCA coefficients for the actual weather conditions. These coefficients are used to synthesize the resulting sky image using the eigenskies. The lower part visualizes the training process.

3. RESULTS

3.1. Data

The images used in the computations were captured by a consumer digital camera (Canon PowerShot G2); an example image can be seen in Figure 3. The camera was configured to postprocess the images as little as possible, and information about shutter speed and aperture was saved separately. The number of images used was 7000. The original images were high-resolution photographs; the upper part of each image (the sky) was used and was down-scaled to a size of 232x62 pixels to lower the memory demand. The ten most important eigenskies were used in the following calculations. An example of the image quality available at this amount of compression can be seen in Figure 6: the images to the left are the uncompressed images, and the images to the right have been compressed to ten scalar values and then resynthesized using the ten most important eigenskies.

The weather prediction data was computed using the software package HIRLAM (for more information concerning this software package see Cats and Wolters [20]). A weather forecast is computed every six hours, and the intermediate values were interpolated. Data points are positioned in an irregular grid with an average grid size of 22 km; the grid covers Scandinavia and north-eastern Europe. For each geographical position 53 parameters are available, among them the geopotential at different pressure levels, temperatures at different pressure levels, and humidity at ground level. The weather prediction files are very similar to the climate prediction files computed by the related software package RCA (see Rummukainen et al. [21]). We investigated many combinations of weather parameters to find the most appropriate for synthesizing the sky. The drawback with weather prediction files, although of good quality, is that not all parameters are correct: about five percent have inaccurate values. These errors may hide the true trends, making it more difficult to train the networks.

Therefore we chose to use direct measurements for the training in this work. The directly measured parameters are recorded by an automatic weather station every hour at a position situated two kilometers from the camera. Ten parameters are measured, including air pressure at ground level, temperature and humidity.
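The interpolation of six-hourly forecast values to the hourly sampling of the images can be done with simple linear interpolation, as sketched below. The paper does not specify the interpolation scheme, so linear interpolation is an assumption here.

```python
import numpy as np

def hourly_from_forecasts(forecast_hours, forecast_values, hours):
    """Interpolate 6-hourly forecast values of one weather parameter
    to hourly samples, filling in the intermediate values between
    forecast times (linear interpolation assumed).

    forecast_hours:  sorted times (in hours) at which forecasts exist
    forecast_values: the parameter value at each forecast time
    hours:           the hourly times at which values are needed
    """
    return np.interp(hours, forecast_hours, forecast_values)
```

For example, with forecasts at hours 0 and 6, the value at hour 3 is the midpoint of the two forecast values.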


3.2. The resulting images

We computed the eigenskies from the images; they are shown in Figure 4, and the eigenvalues can be seen in Figure 5. The diagram is drawn in logarithmic scale, and as can be seen the first eigenvalue is considerably larger than the following ones. For every hour an ANN was trained to predict the eigensky coefficients from the temperature, pressure, time, time derivative of temperature, time derivative of pressure and date entries of the measurement vector. Twenty-four neural networks were used to simplify the training process of the individual networks. All images were transformed to coefficient vectors using the ten most important eigenskies. The two-layer network used had a linear activation function on the hidden layer and a log-sigmoid activation function on the second layer. The networks were trained with the Levenberg-Marquardt method until no further change could be seen. The number of hidden units was varied, and it was found that four hidden units gave a good compromise between flexibility and the risk of over-training.

We have developed a software package in MATLAB to perform these steps of computing the eigenskies and training an ANN to synthesize images. The synthesizing part can be used as a stand-alone module, together with the computed eigenskies, to predict sky images from a parameter vector.

To exemplify the accuracy of the method, two image series are presented. The images were synthesized from the test set, whose parameters were not used to train the artificial neural networks. The resulting synthesized images can be seen in Figures 7 and 9; the actual views for the same days can be seen in Figures 8 and 10.

4. DISCUSSION

The resulting images can be seen in Figures 7 and 9, and the corresponding actual views for the same days are shown in Figures 8 and 10, respectively. Each of the twenty-four images is the synthesized response for a certain hour. As can be seen from the first data set, the method captures the main appearance very well: the fine cloud details cannot be seen, but the movement of the sun is modelled correctly. The result in the second series is a little less accurate. The synthesized images are much more grey, which suggests that this day lay outside the trained region of the neural networks; the image synthesized at the eleventh position looks improbable.

In a few cases the model fails to synthesize a correct view. This mostly happens when extreme input data, of a type not exemplified in the training data set, is sent to the network; since the network is untrained on this type of data, the result is undefined. The images synthesized from such a parameter vector may be contrary to reality: for example, the image synthesized from a data vector from a day with intense sunlight may look gloomy. As can be seen in Figure 8, the clouds disappear and the day is clear just before dusk, and this is also predicted by the networks.

5. CONCLUSION

The synthesized images look real, and it is possible to distinguish between different types of weather in the resulting images. The synthesized images of a clear day are more accurate than those of a cloudy day, as the fine cloud details cannot be synthesized correctly. By training neural networks with input data from a long time period, the networks can be used to synthesize an image of the sky. Even when only weather information from one data point near the ground is used, the resulting images look accurate.

6. FUTURE WORK

Future work includes increasing the precision of the synthesized images by increasing the number of parameters used when training the networks, both by using several data points and by using information from higher altitudes. Another future task is to increase the cloud detail under various weather conditions.

ACKNOWLEDGMENTS

The authors wish to thank Stefan Gustavson for interesting discussions. This research was supported by SMHI (Swedish Meteorological and Hydrological Institute) and NGSSC (National Graduate School in Scientific Computing). We thank SMHI for making the weather prediction data available. R. Lenz acknowledges the financial support of CENIIT, the Center for Industrial Information Technology, Linköping University, Sweden.


REFERENCES

1. B. Hibbard and D. Santek, "The Vis-5D system for easy interactive visualization," in Proceedings of the First IEEE Conference on Visualization, pp. 28-35, 462, 1990.
2. H. Haase, M. Bock, E. Hergenrother, C. Knopfle, H. J. Koppert, F. Schroder, A. Trembilski, and J. Weidenhausen, "Meteorology meets computer graphics - a look at a wide range of weather visualizations for diverse audiences," Computers & Graphics 24, pp. 391-397, 2000.
3. Y. Dobashi, K. Kaneda, H. Yamashita, T. Okita, and T. Nishita, "A simple, efficient method for realistic animation of clouds," in SIGGRAPH 2000, Computer Graphics Proceedings, Annual Conference Series, pp. 19-28, 2000.
4. Y. Dobashi, T. Nishita, H. Yamashita, and T. Okita, "Modeling of clouds from satellite images using metaballs," in Proc. Pacific Graphics, pp. 53-60, 1998.
5. M. J. Harris and A. Lastra, "Real-time cloud rendering," in Eurographics 2001, 20, 2001.
6. T. Nishita, Y. Dobashi, and E. Nakamae, "Display of clouds taking into account multiple anisotropic scattering and sky light," Proc. SIGGRAPH, pp. 379-386, 1996.
7. F. Neyret, "Qualitative simulation of convective cloud formation and evolution," Eurographics Workshop on Animation and Simulation, 1997.
8. R. V. Klassen, "Modeling the effect of the atmosphere on light," ACM Transactions on Graphics 6, pp. 215-237, 1987.
9. J. Kniss, S. Premoze, C. Hansen, and D. Ebert, "Interactive translucent volume rendering and procedural modeling," Proc. IEEE Visualization, pp. 109-116, 2002.
10. A. Trembilski, "Two methods for cloud visualization from weather simulation data," in WSCG 2000 Conference Proceedings, 2000.
11. G. Sakas, Fraktale Wolken, virtuelle Flammen. Computer-Emulation und Visualisierung turbulenter Gasbewegung, Springer, Berlin, 1993.
12. M. A. Turk and A. P. Pentland, "Face recognition using eigenfaces," J. Cognitive Neurosci. 3, pp. 71-86, 1991.
13. P. McGuire and G. M. T. D'Eleuterio, "Eigenpaxels and a neural-network approach to image classification," IEEE Transactions on Neural Networks 12, pp. 625-635, 2001.
14. S. Haykin, Neural Networks, Prentice Hall, New Jersey, 1999.
15. I. T. Jolliffe, Principal Component Analysis, Springer.
16. M. T. Hagan, H. B. Demuth, and M. Beale, Neural Network Design, PWS Publishing Company, Boston, 1995.
17. Y. Le Cun, "Une procédure d'apprentissage pour réseau à seuil asymétrique," in Cognitiva 85, pp. 599-604, 1985.
18. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature 323, pp. 533-536, 1986.
19. L. E. Scales, Introduction to Non-Linear Optimization, Springer-Verlag, New York, 1985.
20. G. Cats and L. Wolters, "The HIRLAM project," IEEE Computational Science and Engineering 4, pp. 22-31, 1997.
21. M. Rummukainen, J. Räisänen, A. Ullerstig, B. Bringfelt, U. Hansson, P. Graham, and U. Willén, RCA - Rossby Centre Regional Atmospheric Climate Model: Model Description and Results from the First Multi-Year Simulation, Swedish Meteorological and Hydrological Institute, Norrköping, 1998.


16 April, 16:00

Figure 3. The image to the left is an example of one of the original images, of size 1700x2320 pixels. All images used capture exactly the same scene. The upper part of the image, shown in the right figure, is used in the training process; it has a size of 620x2320 pixels.


Figure 4. These images are the eigenskies used to synthesize the resulting images. The most important eigensky, in the upper left corner, is a slightly varying blue sky; this means that the sky is mostly blue. The next image, in the upper right, has an orange color. Note the small artifacts in the middle right of the images; they are caused by internal reflections inside the camera on days with intense sunlight. All eigenskies vary smoothly, with no sharp edges.

[Figure 5 plot: the logarithm of the eigenvalue (vertical axis) versus index for the 25 most important eigenvalues (horizontal axis).]

Figure 5. The distribution of the most important eigenvalues, drawn in logarithmic scale due to the large differences in scale between the eigenvalues.


Figure 6. The images to the left are original images, which were projected onto the ten most important eigenskies. The resulting PCA coefficients represent a highly compressed representation of the original images. The images to the right were synthesized by reconstructing images from the ten coefficients using the eigenskies. As can be seen, the major appearance of the images is preserved, but the cloud details are lost. This represents the maximum accuracy of the method.



Figure 7. These are the synthesized images of the 29th of March. This day was not included in the training set. As can be seen, the main appearance is synthesized correctly; the movement of the sun can be seen as a variation in the blue sky.


Figure 8. These original images were captured on the 29th of March. One image was captured every hour, so a whole day and night is represented by twenty-four images. As can be seen, this day was almost cloudless. Note the small artifacts in the middle right image; they are caused by internal reflections inside the camera.



Figure 9: These are the synthesized images of the 25th July. This day was not included in the training set.


Figure 10: These images were captured on the 25th of July. This day was very cloudy, with a varying cloud cover.

