Data Modeling of deep sky images

James Handley*(a), Holger Jaenisch(a,b), Albert Lim(a), Graeme White(a), Alex Hons(a), Miroslav Filipovic(c,a), Matthew Edwards(b)

(a) James Cook University, Centre for Astronomy, Townsville QLD 4811, Australia
(b) Alabama Agricultural and Mechanical University, Department of Physics, Huntsville, AL 35811
(c) University of Western Sydney, Locked Bag 1797 PENRITH SOUTH DC NSW 1797, Australia

ABSTRACT
We present a method for simulating CCD focal plane array (FPA) images of extended deep sky objects using Data Modeling. Data Modeling is a process of deriving functional equations from measured data. These tools are used to model FPA fixed pattern noise, shot noise, non-uniformity, and the extended objects themselves. The mathematical model of the extended object is useful for correlation analysis and other image understanding algorithms used in Virtual Observatory Data Mining. We apply these tools to the objects in the Messier list and build a classifier that achieves 100% correct classification.

Keywords: Data Modeling, Virtual Observatory, astronomy, deep-sky, image modeling, noise modeling, Messier classification, functional modeling, component modeling
1. INTRODUCTION
Data Modeling [1,2] is the process of deriving a mathematical expression from measured data. Two approaches for modeling focal plane array (FPA) images are component based modeling and functional modeling. In component based modeling, the deep sky object and each noise effect are modeled independently. Arrays for each effect are stacked into a final image. Component based modeling uses simple equations to model deep sky objects and noise effects, or it can use statistical models. In contrast, functional modeling of deep sky objects is done strictly using univariate or multivariate equations. These equations are continuous and their nth order derivatives are available for analysis. Functional modeling also provides a common basis for robust image analysis.
2. COMPONENT BASED DATA MODELING
Component based Data Models of deep sky objects and noise effects (thermal, shot, fixed pattern, FPA nonuniformity, and hot/dead pixels) are generated independently. The independent models are then stacked to build a Component Based Image Data Model. The Moffat function [3]

$$f(x,y) = \left[\frac{(x-x_0)^{\alpha} + (y-y_0)^{\gamma}}{\rho} + 1\right]^{-\beta} \tag{1}$$
is used for modeling point-spread functions. The control parameters α, β, γ, and ρ modify the size, shape, and extent of the 2-D distribution. Deep sky objects are modeled as a sum of Moffat functions with suitably selected coefficients. This Data Model of M51 used 2 Moffat functions with coefficients obtained using a genetic algorithm (GA) [4], yielding
$$f(x,y) = 3.0\left[\frac{(x-35)^{1.55} + \bigl((y-20)(0.707)\bigr)^{1.55}}{0.6} + 1\right]^{-0.5} + 1.5\left[\frac{(x-28)^{1.4} + \bigl((y-45)(-0.342)\bigr)^{1.4}}{0.6} + 1\right]^{-0.6} \tag{2}$$

* [email protected]; phone 1 256 337 3769; James Cook University; Centre for Astronomy; Townsville, QLD 4811; AUSTRALIA.
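Next, noise effects on the FPA and the surrounding field stars are modeled using different types of random number distributions. Use of random numbers yields statistical rather than functional models. To eliminate random number generators, we propose using sin x^e (the Sxe function) as a pseudo-random number approximating function [5]:

$$
\begin{aligned}
&\text{(Nonuniformity)} && f(i,j) = D\,\frac{\sum\sum_{x}\dfrac{\sin x^{e}}{2} - \mathrm{Min}}{\mathrm{Max} - \mathrm{Min}} \\
&\text{(Thermal)} && f(i,j) = C\,\frac{\sum_{x}\dfrac{\sin x^{e}}{2} - \mathrm{Min}}{\mathrm{Max} - \mathrm{Min}} \\
&\text{(Fixed Pattern)} && f(i,j) = A\,\frac{\sin x^{e} + 1}{2} \\
&\text{(Shot Noise and Field Stars)} && f(i,j) = B\left(\sqrt{-2\log\!\left(\frac{\sin x^{e} + 1}{2}\right)}\;\sigma\right)\sin\!\left(2\pi\,\frac{\sin x^{e} + 1}{2}\right), \quad N = \frac{\sin x^{e} + 1}{2}\,\mathrm{size} \\
&\text{(Hot pixels)} && f(i,j) = 255, \quad N = E\,\frac{\sin x^{e} + 1}{2}\,\mathrm{size}^{2} \\
&\text{(Dead pixels)} && f(i,j) = 0, \quad N = F\,\frac{\sin x^{e} + 1}{2}\,\mathrm{size}^{2}
\end{aligned}
\tag{3–8}
$$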
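As a rough illustration of the Sxe idea, the sketch below uses sin(x^e) as a deterministic stand-in for a random number generator and places hot pixels in a 1-D pixel sequence. It assumes that e is Euler's number, that x is a positive running index, and that the hot-pixel count takes the form N = E·((sin x^e + 1)/2)·size²; the function names and the `fraction` parameter (playing the role of E) are hypothetical.

```python
import math


def sxe(x):
    """Sxe function, read here as sin(x ** e) with e = Euler's number (assumption)."""
    return math.sin(x ** math.e)


def sxe_unit(x):
    """Map the Sxe value from [-1, 1] into [0, 1], i.e. (sin x^e + 1) / 2."""
    return (sxe(x) + 1.0) / 2.0


def hot_pixel_sequence(size, fraction, start_x=1.0):
    """Return a 1-D pixel sequence of length size*size with hot pixels set to 255.

    The number of hot pixels and their positions both come from the Sxe
    function, so no random number generator is needed. The 1-D sequence
    would still have to be mapped into 2-D (i, j), for example by Hilbert
    sequencing.
    """
    n_pixels = size * size
    n_hot = int(fraction * sxe_unit(start_x) * n_pixels)
    pixels = [0] * n_pixels
    x = start_x
    for _ in range(n_hot):
        x += 1.0  # advance the x-index for the next pseudo-random draw
        position = min(int(sxe_unit(x) * n_pixels), n_pixels - 1)
        pixels[position] = 255
    return pixels


frame = hot_pixel_sequence(size=64, fraction=0.01)
```

Because two draws can land on the same position, the number of distinct hot pixels may be slightly lower than the requested count.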
These functions use only an x-index value, but map into 2-D (i,j) using Hilbert sequencing. The number of Moffat functions needed in the final model and their coefficients are determined using a genetic algorithm (GA). Component models yield short numerical equations, but they have limited fidelity and are not derived in real time. Figure 1 depicts a Component Data Model of M51. Comparison shows good representation of the salient image features. Because M51 exhibits two bright cores, the genetic algorithm chose two Moffat functions to represent it. Comparison of the original image on the bottom left with the final model shows similarity in placement of the main cores, similar magnitude and extent of the cores, a similar number of field stars, and even hints of spiral structure.
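The two-term Moffat model of Equation (2) can be evaluated on a pixel grid with a few lines of code. The sketch below is illustrative only: it assumes a 64 x 64 grid, and it takes absolute values of the offsets before raising them to a non-integer power (an assumption made here so that the result stays real for pixels on either side of a core). The function names are hypothetical.

```python
def moffat(x, y, x0, y0, alpha, gamma, rho, beta, amp=1.0, y_scale=1.0):
    """One Moffat term: amp * [(|dx|^alpha + |dy|^gamma) / rho + 1]^(-beta).

    y_scale is the extra multiplier applied to (y - y0) in Equation (2).
    Absolute values are an assumption of this sketch, not stated in the text.
    """
    dx = abs(x - x0) ** alpha
    dy = abs((y - y0) * y_scale) ** gamma
    return amp * ((dx + dy) / rho + 1.0) ** (-beta)


def m51_model(x, y):
    """Sum of the two Moffat terms with the coefficients of Equation (2)."""
    core1 = moffat(x, y, 35, 20, 1.55, 1.55, 0.6, 0.5, amp=3.0, y_scale=0.707)
    core2 = moffat(x, y, 28, 45, 1.4, 1.4, 0.6, 0.6, amp=1.5, y_scale=-0.342)
    return core1 + core2


# Evaluate the model on an assumed 64 x 64 grid; image[y][x] holds f(x, y).
image = [[m51_model(x, y) for x in range(64)] for y in range(64)]
```

The brightest pixel of this grid falls on the first core at (x, y) = (35, 20), and the second core at (28, 45) forms the weaker companion peak.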
3. FUNCTIONAL DATA MODELING
3.1 Univariate model – Turlington polynomial
The Turlington polynomial yields one equation with continuous derivative per data set, of the form

$$T(x) = y_1 + m_1 (x - x_1) + \sum_{j=2}^{n-1} 0.001\,(m_j - m_{j-1}) \log_{10}\!\left(1 + 10^{\frac{x - x_j}{0.001}}\right), \qquad m_j = \frac{y_{j+1} - y_j}{x_{j+1} - x_j} \tag{9}$$
where x and y are the original (x, y) data points and n is the number of points from the original used for building the Turlington polynomial. The variable n can be either all of the points or a sub-sampled set of the original. The Turlington polynomial is built in a piecewise linear fashion, one point at a time, as data arrives, which lends itself to real-time construction. The logarithm term makes Turlington polynomials both orthogonal and differentiable. One drawback of the Turlington polynomial is its use of n terms, equal to the number of points in the data set being modeled. This yields fast, streaming real-time derivatives of terms, but an exceptionally large final model [6,7].

3.2 Univariate model - Eigenfunction
Eigenfunctions [1] approximate T(x) in compact form. T(x) models require that coefficients describing each data segment be stored, so their size depends on the data-sampling rate. Using eigenfunctions and the method of residuals, T(x) is approximated by
$$T(x) \cong \sum_{j=1}^{n} \left( A_j \cos \omega_j x + i\,B_j \sin \omega_j x \right) \tag{10}$$
In Equation (10), A_j is the amplitude of the jth eigenfunction term, B_j is the phase of the jth eigenfunction term, and ω_j is 2π times the frequency f defined by the jth derived dominant Fourier frequency term. Dominant eigenfunction terms are added one at a time until correlation between T(x) and the integral function in (9) converges. Pseudo code used for creating eigenfunction models is shown in Figure 2. These smaller models are orthogonal, differentiable, and Lebesgue integrable. However, they require multiple Fourier Transforms, making them memory and computation intensive.

3.3 Multivariate functional Data Modeling
A functional is a function whose variables are themselves functions. We approximate the Kolmogorov-Gabor polynomial with Ivakhnenko's Group Method of Data Handling (GMDH) [8], using nested functionals of the form
$$y(x_1, x_2, \ldots, x_L) = f\bigl(y(b_1(y(b_2(\ldots(y(b_n(y(x_i, x_j, x_k)))))))))\bigr), \qquad O[y(x_1, x_2, \ldots, x_L)] = 3^{n} \tag{11}$$
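The order statement in Equation (11) can be checked with a small experiment: if every layer is a third-order polynomial of the previous layer's output, n nested layers give a polynomial of order 3^n. The sketch below is a simplified, single-variable illustration under that assumption. It is not a GMDH implementation, and the building block x + x^3 is an arbitrary choice made here for the example.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest order first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_compose(outer, inner):
    """Return the coefficients of outer(inner(x)) using Horner's scheme."""
    result = [outer[-1]]
    for coeff in reversed(outer[:-1]):
        result = poly_mul(result, inner)
        result[0] += coeff
    return result


def nested_order(block, layers):
    """Order of the polynomial obtained by nesting `block` inside itself `layers` times."""
    model = [0, 1]  # start from the identity polynomial x
    for _ in range(layers):
        model = poly_compose(block, model)
    return len(model) - 1


cubic_block = [0, 1, 0, 1]  # x + x^3, a hypothetical third-order building block
orders = [nested_order(cubic_block, n) for n in range(1, 5)]
```

Under this reading, an 8 layer model corresponds to order 3^8 = 6561 and a 10 layer model to order 3^10 = 59049, which is why the nested form is stored rather than the expanded polynomial.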
This structured approach forms intermediate meta-variables, each fusing a combination of three inputs into a single new output. This fused meta-variable becomes a new input variable available at the next layer. Since the algorithm only uses inputs necessary to achieve convergence, pruning of inputs is automatic and requires no external intervention, enabling unsupervised learning under proper circumstances. Like T(x), this Functional Data Model is derived in near real-time, and like the eigenfunction model, it yields a substantially shorter final model. If derived from derivative data, it will yield a differential equation model. The algorithm for this process is listed in Reference 1.

3.4 2-D image to 1-D conversion
Enabling functional modeling of images requires transformation from 2-D into 1-D without 2-D decorrelation. Several methods exist, including raster scanning and fixed pattern readout such as zigzag sequencing. However, only Hilbert sequencing preserves 2-D correlations at dyadic sample sizes. Hilbert sequencing is illustrated graphically in Figure 3 and is given by
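$$
\begin{aligned}
H_{n+1} &= w_1(P_n) \cup w_2(P_n) \cup w_3(P_n) \cup w_4(P_n) \\
P_{n+1} &= w_1(H_n) \cup w_2(P_n) \cup w_3(P_n) \cup w_4(P_n)
\end{aligned}
\tag{12}
$$

subject to Lindenmayer's L-system grammars represented and defined by

$$
\begin{aligned}
L &\rightarrow +RF-LFL-FR+ \\
R &\rightarrow -LF+RFR+FL- \\
F &\rightarrow F, \qquad + \rightarrow +, \qquad - \rightarrow - \\
F &\Rightarrow (x, y, \alpha) \rightarrow (x + l\cos\alpha,\; y + l\sin\alpha,\; \alpha) \\
+ &\Rightarrow (x, y, \alpha) \rightarrow (x, y, \alpha - \delta) \\
- &\Rightarrow (x, y, \alpha) \rightarrow (x, y, \alpha + \delta) \\
\delta &= 90^{\circ}
\end{aligned}
\tag{13}
$$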
Pseudocode for generating the Hilbert sequence is readily available [3].
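A minimal sketch of Hilbert sequencing driven directly by the L-system of Equation (13) is given below. It expands the axiom L with the two production rules, interprets F, + and - as turtle moves with δ = 90° and unit step length, and then reads a square image along the resulting path. The starting position, initial heading, and function names are assumptions of this sketch.

```python
def hilbert_order(order):
    """Return the (x, y) visiting order of a 2^order x 2^order grid."""
    rules = {"L": "+RF-LFL-FR+", "R": "-LF+RFR+FL-"}
    program = "L"
    for _ in range(order):
        program = "".join(rules.get(symbol, symbol) for symbol in program)

    # Headings for alpha = 0, 90, 180, 270 degrees, kept as integer steps.
    headings = [(1, 0), (0, 1), (-1, 0), (0, -1)]
    heading = 0
    x = y = 0
    path = [(x, y)]
    for symbol in program:
        if symbol == "F":    # move forward one pixel
            x += headings[heading][0]
            y += headings[heading][1]
            path.append((x, y))
        elif symbol == "+":  # alpha -> alpha - delta
            heading = (heading - 1) % 4
        elif symbol == "-":  # alpha -> alpha + delta
            heading = (heading + 1) % 4
        # Leftover L and R symbols draw nothing.

    # Shift the path so that all coordinates start at zero.
    min_x = min(px for px, _ in path)
    min_y = min(py for _, py in path)
    return [(px - min_x, py - min_y) for px, py in path]


def hilbert_sequence(image):
    """Flatten a square 2^k x 2^k image (list of rows) into 1-D along the Hilbert path."""
    order = (len(image) - 1).bit_length()
    return [image[y][x] for x, y in hilbert_order(order)]


test_image = [[8 * row + col for col in range(8)] for row in range(8)]
sequence = hilbert_sequence(test_image)
```

Consecutive samples in the 1-D sequence always come from neighboring pixels, which is the property that raster and zigzag readouts lose at row ends.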
4. APPLICATIONS
4.1 Training cases
We chose a bitmap database of 110 Messier objects for use in classifier construction. The classifier flowcharts in Figures 4 and 5 are described in Sections 4.4 and 4.5. In our database, M102 (a repeat of M101) was removed and replaced with NGC 5866 [9]. Each bitmap is a 24-bit color image, and the bitmaps vary in size. Bitmaps were resized to 64x64, generating a thumbnail of each image. Thumbnails were then converted to gray scale and Hilbert sequenced into 1-D.
4.2 Turlington Data Models
Turlington polynomials given in Equation (9) were used to construct Data Models of the entire image. Because Turlington polynomials are continuous functions, they can be interpolated to any resolution. Figure 6 compares the original and Turlington Data Models for M13 (cluster), M20 (nebula), and M101 (galaxy). T(x) models can suppress noise if the fitting parameter, currently set to 0.001 in Equation (9), is increased.

4.3 Eigenfunction Data Models
Eigenfunction models of Messier objects consisting of 5, 10, and 20 terms were built. Using 10 or 20 terms gave only marginal improvement in correlation over 5 terms, yet doubled or quadrupled the model size. Therefore, we limited our models to 5 terms.
Figures 7-9 contain a library of eigenfunction Data Models, one for each of the 110 Messier objects. Each object is plotted next to 3 data graphs. The middle graph is the Hilbert sequenced original waveform, the bottom graph is the eigenfunction model, and the top graph is a special histogram derived from the eigenfunction model that is discussed in more detail in Section 4.4. Data Models of images with the object centrally located and somewhat symmetrical (except for open clusters) contained three dominant peaks. These peaks were generally wider for extended objects. Open clusters did not show these peaks. Rather, the waveforms displayed 1/f structure [5] (Figures 7-9).

4.4 Change detection Data Model
The top graphs in Figures 7-9 are created by generating a double histogram with mode subtraction. First, the histogram of the data is calculated with the number of bins equal to the number of points. This histogram is normalized to 0-255 and a new histogram (with the same number of bins) is calculated. The mode of this histogram is removed, leaving the modified 2nd order histograms shown. Characterization of these histograms using descriptive statistics provides features for a hybrid change detector. Reference 1 describes change detection theory and descriptive features in detail. Our statistical feature classifier correctly identified 107 out of 110 objects. Figure 4 is a flowchart of the change detector, and the details are as follows:
Our first change detector, an 8 layer [O(3)^8] polynomial, was constructed to identify clusters as nominal and other objects (nebulae and galaxies) as off-nominal. When this classifier was tested against cluster data, it achieved 100% correct classification. However, when galaxies and nebulae were presented, 19 were mislabeled as clusters. A resolver classifier was therefore constructed to resolve differences between clusters and the mislabeled objects (galaxies and nebulae). The resolver is a 10 layer [O(3)^10] polynomial Data Model, and it post-processes only the data sets labeled as "clusters". Our resolver correctly identified all nebulae, all clusters except M35, and all galaxies except M108. With clusters removed, the remaining objects are passed through a second change detector ([O(3)^8] polynomial) that correctly identified all presented galaxies and all nebulae except M78 (which was mislabeled as a galaxy).

Additional unusual and potentially difficult test cases were selected to explore performance envelopes of the classifier. We chose NGC869 (double cluster), the Large Magellanic Cloud (LMC), Small Magellanic Cloud (SMC), Comet Hale-Bopp, and Comet Neat, shown in Figure 10. These cases were presented to the change detector for assignment:

Object           Cluster CD     Resolver       Galaxy CD     Interpretation
Double Cluster   0.5002         -5.2 x 10^8    N/A           Unlike any Messier list object
LMC              0.5            0.9558         N/A           Cluster
SMC              0.5517         N/A            3.1 x 10^7    Nebula
LMC+SMC          3.1 x 10^9     N/A            3.4 x 10^4    Nebula
Hale-Bopp        4.7 x 10^10    N/A            0.5           Galaxy
Neat             9.9 x 10^9     N/A            4.0 x 10^7    Nebula
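The double histogram with mode subtraction described at the start of this section can be sketched as follows. This is one possible reading of the three steps, with loudly stated assumptions: equal-width bins spanning the data range, normalization by the largest bin count, and "removing the mode" interpreted as zeroing the most populated bin of the second histogram. The function names are hypothetical.

```python
def histogram(values, n_bins, low, high):
    """Count values into n_bins equal-width bins spanning [low, high]."""
    counts = [0] * n_bins
    span = high - low
    for value in values:
        if span == 0:
            index = 0
        else:
            index = min(int((value - low) / span * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


def double_histogram(data):
    """Return (second-order histogram with its mode removed, removed count)."""
    n_points = len(data)

    # Step 1: histogram of the data, number of bins equal to number of points.
    first = histogram(data, n_points, min(data), max(data))

    # Step 2: normalize the bin counts to 0-255 and histogram them again.
    peak = max(first)
    scaled = [255.0 * count / peak for count in first]
    second = histogram(scaled, n_points, 0.0, 255.0)

    # Step 3: remove the mode (assumed here: zero the most populated bin).
    mode_bin = second.index(max(second))
    removed = second[mode_bin]
    second[mode_bin] = 0
    return second, removed


samples = [0.1, 0.2, 0.2, 0.9, 0.5, 0.5, 0.5, 0.7, 0.3, 0.5, 0.8, 0.2]
modified, removed_count = double_histogram(samples)
```

Descriptive statistics of `modified` (for example its mean, spread, and skewness) would then serve as candidate features for the change detector.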
4.5 Stellar object classification
5-term eigenfunction models are constructed for each Messier object [10,11], yielding 16 coefficients shown as a cluster plot in Figure 11 and given in tabular form in Figure 12. These features are used to build a Data Model classifier. First, we reduced the dimension of the output to two classes, forming a cascade. The first classifier separates clusters from the initial pool of objects (clusters, nebulae, and galaxies). Once clusters are identified, a second classifier distinguishes between nebulae and galaxies. Figure 5 is a flowchart of this classifier.
Using this approach, a 10 layer Data Model ([O(3)^10] polynomial) was constructed that correctly distinguished all 57 clusters from galaxies and nebulae. Next, a 2 layer Data Model ([O(3)^2] polynomial) was constructed that correctly classified the remaining 40 galaxies and 13 nebulae into their proper classes. The total classifier uses both Data Models; the first one identifies clusters, while the second determines if the object is a galaxy or nebula. We obtained 100% correct classification for the Messier list of objects. Also, the 2 combined Data Models needed only 13 of the 16 available features from Figure 12. The three features not used were B(1), w(4)/2π, and B(5). Our classifier also detects novelties or changes. If the equation yields a value x outside the bounds -0.0862 < x < 1.1313, the object is flagged as unique, exhibiting characteristics not observed in the Messier training set.
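The novelty test amounts to a simple bounds check on the Data Model output. The sketch below applies the stated bounds to the Data Model values reported for the unusual test cases in this section; the function and variable names are hypothetical, and the Data Model itself is not reproduced here.

```python
LOWER_BOUND = -0.0862
UPPER_BOUND = 1.1313


def is_novel(dm_value):
    """Flag an object as unique when its Data Model value falls outside the bounds."""
    return not (LOWER_BOUND < dm_value < UPPER_BOUND)


# Data Model values reported for the unusual test cases.
test_cases = {
    "NGC 869 (Double Cluster)": 1.5e14,
    "Large Magellanic Cloud (LMC)": 0.2843,
    "Small Magellanic Cloud (SMC)": 1.0173,
    "Combined LMC and SMC": 3.5e9,
    "Comet Hale-Bopp": 0.9991,
    "Comet Neat": 1.0269,
}
novel_objects = [name for name, value in test_cases.items() if is_novel(value)]
```

With these values, only the Double Cluster and the combined LMC and SMC fall outside the bounds.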
Figure 11 shows a graph using the best two discriminating features (A0 and A1) from Figure 12. These features were automatically selected by calculating the separability between the 3 classes of objects in all 16 dimensions of the feature space, and plotting the features that maximized the minimum cluster separability using a K-factor defined as
$$K = \frac{\mu_2 - \mu_1}{\sqrt{\dfrac{\sigma_2^{2}}{N_2} + \dfrac{\sigma_1^{2}}{N_1}}} \tag{14}$$
where µi is the mean of each group, Ni is the number of points in each group, and σi is the standard deviation of each group. Unusual cases were presented to the classifier to score. Our classifier assigned them as follows:

Object                          DM Value       Interpretation
NGC 869 (Double Cluster)        1.5 x 10^14    Unlike any Messier list object
Large Magellanic Cloud (LMC)    0.2843         Nebula
Small Magellanic Cloud (SMC)    1.0173         Galaxy
Combined LMC and SMC            3.5 x 10^9     Unlike any Messier list object
Comet Hale-Bopp                 0.9991         Cluster
Comet Neat                      1.0269         Cluster
These cases are unusual because they do not resemble any Messier object. The Double Cluster and combined LMC and SMC were flagged as novelties. We found our Data Model classifier can determine when new class definitions are required without supervision.
5. SUMMARY
In conclusion, two approaches for modeling deep sky images were successfully demonstrated. Component based modeling allows very simple equations to be built. Functional modeling was demonstrated using two different techniques: Turlington polynomials for real-time applications, and eigenfunctions for short models. Change Detectors exhibited very good exception handling of novel examples. Functional Data Modeling resulted in a classifier that correctly identified all 110 of the Messier objects and performed reasonably well classifying unusual objects.
ACKNOWLEDGMENTS
The authors would like to thank Scott McPheeters, Tim Aden, and John Deacon for their continued support during the course of this work.
REFERENCES
1. Jaenisch, H., Handley, J., Lim, A., Filipovic, M.D., White, G., Hons, A., Crothers, S., Deragopian, G., Schneider, M., Edwards, M., “Data Modeling for Virtual Observatory data mining”, Proceedings of SPIE Vol. 5493 (2004).
2. Lim, A., Jaenisch, H., Handley, J., Berrevoets, C., White, G., Deragopian, G., Payne, J., Schneider, M., “Image Resolution and Performance Analysis of Webcams for Ground Based Astronomy”, Proceedings of SPIE Vol. 5489 (2004).
3. Jaenisch, H.M., Handley, J.W., Scoggins, J., Carroll, M.P., “ISIS: An IR Seeker Model Incorporating Fractal Concepts”, Proceedings of SPIE Vol. 2225 (1994).
4. Jaenisch, H.M., Handley, J.W., “Automatic Differential Equation Data Modeling for UAV Situational Awareness”, Society for Computer Simulation, Huntsville Simulation Conference 2003 (October 2003).
5. Jaenisch, H., Handley, J., “Data Modeling of 1/f noise sets”, Proceedings of SPIE Vol. 5114 (2003).
6. Jaenisch, H.M., Handley, J.W., “Data Modeling for Radar Applications”, Proceedings of the IEEE Radar Conference 2003 (May 2003).
7. Jaenisch, H.M., Handley, J.W., Faucheux, J.P., “Data Driven Differential Equation Modeling of fBm processes”, Proceedings of SPIE Vol. 5204 (2003).
8. Madala, H.R., Ivakhnenko, A.G., Inductive Learning Algorithms for Complex Systems Modeling, Boca Raton, FL: CRC Press, 1994.
9. “Messier Objects”, http://www.3towers.com/messier.htm, June 1, 2004.
10. Jaenisch, H.M., Filipovic, M.D., “Classification of Jacoby Stellar Spectra Using Data Modeling”, Proceedings of SPIE Vol. 4816 (2002).
11. Jaenisch, H.M., Collins, W.J., Handley, J.W., Hons, A., Filipovic, M.D., Case, C.T., Songy, C.G., “Real-time visual astronomy using image intensifiers and Data Modeling”, Proceedings of SPIE Vol. 4796 (2002).
FIGURES
[Figure 1 panels: Nonuniformity; Fixed Pattern Noise; M51; M51 Data Model; Thermal Noise; Shot Noise & Hot/Dead Pixels; M51 with Noise Effects]
Fig. 1: Component based Image Data Model for M51, including surrounding star field and FPA noise effects.
rem initialize
redim xdata(n),ydata(n),extreme(n),totaldat(n),r(n),c(n),newdata(n),newdata1(n),r1(n),c1(n),d(n)
rem open x and y data array files
call open_data_files(xdata,n)
call open_data_files(ydata,n)
rem sort x into ascending order and sort y with x to retain x associations
call sort2(n,xdata,ydata)
rem put ydata into newdata
FOR i = 1 TO n
    newdata(i) = ydata(i)
NEXT i
rem specify objective correlation value
cobj = 0.99
rem calculate Data Modeling eigenfunction model of y(x)
rem final result held in totaldat; initialize k1 to hold number of terms
k1 = 0
10 continue
k1 = k1 + 1
rem generate Fourier transform of data and corresponding power spectra
rem use linear regression to fit a line through the dB power spectra
rem identify maxima in power spectra that occur above the linear fit
rem pull out maxima locations in array extreme
rem find maximum value in array extreme, and use location to extract real
rem and imaginary Fourier terms held in r (real) and c (imaginary) for use
rem in eigenfunction model; the transform is taken of the current residual
call calc_extremes(newdata,n,extreme,r,c)
rem find maximum value location; only look at first half of data since symmetric
rem ignore zeroth order location
n1 = int(n/2)
dmax = extreme(1)
dloc = 1
FOR i = 2 TO n1 - 2
    IF extreme(i) > dmax THEN
        dmax = extreme(i)
        dloc = i
    END IF
NEXT i
rem construct sine and cosine representation of dloc frequency at sampling equal to original
rem r1 and c1 hold eigenfunction coefficients, d the frequency term
r1(k1) = r(dloc)
c1(k1) = c(dloc)
d(k1) = dloc
rem newdata1 holds only the current term, so it is overwritten rather than accumulated
FOR i = 1 TO n
    newdata1(i) = CCUR(r(dloc)) * COS(dloc * 2# * pi * (i / n))
    newdata1(i) = newdata1(i) + CCUR(c(dloc)) * SIN(dloc * 2# * pi * (i / n))
    newdata1(i) = newdata1(i) / SQR(n)
NEXT i
rem calculate residual between original and summation of all terms generated so far
FOR i = 1 TO n
    newdata(i) = newdata(i) - newdata1(i)
NEXT i
rem add current term to previous terms
FOR i = 1 TO n
    totaldat(i) = totaldat(i) + newdata1(i)
NEXT i
rem calculate correlation between original and current model (totaldat)
CALL correl(totaldat,ydata,n,ycorrel)
rem test to see if correlation criteria met
rem test to see if maxterms criteria met
IF ycorrel < cobj AND k1 < .125 * n THEN
    GOTO 10
END IF
rem output final model
CALL output_eigen(r1,c1,d,k1,n)
END
Fig. 2: Pseudo-code for Data Modeling eigenfunction model construction.
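For readers who prefer a runnable form, the residual loop of Fig. 2 can be sketched in Python as below. This is a simplified sketch, not the authors' implementation: it uses a direct discrete Fourier transform, picks the single largest power-spectrum bin of the current residual instead of fitting a line through the dB power spectrum, and builds a real-valued cosine plus sine term for each selected frequency. The stopping rule (correlation of 0.99 or at most n/8 terms) follows the pseudo-code, and all function names are hypothetical.

```python
import math


def dft_bin(data, k):
    """Real and imaginary parts of the k-th discrete Fourier coefficient."""
    n = len(data)
    real = sum(v * math.cos(2.0 * math.pi * k * i / n) for i, v in enumerate(data))
    imag = -sum(v * math.sin(2.0 * math.pi * k * i / n) for i, v in enumerate(data))
    return real, imag


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def eigenfunction_model(y, target_corr=0.99):
    """Add dominant frequency terms one at a time until the correlation target is met."""
    n = len(y)
    max_terms = max(1, n // 8)
    residual = list(y)
    total = [0.0] * n
    terms = []  # each entry: (cosine amplitude, sine amplitude, frequency bin)
    corr = 0.0
    while len(terms) < max_terms and corr < target_corr:
        # Dominant bin of the residual; skip bin 0 and use the first half only.
        best_k, best_power, best_coeff = 1, -1.0, (0.0, 0.0)
        for k in range(1, n // 2):
            real, imag = dft_bin(residual, k)
            power = real * real + imag * imag
            if power > best_power:
                best_k, best_power, best_coeff = k, power, (real, imag)
        a = 2.0 * best_coeff[0] / n
        b = -2.0 * best_coeff[1] / n
        term = [a * math.cos(2.0 * math.pi * best_k * i / n)
                + b * math.sin(2.0 * math.pi * best_k * i / n) for i in range(n)]
        total = [t + u for t, u in zip(total, term)]
        residual = [r - u for r, u in zip(residual, term)]
        terms.append((a, b, best_k))
        corr = pearson(total, y)
    return terms, total, corr


signal = [3.0 * math.cos(2.0 * math.pi * 5 * i / 64)
          + 1.5 * math.sin(2.0 * math.pi * 12 * i / 64) for i in range(64)]
terms, model, correlation = eigenfunction_model(signal)
```

For this two-tone test signal the loop selects frequency bins 5 and 12 in that order and stops after two terms.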
Fig. 3: Hilbert sequence scanning maintains dyadic neighbor correlation.
[Figure 4 flowchart: START -> Cluster Change Detector -> Cluster? If Yes -> Resolver Classifier -> Cluster? (Yes -> Cluster). If No at either test -> Galaxy Change Detector -> Galaxy? (Yes -> Galaxy; No -> Nebula)]
Fig. 4: Flowchart for Data Modeling Change Detector.
[Figure 5 flowchart: START -> Cluster Classifier -> Cluster? (Yes -> Cluster; No -> Galaxy Classifier) -> Galaxy? (Yes -> Galaxy; No -> Nebula)]
Fig. 5: Flowchart for Data Modeling Classifier.
[Figure 6 panel labels: Original (64 x 64); Data Model (64 x 64); Data Model (256 x 256); Original; Data Model]
Fig. 6: Comparison of Turlington Data Model output and original for M13 (top), M20 (middle), and M101 (bottom).
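Equation (9), which produced the Turlington Data Models compared in Fig. 6, can be sketched in Python as follows. The sketch evaluates log10(1 + 10^z) in a numerically stable way, an implementation detail assumed here because the exponent (x - x_j)/0.001 overflows floating point when evaluated directly. The function names are hypothetical, and `k` is the fitting parameter set to 0.001 in Equation (9).

```python
import math


def log10_one_plus_pow10(z):
    """Stable evaluation of log10(1 + 10**z) for large positive or negative z."""
    if z > 300.0:
        return z
    if z < -300.0:
        return 0.0
    return max(z, 0.0) + math.log10(1.0 + 10.0 ** (-abs(z)))


def turlington(xs, ys, k=0.001):
    """Build T(x) from data points (xs ascending) following Equation (9)."""
    slopes = [(ys[j + 1] - ys[j]) / (xs[j + 1] - xs[j]) for j in range(len(xs) - 1)]

    def t_of_x(x):
        total = ys[0] + slopes[0] * (x - xs[0])
        # One term per interior point: j = 2 .. n-1 in the 1-based notation of Eq. (9).
        for j in range(1, len(slopes)):
            total += k * (slopes[j] - slopes[j - 1]) * log10_one_plus_pow10((x - xs[j]) / k)
        return total

    return t_of_x


model = turlington([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0, 2.0])
midpoint_value = model(1.5)
```

With the small fitting parameter of 0.001 the model closely follows the piecewise linear interpolant, so `model(1.5)` is close to 0.5. Because the model is a continuous function, it can be sampled at any resolution, which is how a 64 x 64 thumbnail can be rendered at 256 x 256.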
Fig. 7: Data Models for Messier objects M1-M48.
Fig. 8: Data Models for Messier objects M49-M96.
Fig. 9: Data Models for Messier objects M97-M110.
Fig. 10: Comparison of test objects: M51 (left), NGC869 (left center), LMC and SMC (right center), Comet Hale-Bopp (right top), and Comet Neat (right bottom).
[Figure 11 legend: Galaxy, Cluster, Nebula]
Fig. 11: Cluster plot showing distribution of Data Modeling features. Features plotted are A(0) on the x-axis and A(1) on the y-axis. Bottom plot shows Messier number designation, and top plot coded by object classification.
Fig. 12: Table of coefficients for the 5-term eigenfunction based Data Model for each Messier object.