Space Object Classification Using Deep Convolutional Neural Networks

Richard Linares
[email protected]
Assistant Professor, Department of Aerospace Engineering & Mechanics, University of Minnesota, Minneapolis, MN, 55403.

Roberto Furfaro
[email protected]
Associate Professor, Department of Systems & Industrial Engineering, University of Arizona, Tucson, AZ, 85721.

Abstract— Tracking and characterizing both active and inactive Space Objects (SOs) is required for protecting space assets. Characterizing and classifying space debris is critical to understanding the threat they may pose to active satellites and manned missions. This work examines SO classification using brightness measurements derived from electro-optical sensors. The classification approach discussed in this work is data-driven in that it learns from data examples how to extract features and classify SOs. The approach is based on a deep Convolutional Neural Network (CNN), in which a layered hierarchical architecture is used to extract features from brightness measurements. Training samples are generated from physics-based models that account for the rotational dynamics and light reflection properties of SOs. The number of parameters involved in modeling SO brightness measurements makes traditional estimation approaches computationally expensive. This work shows that the CNN approach can efficiently solve the classification problem for this complex physical dynamical system. The performance of these strategies for SO classification is demonstrated via simulated scenarios.

I. INTRODUCTION

Space Situational Awareness (SSA) is an area of research motivated by the U.S. Air Force's mission to protect and ensure peaceful access to space. This task involves collecting tracking data on over 22,000 space objects (SOs), 1,100 of which are active, using optical and radar sensors. Developing a detailed understanding of the SO population is a fundamental goal of SSA and reaches across all other critical tasks involved in SSA. The current SO catalog maintained by JSpOC includes simplified SO characteristics, mainly the solar radiation pressure and/or drag ballistic coefficients. This simplified description limits the dynamic propagation models used for predicting the catalog to those that assume cannonball shapes and generic surface properties. The future SO catalog and SSA system will have to be capable of building a detailed picture of SO characteristics. The near-geosynchronous SO population is primarily tracked with optical sensors, which provide astrometric and photometric measurements. Although the amount of light collected from these SOs is small, information can still be extracted from photometric data and used to determine shapes and other properties. Light curves are the time-varying, wavelength-dependent apparent magnitude of energy (e.g., photons) scattered (reflected) off of an object along the line-of-sight to an observer. Attitude estimation and extraction of other characteristics using light curve data has been demonstrated in Refs. [1]–[5].

Traditional measurement sources for SO tracking, such as radar and optical sensors, have been shown to provide information on SO characteristics. These measurements have been shown to be sensitive to shape,[4],[6] attitude,[4],[7],[8] angular velocity,[9] and surface parameters.[10],[11] The state-of-the-art in the literature has advanced over the past decade and in recent years has seen the development of multiple model,[4],[12] nonlinear state estimation,[7]–[9] and full Bayesian inversion[13] approaches for SO characterization. The key shortcoming of the approaches in the literature is their overall computational cost. This work provides a fundamentally different and novel way of studying the SO characterization problem.

Classification of SOs is a key challenge in SSA and has many applications. The current state-of-the-art uses physics-based models to process observations, and classification is then performed by determining which class of models best fits the data. These models are computationally expensive to evaluate, and with the increase in the number of objects being tracked, these approaches are not practical for use on the entire catalog. This work looks at a data-driven deep learning approach for classification. Reference [4] used a Multiple Model Adaptive Estimation classification approach that modeled the dynamics and physics to estimate relevant parameters and classify SOs. In general, there are many unknown parameters that should be included in the estimation process, but the dimensionality of this complex problem does not lend itself to computationally efficient solutions. Therefore, for this work we investigate data-driven classification approaches that are not based on models but rather use the observations, or data, to detect relationships and use these relationships for classification. The data-driven classification approach used for this work is the Convolutional Neural Network (CNN) classification approach.[14] Recent advancements in deep learning[15] have demonstrated groundbreaking results across a number of domains. Here the term deep refers to any neural network learning approach with more than one hidden layer. Deep learning approaches mimic the function of the brain by learning nonlinear hierarchical features from data that build in abstraction.[16] In an end-to-end fashion, a deep learning approach is used in this work to classify SOs from observational data. This paper investigates Convolutional Neural Networks (with max-pooling and dropout)[16] for supervised classification of SO observational data.

Fig. 1: Reflection Geometry.

Although the deep learning approach used here is trained with simulated data, once trained, these models can be applied to real-data classification examples. As opposed to the traditional approaches discussed earlier, this work produces a new way of processing SO observations in which quick determinations of SO classes are made directly from observational data. CNNs have achieved astonishing performance on general image processing tasks such as object classification,[17] scene classification,[18] and video classification.[19] It is well known that the key enabling factors for the success of the CNN architecture are the development of techniques for large-scale networks, up to tens of millions of parameters, and massive labeled datasets. Encouraged by these results, this paper studies the classification of observational data generated by dynamical systems. The dynamical system investigated here is the rotational dynamics of SOs. The physical attributes of the SOs, such as shape and mass distribution, are also included in the classification process. The challenging aspect of the application of CNNs to physical dynamical systems is the generation of labeled training data. Here we use the physical model to simulate observations by sampling randomly from a distribution of physical attributes and dynamic states. Light curve measurements are used as inputs, and classes of the SOs are used as outputs for training the CNN approach. The CNN then learns convolutional kernels that look for characteristic features in the data. As opposed to manually specified features, the features are adaptively learned from the training data.

The organization of this paper is as follows. First, the Ashikhmin-Shirley light curve model is shown. Next, the development of training data is discussed. Following this, the CNN approach is discussed. Additionally, results are shown for simulated examples. Finally, discussions and conclusions are provided. This paper discusses the theory behind the proposed algorithms, and results from simulation trials are shown.

II. LIGHT CURVE MODELING

There are a number of models used for simulating light curve measurements in the literature, and Ref. [12] provides a good summary of the most popular ones adopted for Space Object (SO) applications. These models differ in the physics that they represent and their level of complexity, but for

SO applications, the ability to model specular reflection and complex shapes while conserving energy is desirable. The Ashikhmin-Shirley[20] (AS) model has all of the desirable properties while producing realistic SO light curves. This model is based on the bidirectional reflectance distribution function (BRDF), which models the distribution of light scattered from the surface due to the incident light. The BRDF at any point on the surface is a function of two directions, the direction from which the light source originates and the direction from which the scattered light leaves the observed surface. The model in Ref. [20] decomposes the BRDF into a specular component and a diffuse component. The two terms sum to give the total BRDF:

$$f_r = d R_d + s R_s \qquad (1)$$

which depends on the diffuse bidirectional reflectance ($R_d$), the specular bidirectional reflectance ($R_s$), and the fraction of each contributing to the total ($d$ and $s$, respectively, where $d + s = 1$). Here $i$ denotes the $i$th facet of the SO. Each facet contributes independently to the brightness, and the total brightness is the sum over each facet's contribution. The diffuse component represents light that is scattered equally in all directions (Lambertian), and the specular component represents light that is concentrated about some direction (mirror-like). Reference [20] develops a model for continuous arbitrary surfaces but simplifies it for flat surfaces. This simplified model is employed in this work, as shape models are considered to consist of a finite number of flat facets. Therefore, the total observed brightness of an object becomes the sum of the contribution from each facet. In each model, however, $c = \mathbf{u}_{obs}^{I^{T}}\mathbf{u}_{h}^{I}$, $\rho$ is the diffuse reflectance ($0 \leq \rho \leq 1$), and $F_0$ is the specular reflectance of the surface at normal incidence ($0 \leq F_0 \leq 1$). To be used as a prediction tool for brightness and radiation pressure calculations, an important aspect of the BRDF is energy conservation. For energy to be conserved, the integral of the BRDF times $\cos(\theta_r)$ over all solid angles in the hemisphere with $\theta_r \leq 90^\circ$ needs to be less than unity, with

$$\int_0^{2\pi}\!\!\int_0^{\pi/2} f_r \cos(\theta_r)\sin(\theta_r)\, d\theta_r\, d\phi = R_d + R_s \qquad (2)$$

For the BRDF given in Eq. (1), this corresponds to constant values of $R_d = \rho d$ and $R_s = s F_0$. The remaining energy not reflected by the surface is either transmitted or absorbed. In this paper it is assumed that the transmitted energy is zero. The diffuse bidirectional reflectance is then calculated as follows:

$$R_d = \frac{28\rho}{23\pi}\left(1 - s F_0\right)\left[1 - \left(1 - \frac{\mathbf{u}_n^{I^{T}}(i)\,\mathbf{u}_{sun}^{I}}{2}\right)^{5}\right]\left[1 - \left(1 - \frac{\mathbf{u}_n^{I^{T}}(i)\,\mathbf{u}_{obs}^{I}}{2}\right)^{5}\right] \qquad (3)$$

where

$$F = F_0 + \left(\frac{1}{s} - F_0\right)\left(1 - c\right)^{5} \qquad (4)$$

The vectors $\mathbf{u}_{sun}^{I}(t_i)$ and $\mathbf{u}_{obs}^{I}(t_i)$ denote the unit vector from the SO to the Sun and the unit vector from the SO to the observer, respectively, and together they define the observation geometry (shown in Figure 1). In addition to $d$, $\rho$, and $F_0$, the Ashikhmin-Shirley BRDF has two exponential factors ($n_u$, $n_v$) that define the reflectance properties of each surface. The Ashikhmin-Shirley diffuse and specular reflectivities are not constant but rather complicated functions of illumination angle, exponential factor, and the diffuse and specular reflectances. In all cases, however, $R_d + R_s \leq 1$, so energy is conserved. The parameters of the Phong model that dictate the directional (local horizontal or vertical) distribution of the specular terms are $n_u$ and $n_v$. The specular bidirectional reflectance for the AS model is given by

$$R_s = \frac{F\sqrt{(n_u+1)(n_v+1)}}{8 c \pi \, \max\!\left[\mathbf{u}_n^{I^{T}}(i)\,\mathbf{u}_{sun}^{I},\ \mathbf{u}_n^{I^{T}}(i)\,\mathbf{u}_{obs}^{I}\right]}\left(\cos\alpha\right)^{\gamma} \qquad (5)$$

where $\gamma = n_u\cos^2(\beta) + n_v\sin^2(\beta)$.
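To make the facet-level computation concrete, the following is a minimal Python sketch, not the authors' implementation, of evaluating Eqs. (1)–(5) for a single facet; the half-angle vector $\mathbf{u}_h$ construction and the $\beta = 0$ simplification (so that $\gamma = n_u$) are assumptions made for this illustration.

```python
# Minimal sketch (not from the paper) of the AS BRDF of Eqs. (1)-(5) for one facet.
import numpy as np

def as_brdf(u_n, u_sun, u_obs, d, rho, F0, n_u, n_v):
    """Total BRDF f_r = d*R_d + s*R_s for one facet (unit vectors, inertial frame)."""
    s = 1.0 - d
    u_h = (u_sun + u_obs) / np.linalg.norm(u_sun + u_obs)   # half-angle vector (assumed)
    c = float(u_obs @ u_h)
    F = F0 + (1.0 / s - F0) * (1.0 - c) ** 5                # Fresnel factor, Eq. (4)
    # Diffuse bidirectional reflectance, Eq. (3)
    R_d = (28.0 * rho / (23.0 * np.pi)) * (1.0 - s * F0) \
        * (1.0 - (1.0 - float(u_n @ u_sun) / 2.0) ** 5) \
        * (1.0 - (1.0 - float(u_n @ u_obs) / 2.0) ** 5)
    # Specular bidirectional reflectance, Eq. (5); alpha is taken as the angle between
    # the facet normal and u_h, and beta = 0 so that gamma = n_u (assumptions).
    gamma = n_u
    R_s = F * np.sqrt((n_u + 1.0) * (n_v + 1.0)) * float(u_n @ u_h) ** gamma \
        / (8.0 * np.pi * c * max(float(u_n @ u_sun), float(u_n @ u_obs)))
    return d * R_d + s * R_s                                 # total BRDF, Eq. (1)
```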

A. Flux Calculation

The apparent magnitude of an SO is the result of sunlight reflecting off of its surfaces along the line-of-sight to an observer. First, the fraction of visible sunlight that strikes an object (and is not absorbed) is computed by

$$F_{sun}(i) = C_{sun,vis}\left[\mathbf{u}_n^{I}(i)\cdot\mathbf{u}_{sun}^{I}\right] \qquad (6)$$

where $C_{sun,vis} = 1062$ W/m$^2$ is the power per square meter impinging on a given object due to visible light striking the surface. If either the angle between the surface normal and the observer's direction or the angle between the surface normal and the Sun's direction is greater than $\pi/2$, then there is no light reflected toward the observer and the fraction of visible light is set to $F_{sun}(i) = 0$. Next, the fraction of sunlight that strikes the object and is reflected toward the observer must be computed:

$$F_{obs}(i) = \frac{F_{sun}(i)\,\rho_{total}(i)\,A(i)\left[\mathbf{u}_n^{I}(i)\cdot\mathbf{u}_{obs}^{I}\right]}{\|\mathbf{d}^{I}\|^{2}} \qquad (7)$$

The reflected light of each facet is then used to compute the total photon flux measured by an observer:

$$\tilde{F} = \left[\sum_{i=1}^{N} F_{obs}(i)\right] + v_{CCD} \qquad (8)$$

where $v_{CCD}$ is the measurement noise associated with the flux measured by a Charge Coupled Device (CCD) sensor. The total photon flux is then used to compute the apparent brightness magnitude

$$m_{app} = -26.7 - 2.5\log_{10}\!\left(\frac{\tilde{F}}{C_{sun,vis}}\right) \qquad (9)$$

where $-26.7$ is the apparent magnitude of the Sun.
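The following is a minimal sketch, with assumed helper names and array shapes, of how the per-facet fluxes of Eqs. (6)–(9) can be summed into an apparent magnitude.

```python
# Minimal sketch (assumed helper, not the authors' code) of Eqs. (6)-(9).
import numpy as np

C_SUN_VIS = 1062.0  # W/m^2, visible solar flux used in Eq. (6)

def apparent_magnitude(u_n, area, rho_total, u_sun, u_obs, range_m, sigma_ccd=0.0):
    """u_n: (N,3) facet normals (inertial); area, rho_total: (N,); range_m: SO-observer range [m]."""
    cos_sun = u_n @ u_sun
    cos_obs = u_n @ u_obs
    lit = (cos_sun > 0.0) & (cos_obs > 0.0)                 # facet must face both Sun and observer
    F_sun = np.where(lit, C_SUN_VIS * cos_sun, 0.0)                            # Eq. (6)
    F_obs = F_sun * rho_total * area * np.maximum(cos_obs, 0.0) / range_m**2   # Eq. (7)
    F_tilde = F_obs.sum() + np.random.normal(0.0, sigma_ccd)                   # Eq. (8), CCD noise
    return -26.7 - 2.5 * np.log10(F_tilde / C_SUN_VIS)                         # Eq. (9)
```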

III. DYNAMICS OF SPACE OBJECTS

A number of parameterizations exist to specify attitude, including Euler angles, quaternions, and Rodrigues parameters.[21] This paper uses the quaternion, which is based on the Euler angle/axis parameterization. The quaternion is defined as $\mathbf{q} \equiv [\boldsymbol{\varrho}^T \;\; q_4]^T$ with $\boldsymbol{\varrho} = \hat{\mathbf{e}}\sin(\nu/2)$ and $q_4 = \cos(\nu/2)$, where $\hat{\mathbf{e}}$ and $\nu$ are the Euler axis of rotation and the rotation angle, respectively. Clearly, the quaternion must satisfy a unit-norm constraint, $\mathbf{q}^T\mathbf{q} = 1$. In terms of the quaternion, the attitude matrix is given by

$$A(\mathbf{q}) = \Xi^T(\mathbf{q})\Psi(\mathbf{q}) \qquad (10)$$

where

$$\Xi(\mathbf{q}) \equiv \begin{bmatrix} q_4 I_{3\times3} + [\boldsymbol{\varrho}\times] \\ -\boldsymbol{\varrho}^T \end{bmatrix} \qquad (11a)$$

$$\Psi(\mathbf{q}) \equiv \begin{bmatrix} q_4 I_{3\times3} - [\boldsymbol{\varrho}\times] \\ -\boldsymbol{\varrho}^T \end{bmatrix} \qquad (11b)$$

with

$$[\mathbf{g}\times] \equiv \begin{bmatrix} 0 & -g_3 & g_2 \\ g_3 & 0 & -g_1 \\ -g_2 & g_1 & 0 \end{bmatrix} \qquad (12)$$

for any general $3\times1$ vector $\mathbf{g}$ defined such that $[\mathbf{g}\times]\mathbf{b} = \mathbf{g}\times\mathbf{b}$. The rotational dynamics are given by the coupled first-order differential equations:

$$\dot{\mathbf{q}}_I^B = \frac{1}{2}\Xi(\mathbf{q}_I^B)\,\boldsymbol{\omega}_{B/I}^B \qquad (13a)$$

$$\dot{\boldsymbol{\omega}}_{B/I}^B = J_{SO}^{-1}\left[\mathbf{T}_{srp}^B + \mathbf{T}^B - \boldsymbol{\omega}_{B/I}^B\times\left(J_{SO}\,\boldsymbol{\omega}_{B/I}^B\right)\right] \qquad (13b)$$

where $\boldsymbol{\omega}_{B/I}^B$ is the angular velocity of the SO with respect to the inertial frame, expressed in body coordinates, and $J_{SO}$ is the inertia matrix of the SO. The vectors $\mathbf{T}_{srp}^B$ and $\mathbf{T}^B$ are the net torque acting on the SO due to solar radiation pressure (SRP), expressed in body coordinates, and the control torque, respectively.
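A minimal sketch of Eqs. (10)–(13) is given below; the function names are illustrative, and the quaternion is assumed scalar-last as defined above.

```python
# Minimal sketch of Eqs. (10)-(13): attitude matrix and rotational dynamics.
import numpy as np

def cross_matrix(g):
    """[g x] of Eq. (12), so that cross_matrix(g) @ b = np.cross(g, b)."""
    return np.array([[0.0, -g[2], g[1]],
                     [g[2], 0.0, -g[0]],
                     [-g[1], g[0], 0.0]])

def xi(q):
    """Xi(q) of Eq. (11a); q = [rho, q4] with the scalar part last."""
    rho, q4 = q[:3], q[3]
    return np.vstack((q4 * np.eye(3) + cross_matrix(rho), -rho.reshape(1, 3)))

def psi(q):
    """Psi(q) of Eq. (11b)."""
    rho, q4 = q[:3], q[3]
    return np.vstack((q4 * np.eye(3) - cross_matrix(rho), -rho.reshape(1, 3)))

def attitude_matrix(q):
    """A(q) = Xi(q)^T Psi(q), Eq. (10)."""
    return xi(q).T @ psi(q)

def rotational_dynamics(q, omega, J_so, T_srp, T_ctrl):
    """Eq. (13): quaternion kinematics and Euler's rotational equation (body frame)."""
    q_dot = 0.5 * xi(q) @ omega
    omega_dot = np.linalg.solve(J_so, T_srp + T_ctrl - np.cross(omega, J_so @ omega))
    return q_dot, omega_dot
```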

IV. SIMULATING LABELED TRAINING DATA

Labeled training data samples are generated using the light curve model discussed earlier and by sampling the parameters required to define the AS light curve model. The SO shape and surface parameter models are randomly generated and can be generically grouped into four categories: fragments, regular polygon prisms, rocket bodies, and rectangular cuboids. The regular polygon prisms are further divided into equilateral triangular prisms, square prisms, and regular hexagonal prisms. The regular polygon prisms are prisms whose ends (i.e., top and bottom) are regular polygons. The shape of a regular polygon prism is defined by the number of sides $n$, side length $s$, and height $h$, which are sampled as

$$h_{regular} = (h_{min} + 0.01) + (h_{max} - h_{min} - 0.01)\,U[0,1] \qquad (14a)$$
$$s_{regular} = (s_{min} + 0.01) + (s_{max} - s_{min} - 0.01)\,U[0,1] \qquad (14b)$$

Assuming constant density throughout the shape model, the moment of inertia matrices for each of the regular polygon models are given by

$$J_{triangle} = m_{SO}\,\mathrm{diag}\!\left(\frac{s^2}{24} + \frac{h^2}{12},\ \frac{s^2}{24} + \frac{h^2}{12},\ \frac{s^2}{12}\right) \qquad (15a)$$
$$J_{square} = m_{SO}\,\mathrm{diag}\!\left(\frac{s^2}{12} + \frac{h^2}{12},\ \frac{s^2}{12} + \frac{h^2}{12},\ \frac{s^2}{6}\right) \qquad (15b)$$
$$J_{hexagon} = m_{SO}\,\mathrm{diag}\!\left(\frac{5s^2}{24} + \frac{h^2}{12},\ \frac{5s^2}{24} + \frac{h^2}{12},\ \frac{5s^2}{24}\right) \qquad (15c)$$

The rectangular cuboids are prisms defined by two side lengths $s_1$ and $s_2$ as well as the height $h$. The moment of inertia matrix for the cuboids is given by

$$J_{cuboid} = \frac{m_{SO}}{12}\begin{bmatrix} s_2^2 + h^2 & 0 & 0 \\ 0 & s_1^2 + h^2 & 0 \\ 0 & 0 & s_1^2 + s_2^2 \end{bmatrix} \qquad (16)$$
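As an illustration, a minimal sketch of the uniform sampling of Eq. (14) applied to a cuboid together with the inertia of Eq. (16) is given below; the mass bounds are assumptions, while the dimension bounds follow the [0.01, 5] m interval stated below.

```python
# Minimal sketch of shape sampling (cf. Eq. (14)) and the cuboid inertia of Eq. (16).
# The mass bounds m_min and m_max are illustrative assumptions.
import numpy as np

def sample_cuboid(m_min=10.0, m_max=1000.0, dim_min=0.01, dim_max=5.0):
    s1, s2, h = dim_min + (dim_max - dim_min) * np.random.rand(3)   # side lengths, height [m]
    m_so = m_min + (m_max - m_min) * np.random.rand()               # mass [kg], uniform sample
    J = (m_so / 12.0) * np.diag([s2**2 + h**2, s1**2 + h**2, s1**2 + s2**2])  # Eq. (16)
    return s1, s2, h, m_so, J
```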

Models are generated by sampling side lengths and heights from a uniform distribution on the interval [0.01, 5] m. For the regular polygon prisms, the number of sides is also selected randomly on the interval [3, 6], with all instances of 5 sides being set to 4, as pentagonal prism models are not included. In addition to the model geometry, the material properties also need to be defined. For each model, all facets are assumed to have the following: $R_{spec} = 0.7$, $R_{diff} = 0.3$, $\epsilon = 0.5$. The Phong parameters $n_u$ and $n_v$ are each taken to be equal to 1000 for all facets of every model. The mass of the SO is randomly sampled using $m_{SO} = m_{min} + (m_{max} - m_{min})\,U[0,1]$.

Rocket body models are generated using the octant triangulation of a sphere discussed in Ref. [22], which divides the surface of a sphere into $N$ facet normals. Rocket body models are then generated by connecting two hemispherical ends of radius $r$ with a cylinder of height $l$. This model is not exact for all rocket bodies but is close enough to approximate the types of light curves seen for rocket bodies. The corresponding moment of inertia matrix is

$$J_{rocket} = m_{SO}\Bigg\{\frac{V_{cyl}}{V_{tot}}\,\mathrm{diag}\!\left(\tfrac{1}{12}(3r^2+l^2),\ \tfrac{1}{12}(3r^2+l^2),\ \tfrac{r^2}{2}\right) + \frac{V_{top}}{V_{tot}}\,\mathrm{diag}\!\left(\tfrac{1}{12}(3r^2+l^2),\ \tfrac{1}{12}(3r^2+l^2),\ \tfrac{r^2}{2}\right) + \left[\frac{V_{cyl}}{V_{tot}}\left(\frac{l}{2}+\frac{3r}{8}\right) + \frac{V_{top}}{V_{tot}}\left(\frac{l}{2}-\frac{3r}{8}\right)\right]\left(I_{3\times3} - \mathbf{e}\mathbf{e}^{T}\right) + 2\,\frac{V_{top}}{V_{tot}}\,r^{2}\,\mathrm{diag}\!\left(\tfrac{83}{320},\ \tfrac{83}{320},\ \tfrac{2}{5}\right)\Bigg\} \qquad (17)$$

where $\mathbf{e} = [0,\,0,\,1]^{T}$, the volume of the top hemisphere is $V_{top} = \tfrac{2}{3}\pi r^{3}$, and it is assumed that the bottom volume is $V_{bot} = V_{top}$. The volume of the cylinder is $V_{cyl} = \pi r^{2} l$, and the total volume is $V_{tot} = V_{top} + V_{bot} + V_{cyl}$. Finally, the fragment shapes use the cuboid model but with

much smaller aspect ratios than the payload shapes. Figure 2 shows the training data generated using the process discussed above, where the labels are shown using colored data points.

Fig. 2: Labeled Training Data (apparent magnitude versus time in minutes).

V. CONVOLUTIONAL NEURAL NETWORK CLASSIFICATION

The training data set consists of light curve measurement vectors as inputs and class vectors as outputs. The input light curve and output class vector are denoted by $\mathbf{x} \in \mathbb{R}^{1\times m}$ and $\mathbf{y} \in \mathbb{R}^{1\times n_c}$, respectively, where $m$ and $n_c$ denote the number of light curve measurements and the number of classes, respectively. A deep neural network with convolutional layers and a fully connected output layer is then trained to map from the measurement vector, $\mathbf{x}$, to the classes, $\mathbf{y}$, using a set of training examples. The CNN used for this work consists of convolutional layers with rectified linear unit (ReLU) activation, dropout, and max-pooling, and one fully connected layer with ReLU activation (Figure 3). The output layer uses a softmax function to map to the classification states. Each convolutional layer has the following form:

$$h^{cov}(\mathbf{x}) = f\left(\mathbf{x} * W^{cov} + \mathbf{b}^{cov}\right) \qquad (18)$$

where $*$ denotes the convolution operator, $W^{cov}$ denotes the convolution kernel, and $f$ is the activation function for each layer that adds nonlinearity to the feature vector. This work uses the ReLU for convolutional layer activation. The ReLU function is given by $f(x) = \max(0, x)$: it is zero for negative input values and linear for positive input values. The convolution of $\mathbf{x}$ with $W^{cov}$ defines the output feature map $h^{cov}(\mathbf{x})$. The number of output maps is determined by the number of convolution filters in each convolutional layer. Each convolution layer has a collection of kernels of given size that are learned directly from the data. The sizes and number of kernels for each convolution layer used for this work are given in Figure 3. For the light curve problem, the convolutions are over one-dimensional time-series data. For input vectors of size $(1, s_x)$, the output vector has size $(1, s_y)$, where the output vector size can be calculated from the size of the kernel, $s_k$, as $s_y = s_x - s_k + 1$. After the convolution is applied, the output vector is reduced in size, but zero padding is used in this work to keep the size of the feature vectors constant through each convolutional layer.
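A minimal sketch of the one-dimensional "valid" convolution of Eq. (18) and the output-size relation $s_y = s_x - s_k + 1$ is shown below; the input length and kernel length are illustrative values, not taken from the paper.

```python
# Minimal sketch of one feature map of Eq. (18) and the output-size relation.
import numpy as np

def conv1d_relu(x, w, b):
    """One feature map of a convolutional layer: ReLU(x * w + b), 'valid' convolution."""
    s_x, s_k = x.size, w.size
    s_y = s_x - s_k + 1                      # output length before any zero padding
    h = np.empty(s_y)
    for j in range(s_y):
        h[j] = np.dot(x[j:j + s_k], w) + b   # sliding dot product (correlation form)
    return np.maximum(h, 0.0)                # ReLU: f(x) = max(0, x)

x = np.random.randn(720)     # a light curve with s_x = 720 samples (assumed length)
w = np.random.randn(32)      # a learned kernel of length s_k = 32 (illustrative)
print(conv1d_relu(x, w, 0.1).shape)          # -> (689,) = 720 - 32 + 1
```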

Fig. 4: Classification Accuracy (accuracy versus training iteration).

Then a CNN applies a series of these kernels $W^{cov}$ in a layered fashion, where each layer has a different size kernel that learns features on a given scale. To simplify the information in the output of each convolutional layer, max-pooling is used. In this work, max-pooling with a $1\times4$ kernel is applied between convolution layers. The max-pooling operation convolves the $1\times4$ kernel over the output of each convolutional layer, returning the maximum of the outputs over each $1\times4$ region. Finally, at the final layer a nonlinear function is applied to the output using a traditional neural network. This final layer uses a fully connected neural network layer and is given by

$$h^{fc}(\mathbf{x}) = f\left(W^{fc}\mathbf{x} + \mathbf{b}^{fc}\right) \qquad (19)$$

After the fully connected layer, a softmax function is used to provide outputs in the range (0, 1) that add up to 1. The softmax function is defined by

$$y_j = \frac{\exp\!\left(h_j^{fc}\right)}{\sum_i \exp\!\left(h_i^{fc}\right)} \qquad (20)$$

The convolutional kernel and fully connected output layer parameters are collected into the vector $\theta$. The cost function used for this work is the cross-entropy loss, which measures the discrepancy between the training outputs and the CNN outputs and is given by

$$L(\theta) = \frac{1}{N}\sum_{i=1}^{N} H(\mathbf{y}, \tilde{\mathbf{y}}) = -\frac{1}{N}\sum_{i=1}^{N}\left[\mathbf{y}\log\tilde{\mathbf{y}} + (1 - \mathbf{y})\log(1 - \tilde{\mathbf{y}})\right] \qquad (21)$$

where $\tilde{\mathbf{y}}$ are the training examples and $\mathbf{y}$ are the outputs from the CNN. The CNN classification approach is then trained by stochastic gradient descent, minimizing the cross-entropy loss between the outputs and the labelled data. Figure 3 shows the full architecture of the network used for this work. LeCun [14] showed that stochastic online learning is superior to full batch mode, as it is faster and results in more accurate solutions. The weights for the output layer and the convolutional layers are updated using the following relationship:

$$\theta(t+1) = \theta(t) + \eta\,\frac{\partial L}{\partial \theta} \qquad (22)$$

where $t$ denotes the iteration step and $\eta$ is the learning rate. The term $\partial L/\partial\theta$ is the gradient of the loss with respect to the overall network parameters. This update is calculated for small batches over the entire training set. Using small batches allows for small updates to the gradient while reducing the noise in the gradient of individual training samples. This method of updating the parameters is referred to as stochastic gradient descent, and the gradient is calculated with error back-propagation.

Fig. 3: Network Architecture.
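For illustration, the following is a minimal sketch, not the authors' code, of a 1-D CNN of the kind described in this section: three convolutional layers with ReLU activation, $1\times4$ max-pooling and dropout after each, a fully connected ReLU layer, and a softmax output (Eqs. (18)–(20)), trained with the cross-entropy loss of Eq. (21) by mini-batch gradient descent, Eq. (22). The kernel lengths (32, 12, 6) and dropout rates (0.7, 0.5) follow the values reported in Section VII (interpreting the dropout rates as drop fractions); the number of filters, the dense-layer width, and the input length are assumptions.

```python
# Minimal sketch of a light-curve CNN of the type described; sizes marked below are assumptions.
import tensorflow as tf

num_measurements = 720   # one light curve sampled every 5 s for 1 hr (assumed input length)
num_classes = 4          # e.g., rocket body, controlled/uncontrolled payload, debris

model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(16, kernel_size=32, padding="same", activation="relu",
                           input_shape=(num_measurements, 1)),   # 16 filters: assumption
    tf.keras.layers.MaxPooling1D(pool_size=4),
    tf.keras.layers.Dropout(0.7),
    tf.keras.layers.Conv1D(16, kernel_size=12, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling1D(pool_size=4),
    tf.keras.layers.Dropout(0.7),
    tf.keras.layers.Conv1D(16, kernel_size=6, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling1D(pool_size=4),
    tf.keras.layers.Dropout(0.7),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),   # fully connected width: assumption
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])

# Cross-entropy loss of Eq. (21) minimized by mini-batch gradient descent, Eq. (22).
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
              loss="categorical_crossentropy", metrics=["accuracy"])
```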

VI. SIMULATION RESULTS

In this section the classification approach is tested using simulated data. The CNN approach requires training data, and our approach is trained on 45,000 randomly generated scenarios. After training, the CNN approach is tested against 5,000 data scenarios not used in the training set. For all training scenarios, an SO is in a near-geosynchronous orbit with orbital elements given by $a = 42{,}364.17$ km, $e = 2.429\times10^{-4}$, $i = 30$ deg, $\omega = \Omega = 0.0$ deg, and $M_0 = 91.065$ deg. The simulation epoch is 15-March-2010 at 04:00:00 GST. The initial quaternion and angular rate of the SO are given by $\mathbf{q}_I^B \equiv [0.7041\ \ 0.0199\ \ 0.0896\ \ 0.7041]^T$ and $\boldsymbol{\omega}_{B/I}^B = [206.26\ \ 103.13\ \ 540.41]^T$ deg/hr.


Brightness magnitudes are simulated using a ground station located at 20.71° North latitude, 156.26° West longitude, and 3,058.6 m altitude. Measurements constructed using the instantaneous geometry are corrupted by zero-mean Gaussian white noise with a standard deviation of 0.1 for the brightness magnitude.[23] Observations are available every 5 seconds for one hour (Figure 2). All training samples are generated using the same orbit during the same simulation time interval. Each sample has a different shape model, rotational initial condition, and control profile.

Fig. 5: CNN Classification Results: learned kernels versus time (min) for (a) the first, (b) the second, and (c) the third convolution layer.

Fig. 6: Cross-Entropy Error (cross-entropy loss versus training iteration).

VII. CNN CLASSIFICATION RESULTS

Python and TensorFlow[24] are used as the simulation environment for this work. The training of the CNN classification approach is computationally expensive, but it is expected that, once trained on a larger dataset, this approach can outperform traditional methods while providing a computationally efficient classification model. The CNN used in the present work has a four-layer structure: three convolution layers followed by a fully connected layer. The first three layers use kernels of size 32, 12, and 6, respectively. Both max-pooling (1×4 pooling kernel) and dropout are applied after each convolutional layer. The dropout rates used for this work are 0.7 and 0.5 for the convolutional and fully connected layers, respectively. The training data consist of simulated light curve measurements as inputs and class states as outputs. For this study we only consider shape classes with one control class, but other classes can be added in the same CNN or with independent CNNs for each class. The classes considered are rocket bodies, controlled payloads, uncontrolled payloads, and debris.

Figure 5 shows the CNN classification kernel features estimated during the training stage. From Figures 5(a), 5(b), and 5(c) it can be seen that the CNN approach learns light curve features that are relevant to the classification of the classes considered. The first filter has the shape of a derivative-type operator with a large emphasis on the center location. The CNN learns that, for this dataset, the best low-level operation is this derivative-type operator. The higher layers (layers 2 and 3) have complex kernel shapes (as seen in Figures 5(b) and 5(c)) that do not lend themselves to clear interpretations. Some noise is seen in the learned kernels, but this effect

can possibly be reduced with a larger training set and longer training durations; this will be examined in future work. These filters are also learned for this particular selection of classes; it may be possible to learn more general features if more classes are considered, and future work will consider a larger number of classes. The CNN approach reached an overall accuracy of 99.6% correct classification on the set of 5,000 test samples. Figure 6 shows the cross-entropy loss as a function of gradient descent iteration. From this figure, it can be seen that for the CNN architecture used, the loss value has converged to a steady-state value.

VIII. CONCLUSION

In this paper a data-driven classification approach based on the Convolutional Neural Network (CNN) scheme is used for Space Object (SO) classification using light curve data. The classification approach determines the shape class from light curve observations of an SO. These classes are rocket bodies, payloads, and debris. CNNs are a feature learning and classification architecture. The convolutional layers are built on a set of filters that extract local features. This work uses three convolutional layers with kernels of size 32, 12, and 6. This data-driven CNN approach has the benefit of a simpler implementation that does not require modeling of the data. For the shape classification problem, the CNN approach reached an overall accuracy of 99.6% correct classification. This work showed that CNNs offer an accurate and computationally efficient way of classifying SOs.

REFERENCES

[1] Linares, R., Jah, M. K., Leve, F. A., Crassidis, J. L., and Kelecy, T., "Astrometric and Photometric Data Fusion For Inactive Space Object Feature Estimation," Proceedings of the International Astronautical Federation, Cape Town, South Africa, Sept. 2011, Paper ID: 11340.
[2] Linares, R., Jah, M. K., and Crassidis, J. L., "Inactive Space Object Shape Estimation via Astrometric And Photometric Data Fusion," AAS/AIAA Space Flight Mechanics Meeting, No. AAS Paper 2012-117, Charleston, SC, Jan.-Feb. 2012.
[3] Linares, R., Jah, M. K., and Crassidis, J. L., "Space Object Area-To-Mass Ratio Estimation Using Multiple Model Approaches," Advances in the Astronautical Sciences, Vol. 144, 2012, pp. 55–72.
[4] Linares, R., Jah, M. K., Crassidis, J. L., and Nebelecky, C. K., "Space Object Shape Characterization and Tracking Using Light Curve and Angles Data," Journal of Guidance, Control, and Dynamics, Vol. 37, No. 1, 2013, pp. 13–25.
[5] Hinks, J. C., Linares, R., and Crassidis, J. L., "Attitude Observability from Light Curve Measurements," AIAA Guidance, Navigation, and Control (GNC) Conference, No. 10.2514/6.2013-5005, AIAA, Boston, MA, August 2013.
[6] Hall, D., Calef, B., Knox, K., Bolden, M., and Kervin, P., "Separating Attitude and Shape Effects for Non-resolved Objects," The 2007 AMOS Technical Conference Proceedings, 2007, pp. 464–475.
[7] Jah, M. and Madler, R., "Satellite Characterization: Angles and Light Curve Data Fusion for Spacecraft State and Parameter Estimation," Proceedings of the Advanced Maui Optical and Space Surveillance Technologies Conference, Vol. 49, Wailea, Maui, HI, Sept. 2007, Paper E49.
[8] Holzinger, M. J., Alfriend, K. T., Wetterer, C. J., Luu, K. K., Sabol, C., and Hamada, K., "Photometric attitude estimation for agile space objects with shape uncertainty," Journal of Guidance, Control, and Dynamics, Vol. 37, No. 3, 2014, pp. 921–932.
[9] Linares, R., Shoemaker, M., Walker, A., Mehta, P. M., Palmer, D. M., Thompson, D. C., Koller, J., and Crassidis, J. L., "Photometric Data from Non-Resolved Objects for Space Object Characterization and Improved Atmospheric Modeling," Advanced Maui Optical and Space Surveillance Technologies Conference, Vol. 1, 2013, p. 32.

[10] Linares, R., Jah, M. K., Crassidis, J. L., Leve, F. A., and Kelecy, T., "Astrometric and photometric data fusion for inactive space object feature estimation," Proceedings of the 62nd International Astronautical Congress, International Astronautical Federation, Vol. 3, 2011, pp. 2289–2305.
[11] Gaylor, D. and Anderson, J., "Use of Hierarchical Mixtures of Experts to Detect Resident Space Object Attitude," Advanced Maui Optical and Space Surveillance Technologies Conference, Vol. 1, 2014, p. 70.
[12] Wetterer, C. J., Linares, R., Crassidis, J. L., Kelecy, T. M., Ziebart, M. K., Jah, M. K., and Cefola, P. J., "Refining space object radiation pressure modeling with bidirectional reflectance distribution functions," Journal of Guidance, Control, and Dynamics, Vol. 37, No. 1, 2013, pp. 185–196.
[13] Linares, R. and Crassidis, J. L., "Resident Space Object Shape Inversion via Adaptive Hamiltonian Markov Chain Monte Carlo," AAS/AIAA Space Flight Mechanics Meeting, No. AAS Paper 2016-514, Napa, CA, Feb. 2016.
[14] LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P., "Gradient-based learning applied to document recognition," Proceedings of the IEEE, Vol. 86, No. 11, 1998, pp. 2278–2324.
[15] LeCun, Y., Bengio, Y., and Hinton, G., "Deep learning," Nature, Vol. 521, No. 7553, 2015, pp. 436–444.
[16] Lee, H., Pham, P., Largman, Y., and Ng, A. Y., "Unsupervised feature learning for audio classification using convolutional deep belief networks," Advances in Neural Information Processing Systems, 2009, pp. 1096–1104.
[17] Krizhevsky, A., Sutskever, I., and Hinton, G. E., "Imagenet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[18] Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., and Oliva, A., "Learning deep features for scene recognition using places database," Advances in Neural Information Processing Systems, 2014, pp. 487–495.
[19] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., and Fei-Fei, L., "Large-scale video classification with convolutional neural networks," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 1725–1732.
[20] Ashikmin, M. and Shirley, P., "An Anisotropic Phong Light Reflection Model," Tech. Rep. UUCS-00-014, University of Utah, Salt Lake City, UT, 2000.
[21] Shuster, M. D., "A Survey of Attitude Representations," Journal of the Astronautical Sciences, Vol. 41, No. 4, Oct.-Dec. 1993, pp. 439–517.
[22] Kaasalainen, M. and Torppa, J., "Optimization methods for asteroid lightcurve inversion: I. shape determination," Icarus, Vol. 153, No. 1, 2001, pp. 24–36.
[23] Hall, D. T., Africano, J. L., Lambert, J. V., and Kervin, P. W., "Time-Resolved I-Band Photometry of Calibration Spheres and NaK Droplets," Journal of Spacecraft and Rockets, Vol. 44, No. 4, July 2007, pp. 910–919.
[24] Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., et al., "TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems," arXiv preprint arXiv:1603.04467, 2016.