2014 IEEE International Conference on Big Data

Iterative Refinement of Multiple Targets Tracking of Solar Events

Dustin Kempton, Karthik Ganesan Pillai, Rafal Angryk
Department of Computer Science, Georgia State University
P.O. Box 5060, Atlanta GA 30302-5060, USA
[email protected], [email protected], [email protected]

Abstract—In this paper, we combine two approaches to multiple-target tracking: the first is a hierarchical approach that iteratively grows track fragments across gaps in detections, and the second is a network flow based optimization method for data association. We introduce a new parallel algorithm for initial track fragment formation as the base of the hierarchical approach. The network flow based optimization method is then utilized for the remaining levels of the hierarchy. This process is applied to solar data retrieved from the Heliophysics Event Knowledgebase (HEK). We compare our results to labeled data from the same source, and show improvements over a non-hierarchical sequential approach.

I. INTRODUCTION

With the launch of NASA's Solar Dynamics Observatory (SDO) mission, solar physics researchers entered the big data era. SDO's space telescope registers approximately 70,000 high-resolution (4096 by 4096 pixel) images daily, obtaining one image every ten seconds. Instruments on the SDO space telescope record data at a high spatial resolution and time cadence, which equates to about 0.55 petabytes of raster data per year [1]. This trend in data acquisition in solar physics is expected to grow to even larger volumes with the introduction of the ground-based Daniel K. Inouye Solar Telescope (DKIST) [2]. These large data sets create a pioneering opportunity to observe correlations between various solar phenomena (events) that have previously gone unexplored.

To facilitate the important needs of space weather monitoring (which can have vital impacts on space and air travel, power grids, GPS, and communication devices), many software modules developed by the Feature Finding Team (FFT) work continuously on the massive SDO raster data and generate object data with spatiotemporal characteristics. The object data generated by the FFT software modules is reported directly to the Heliophysics Event Knowledgebase (HEK), and is available to the public online [3]. The spatiotemporal data reported in the HEK is used to catalog, explore, track, and correlate these solar events, since correlations between solar events occur frequently. To accurately capture these correlations, it is essential to develop tracking techniques that are capable of handling multiple solar event instance hypotheses concurrently.

978-1-4799-5666-1/14/$31.00 ©2014 IEEE

II. MULTIPLE HYPOTHESIS TRACKING

In multiple hypothesis tracking, the goal is to estimate the states of moving objects by utilizing reports received from sensors. Sensor reports do not generally contain identity information about the targets being reported on. It is also generally understood that some of the reports may be false returns, and that the probability of detection is less than one, as no computer vision system has been shown to be correct at all times. Because of these last two observations, it is incorrect to treat sensor reports as true target observations; they must be considered only as possible target observations. Thus, the most crucial component of any multiple target tracking algorithm is the association of sensor reports in order to create possible target tracks [4].

A. Hypothesis Formulation as Maximum a Posteriori

We begin by defining the maximum a posteriori problem as in [5] and [6] by letting \(X = \{x_i\}\) be a set of detection reports. Each detection response \(x_i = (p_i, s_i, a_i, t_i)\) contains various information about the object, where \(p_i\) is the spatial position, \(s_i\) is the spatial representation of the object, \(a_i\) is the physical appearance (e.g., color histogram), and \(t_i\) is the time of the start of the detection. We further define a single trajectory hypothesis of one particular object as an ordered list of object observations \(T_k = \{o_{k_1}, o_{k_2}, \ldots, o_{k_l}\}\), where \(o_{k_i} \in X\). An association hypothesis \(T\) is then defined as a set of single trajectory hypotheses, \(T = \{T_k\}\) [5].
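The structure of a detection report and of an association hypothesis can be sketched in a few lines; the field names below are illustrative stand-ins for \(p_i, s_i, a_i, t_i\) and are not part of the HEK schema:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Detection:
    """One sensor report x_i (field names are illustrative, not the HEK schema)."""
    position: Tuple[float, float]       # p_i: spatial position
    shape: List[Tuple[float, float]]    # s_i: spatial representation (e.g., boundary)
    appearance: List[float]             # a_i: appearance descriptor (e.g., a histogram)
    start_time: float                   # t_i: start time of the detection

@dataclass
class Trajectory:
    """A single trajectory hypothesis T_k: an ordered list of observations."""
    observations: List[Detection] = field(default_factory=list)

# An association hypothesis T is a collection of trajectories whose
# observation lists are pairwise disjoint.
hypothesis: List[Trajectory] = [Trajectory([Detection((0, 0), [], [1.0], 0.0)])]
```
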

We now define the data association problem as maximizing the posterior probability of \(T\) given the set of observations \(X\):

\[
T^{*} = \operatorname*{argmax}_{T} P(T \mid X)
      = \operatorname*{argmax}_{T} P(X \mid T)\,P(T)
      = \operatorname*{argmax}_{T} \prod_{i} P(x_i \mid T)\,P(T) \tag{1}
\]

For the final step of equation (1) to hold, we must assume that the likelihood probabilities are conditionally independent given the hypothesis \(T\), meaning that knowledge about the truth of hypothesis \(T\) does not change the belief in the likelihood of seeing observation \(x_i\).

We now introduce the likelihood function of observation to better show that this is indeed a safe assumption. Here, the likelihood function of observation \(x_i\) is \(P(x_i \mid T)\). This function is used to model the cases of an observation being a true detection and the cases of an observation being a false detection. There are differing ways of modeling this behavior. For instance, [5] utilizes a Bernoulli distribution and simply assigns a constant probability of an observation being a false detection, whereas [6] utilizes a relation, learned from annotated data, between the size of a detection and its probability of being a false or true positive. We feel that the assumption of a constant value is not the best solution, as not all detections are of equal validity. As such, we chose to use a Poisson distribution for modeling this value, since data annotated with identity information was not available for all event types that are of interest to us, as it was in [6]. We used the distribution

\[
P(x_i \mid T) = \frac{\lambda^{k_i} e^{-\lambda}}{k_i!},
\]

where \(k_i\) is the absolute value of the change in the number of detections from the previous frame to the frame of \(x_i\), and \(\lambda\) is the expected change. Here, \(\lambda\) can be calculated from the data, even in the absence of data annotated with tracked events, by partitioning the data into temporally non-overlapping sets and counting detections in each partition. In the cases of [5], [6], and our method, knowledge of \(T\) does not factor into these calculations, and thus our assumption that knowledge about the truth of hypothesis \(T\) does not change the belief in the likelihood of seeing observation \(x_i\) is indeed correct.

Now that we have defined the likelihood function of observation, we can turn our attention to the prior probability of the association hypothesis \(T\), or \(P(T)\). First, however, note that the optimization of equation (1) is difficult due to the combinatorial size of the hypothesis space. This leads to the need to reduce its size, which is possible due to the assumption that one object can only belong to one trajectory, i.e. \(T_k \cap T_l = \emptyset\) for \(k \neq l\). Furthermore, by assuming that the motion of each object is independent, we can rewrite equation (1) as follows:

\[
T^{*} = \operatorname*{argmax}_{T} \prod_{i} P(x_i \mid T) \prod_{T_k \in T} P(T_k) \tag{2}
\]

Figure 1.A. Sensor reports are associated into pure track fragments, and ambiguous regions are resolved later.

In this case, the prior of each trajectory in (2) is modeled as a Markov chain, which includes an initialization probability \(P_{entr}\) at its initial time step, a termination probability \(P_{exit}\) at its final time step, and the transition probabilities \(P_{link}(o_{k_{i+1}} \mid o_{k_i})\):

\[
P(T_k) = P(\{o_{k_0}, o_{k_1}, \ldots, o_{k_l}\})
       = P_{entr}(o_{k_0})\, P_{link}(o_{k_1} \mid o_{k_0})\, P_{link}(o_{k_2} \mid o_{k_1}) \cdots P_{link}(o_{k_l} \mid o_{k_{l-1}})\, P_{exit}(o_{k_l}) \tag{3}
\]

Figure 1.B. The track fragment graph represents possible associations. A1 and A2 would be represented as multiple nodes, as there are multiple detections inside these regions.

The exact representations of these probability functions are discussed in Section IV. In the next section, we discuss modeling the data association problem as a directed acyclic graph and then using it to solve the maximum a posteriori problem.

III. TRACK GRAPH AND MULTIPLE HYPOTHESIS TRACKING

The problem of data association can be represented as a directed acyclic graph called a track fragment graph. In this graph, the nodes are sensor reports that may be measurements or track fragments. For example, in Fig. 1.A and 1.B, multiple sensor reports are grouped into track fragments and labeled as nodes in the graph. Edges/links that connect two nodes are created only if the nodes have the possibility of representing the same object trajectory. In order to be considered a possible representation of the same object trajectory, the start time of the successor track fragment must be later than the end time of the predecessor track fragment.
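The edge-creation rule just described can be sketched with hypothetical track fragments that carry only start and end times:

```python
from typing import Dict, List, Tuple

# Hypothetical track fragments as (start_time, end_time) pairs.
fragments = {"A": (0, 2), "B": (3, 5), "C": (3, 6), "D": (7, 9)}

def build_track_graph(frags: Dict[str, Tuple[int, int]]) -> Dict[str, List[str]]:
    """Create an edge (u, v) only when v could continue u's trajectory,
    i.e. v starts after u ends; the result is a directed acyclic graph."""
    return {
        u: sorted(v for v in frags
                  if v != u and frags[v][0] > frags[u][1])
        for u in frags
    }

graph = build_track_graph(fragments)
print(graph)   # fragment "A" can be continued by "B", "C", or "D"
```
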

In the track fragment graph, a hypothesis of an object's movement is a path in the graph, and a global association hypothesis is a set of consistent tracks. To be consistent, no two tracks in the global association hypothesis can share a single report \(x_i\).

A. Hypothesis Management

If it is assumed that any report \(x_i\) can be associated with any report in a subsequent frame, then for \(m\) frames with \(n\) reports per frame, the number of all possible tracks is on the order of \(n^m\). This means that the optimal hypothesis is found by solving an NP-hard optimization problem, which grows exponentially in size with the set of report frames [7]. Also note that this is made even worse when we allow association across skipped frames; in other words, when we recognize that some events may not be detected in a particular frame. By doing so, we are not just considering the next frame, but \(k\) frames at each time step, where \(k\) is the number of allowed skipped frames. This leads to on the order of \(n^{km}\) track hypotheses, because we are now considering \(km\) frames instead of the original \(m\) frames, which multiplies the exponent of an already exponentially growing problem by a constant. It is now clear that any practical solution requires a reduction in the number of potential track hypotheses. A common approach used to reduce this number is to reduce the number of reports. This is accomplished by conservatively grouping reports of consecutive frames that fulfill some constraints. The constraint used in our algorithm is that there is one, and only one, report that has the potential to be a match in a successive frame, based on some heuristic. In our work, intended for solar activity, we utilize the known differential rotation of the sun derived from work done in [8], coupled with the time between consecutive frames of reports, to predictively reduce the search area, as displayed in Fig. 2. Then, we link only those reports that have one, and only one, report in the subsequent frame that significantly overlaps the predicted position of the report from the previous frame. In doing this, the growth of the underlying data association problem can be made much slower than the exponential growth expressed previously.
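The growth described above is easy to check numerically; the counts below are purely illustrative:

```python
# With n reports per frame over m frames, exhaustive association yields on
# the order of n**m track hypotheses; allowing k skipped frames raises the
# exponent to k*m, multiplying the exponent by a constant.
def track_hypotheses(n: int, m: int, k: int = 1) -> int:
    return n ** (k * m)

print(track_hypotheses(10, 6))      # no frame skipping: 10^6
print(track_hypotheses(10, 6, 3))   # 3 allowed skipped frames: 10^18
```
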

Figure 2. A report (polygon represented as chain code) and search area for next report (quadrilateral) in subsequent frame.

Another common approach to reducing the number of potential track hypotheses is top-down pruning [9], which eliminates hypotheses that conflict with the best global hypothesis and that fall outside of a window of time. In this approach, non-optimal hypotheses that contain track fragments that are part of the best global solution, and which are older than some time window, are removed. This approach, however, risks prematurely eliminating correct hypotheses.

To avoid this risk, we choose to use the graphical formulation of the track stitching problem [4]. This approach combines the reduction power of using track fragments instead of individual reports with an implicit, rather than explicit, representation of hypotheses. The hypotheses of individual tracks are implicitly represented as paths through the graph. With this representation, the number of track hypotheses still grows at an exponential rate with respect to the number of track fragments generated by conservative grouping, but the number of nodes and edges in the graph grows at a linear rate with the number of track fragments, thereby significantly reducing the information that must be stored.

B. Track Fragment Generation

In order to generate a track fragment graph, individual detection reports need to be grouped into track fragments. To do so, we take advantage of the fact that each detection happens in a discrete location in space and time, meaning that each detection report \(x_i\) can be processed independently of the vast majority of the other detection reports in the dataset. Specifically, the only detection reports that must be considered are those that represent a detection immediately following, in time, the detection report currently being processed, and that are in the spatial neighborhood of that same event. Since we limit track fragment generation to only those detection reports that have one, and only one, spatio-temporal neighbor, we can ignore the data dependency that would arise from the possibility of more than one report being associated with a later detection. Even if the events get added to different track fragments by two separate threads, the track fragments will be equivalent, because there was only one spatio-temporal neighbor for each event to be linked to. This allows us to simply discard those results that are duplicated by parallel execution.

Because we can judiciously relax or ignore some data dependencies, we are able to exploit aggressive parallelism, as shown in [10]. There it was shown that applications in which each iteration is likely to update only a small portion of the model, in which iterations are highly likely to update different parts of the model, and which can tolerate "errors" introduced by executing iterations in parallel, scale well in parallel processing environments. Similarly, "error" in the form of duplicate track fragments can be tolerated, and a numerically equivalent result does not need to come from the parallel portion of an algorithm that follows these constraints, as the duplicate results can be discarded.

So, an algorithm for track fragment generation can be trivially implemented on a parallel random-access machine, as presented in Algorithm 1, "PRAM Track Fragment Generation". In the algorithm, each detection is assumed to be a node in a doubly-linked list, with its predecessor and successor initialized to null pointers. It is also assumed that the nodes contain timestamps indicating the start and end of the detection, which are used to predict the location of the node at the beginning of the next time step. Furthermore, it is also assumed that each node contains information about the location, \(p_i\), and shape of the detection, \(s_i\), which are used to calculate the overlap between the predicted location and the location of the detection being considered as a target for track fragment generation.

Algorithm 1. PRAM Track Fragment Generation
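As a rough illustration of the linking rule that Algorithm 1 parallelizes, the sequential sketch below links a detection only when exactly one detection in the next frame overlaps its predicted position; the motion model here is a placeholder for the differential-rotation prediction of [8], and all names are hypothetical:

```python
from typing import Dict, List, Optional, Tuple

class Node:
    """A detection as a doubly-linked-list node (per Algorithm 1)."""
    def __init__(self, frame: int, bbox: Tuple[float, float, float, float]):
        self.frame = frame
        self.bbox = bbox                     # stand-in for location/shape
        self.pred: Optional["Node"] = None
        self.succ: Optional["Node"] = None

def predicted_bbox(node: Node) -> Tuple[float, float, float, float]:
    # Placeholder motion model; the paper uses solar differential rotation [8].
    return node.bbox

def overlaps(a, b) -> bool:
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1

def link_fragments(frames: Dict[int, List[Node]]) -> None:
    """Link a detection to a successor only when exactly one detection in the
    next frame overlaps the predicted position; ambiguous cases are deferred
    to later stages.  Each node can be processed independently of almost all
    others, which is what makes the PRAM formulation possible."""
    for t, nodes in frames.items():
        nxt = frames.get(t + 1, [])
        for node in nodes:
            cands = [m for m in nxt if overlaps(predicted_bbox(node), m.bbox)]
            if len(cands) == 1:              # one, and only one, neighbor
                node.succ = cands[0]
                cands[0].pred = node
```
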

As Algorithm 1 terminates, we are left with track fragments that are assumed to be detections of a single object. We can assume this because, whenever there was any ambiguity as to whether two detections should have been linked, we delayed the decision to a later time when more information would be available. As a result, some track fragments may consist of a single detection, as opposed to multiple detections linked together. However, we treat all of the results from Algorithm 1 as track fragments and not individual detections. We use a data structure that stores the head node and tail node of the linked list, in order to alleviate the need to traverse the entire linked list in the track fragment every time we need to know both. This structure also updates the stored pointer to the head or tail each time the accessor is called and the currently stored detection pointer is no longer the actual head or tail. This is a simple update: since the detections in a track fragment are linked, the head or tail can be deduced from the detections' successor or predecessor pointers.
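A minimal sketch of this head/tail caching structure (class and field names are illustrative):

```python
class DetectionNode:
    """Minimal doubly-linked detection node (illustrative)."""
    def __init__(self, label):
        self.label = label
        self.pred = None
        self.succ = None

class TrackFragment:
    """Caches the head and tail of the detection linked list so neither
    endpoint requires a full traversal; stale cached pointers are repaired
    lazily by following pred/succ links when the fragment has grown."""
    def __init__(self, node):
        self._head = node
        self._tail = node

    @property
    def head(self):
        while self._head.pred is not None:   # repair a stale cache entry
            self._head = self._head.pred
        return self._head

    @property
    def tail(self):
        while self._tail.succ is not None:
            self._tail = self._tail.succ
        return self._tail
```
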

1. This is a sequential operation whose length depends upon the data, but it will be much smaller when tracking large numbers of objects. The longest this sequence can be is the number of frames (separate, non-overlapping time stamps) in the dataset.
2. This will require exclusive write when adding to the list.

C. Track Fragment Graph

In previous sections, the descriptions of the track fragment graph have been relatively informal. In this section we formalize the creation of this graph for use with minimum cost network flow to find the optimal cost. We start this formalization by defining the following 0-1 indicator variables [7, 5, 11]:

\[
f_{en,i} = \begin{cases} 1, & \exists T_k \in T \text{ such that } T_k \text{ starts from } o_i \\ 0, & \text{otherwise} \end{cases}
\]
\[
f_{ex,i} = \begin{cases} 1, & \exists T_k \in T \text{ such that } T_k \text{ ends at } o_i \\ 0, & \text{otherwise} \end{cases}
\]
\[
f_{i,j} = \begin{cases} 1, & \exists T_k \in T \text{ such that } o_j \text{ immediately follows } o_i \text{ in } T_k \\ 0, & \text{otherwise} \end{cases}
\]
\[
f_{i} = \begin{cases} 1, & \exists T_k \in T \text{ such that } o_i \in T_k \\ 0, & \text{otherwise} \end{cases}
\]

It should be noted that, by taking the negative logarithm of the posterior probability of each trajectory hypothesis from (2), the previous formulation of a maximum a posteriori problem is transformed into the following optimization problem:

\[
\text{Minimize } \mathbf{c}^{\mathsf T}\mathbf{x} \tag{4}
\]
\[
\text{subject to } A\mathbf{x} \le \mathbf{1} \text{ and } x_k \in \{0, 1\} \tag{5}
\]

Here, we let \(m\) be the number of track hypotheses; then \(\mathbf{c} = [c_1, \ldots, c_m]^{\mathsf T}\) is an \(m\)-dimensional vector containing the negative log of the posterior probability of each trajectory hypothesis. The global association hypothesis is represented by the \(m\)-dimensional vector \(\mathbf{x} = [x_1, \ldots, x_m]^{\mathsf T}\), where \(x_k = 1\) if trajectory hypothesis \(T_k \in T\) and zero otherwise. In the constraint \(A\mathbf{x} \le \mathbf{1}\), \(A\) is an \(n \times m\) matrix with \(A_{ik} = 1\) if report \(o_i\) is included in trajectory hypothesis \(T_k\) and zero otherwise. Here, \(n\) is the number of reports and \(m\) is still the number of track hypotheses as stated before. Also note that \(\mathbf{1}\) is an \(n\)-dimensional vector of 1's. The constraint is used to ensure that a single trajectory hypothesis does not share a report with another hypothesis [4, 7, 9].

The same requirements can be stated in terms of the 0-1 indicator variables defined above:

\[
f_{en,i} + \sum_{j} f_{j,i} = f_i \tag{6}
\]
\[
f_i = f_{ex,i} + \sum_{j} f_{i,j} \tag{7}
\]

Remember here that \(o_i\) is now a track fragment in a trajectory hypothesis \(T_k\), and the non-overlapping constraint of track hypotheses is enforced when (6) and (7) hold simultaneously:

\[
f_{en,i} + \sum_{j} f_{j,i} = f_i = f_{ex,i} + \sum_{j} f_{i,j} \tag{8}
\]

By taking the negative log of the posterior, we convert the maximization function to a minimization function as follows.
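A toy instance of the 0-1 program (4)-(5) can be checked by brute force; the matrix and costs below are invented for illustration:

```python
# Toy instance: n = 3 reports, m = 3 candidate trajectory hypotheses.
# A[i][k] = 1 when report i belongs to hypothesis k.
A = [
    [1, 0, 1],   # report 0 appears in hypotheses 0 and 2
    [1, 1, 0],   # report 1 appears in hypotheses 0 and 1
    [0, 1, 1],   # report 2 appears in hypotheses 1 and 2
]
c = [-2.5, -1.0, 1.7]   # hypothetical -log posterior costs

def feasible(x):
    """Check A @ x <= 1 elementwise: no report used by two chosen tracks."""
    return all(sum(a * xi for a, xi in zip(row, x)) <= 1 for row in A)

def cost(x):
    return sum(ci * xi for ci, xi in zip(c, x))

# Brute-force the 2^m binary assignments (viable only for tiny m).
best = min((x for x in
            ((b >> 2 & 1, b >> 1 & 1, b & 1) for b in range(8))
            if feasible(x)), key=cost)
print(best, cost(best))
```

Choosing hypotheses 0 and 1 together would reuse report 1, so the constraint forces the solver to pick only the cheapest non-conflicting hypothesis.
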

\[
T^{*} = \operatorname*{argmin}_{T} \sum_{T_k \in T} -\log P(T_k) \;+\; \sum_{i} -\log P(x_i \mid T) \tag{9}
\]
\[
T^{*} = \operatorname*{argmin}_{T} \sum_{i} C_{en,i}\, f_{en,i} \;+ \sum_{i,j} C_{i,j}\, f_{i,j} \;+ \sum_{i} C_{ex,i}\, f_{ex,i} \;+ \sum_{i} C_{i}\, f_{i} \tag{10}
\]

Equation (9) is subject to the constraints set in equation (8), and the costs in (10) are defined as:

\[
C_{en,i} = -\log P_{entr}(o_i) \tag{11}
\]
\[
C_{ex,i} = -\log P_{exit}(o_i) \tag{12}
\]
\[
C_{i,j} = -\log P_{link}(o_j \mid o_i) \tag{13}
\]
\[
C_{i} = -\log P(x_i \mid T) \tag{14}
\]

*Again, note that \(P(x_i \mid T)\) models the cases of an observation being a true detection and the cases of an observation being a false detection.

So, by following this formulation, the cost-flow network is constructed by first inserting a source node \(s\) and a sink node \(t\). Then, using the set of track fragments returned from Algorithm 1, we create two nodes \(u_i, v_i\) for every track fragment \(o_i\), with an edge \((u_i, v_i)\) whose associated arc cost is \(C_i\) and whose flow is \(f_i\). There shall also be an edge \((s, u_i)\) with cost \(C_{en,i}\) and flow \(f_{en,i}\), as well as an edge \((v_i, t)\) with cost \(C_{ex,i}\) and flow \(f_{ex,i}\). Then, for every transition with \(P_{link}(o_j \mid o_i) > 0\), an edge \((v_i, u_j)\) is created that has an associated cost of \(C_{i,j}\) and flow \(f_{i,j}\). Again, using domain knowledge is useful for determining which \(P_{link}(o_j \mid o_i) > 0\). For instance, if a track fragment \(o_j\) starts at a distance that is too far for track fragment \(o_i\) to have traveled between the two start times, then there is no need to consider \(o_j\), as it has no possibility of being a target for linking. This formulation is presented in [5] and is also used in [11, 7]. A depiction of a graph based upon this formulation is shown in Fig. 3.

Figure 3. An example of the cost-flow network with three time steps and 9 observations.

IV. TRACKING IMPLEMENTATION

As has been mentioned in earlier sections, multiple-hypothesis tracking suffers from the combinatorial growth of its hypothesis space. To help mitigate this problem, we chose to expand on earlier work in [6] by adopting a hierarchy of stages in the tracking algorithm that iteratively grows track fragments as they are passed from one stage to the next. This framework allows for the use of simple models on the majority of the detection responses, and more complex models where greater ambiguity is present in the association hypotheses.

For example, in the first stage of the algorithm, track fragments are formed by linking detections that are highly likely to belong to the same track. To determine this, we use the differential rotation of the sun, as was used in [12] and developed in [8], to predict the location of a given detection at a later time. Then, if there is one, and only one, detection that intersects the predicted location, we consider the two detections as belonging to the same track and link them into a track fragment. By doing this, we significantly reduce the number of track fragments that the next stage needs to consider in its maximum a posteriori formulation. As will be shown in the experimental results section, this leads to a significant speedup over feeding every detection into the second stage as a track fragment. This is due both to the fact that we process the first stage in parallel, and to the fact that significantly less computation is needed when only considering local information, as opposed to attempting to optimize a global hypothesis, as is done in the second stage.

In the second stage of the algorithm, the track fragments from the first stage are associated into longer tracks by formulating the tracking task as a maximum a posteriori problem, as discussed previously, where the problem is solved by a successive shortest path algorithm for minimum cost network flow. In this stage, various parameters are considered. For example, if a track fragment is sufficiently long, we utilize its previous motion vector, or the differential rotation of the sun [8] if it is not, to determine the search area for the next track fragment(s) to consider as candidates for linking. Furthermore, in this stage, we allow the search for candidates to consider track fragments that occur more than one time period in the future, in an attempt to allow for missed detections; this is modeled in the frame skip model detailed in a later section. The details of the transition probabilities are discussed in Section IV.A.
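A compact sketch of this cost-flow construction, paired with a successive-shortest-path solver of the kind mentioned above; all costs are assumed to be precomputed negative log probabilities, unit capacities model the non-overlap constraint, and the function names are hypothetical:

```python
import math
from collections import defaultdict

def add_edge(graph, u, v, cost, cap=1):
    """Store each edge as [to, capacity, cost, index_of_reverse_edge]."""
    graph[u].append([v, cap, cost, len(graph[v])])
    graph[v].append([u, 0, -cost, len(graph[u]) - 1])

def bellman_ford(graph, s):
    """Shortest paths that tolerate negative edge costs (residual edges)."""
    dist = defaultdict(lambda: math.inf)
    prev = {}
    dist[s] = 0.0
    nodes = list(graph.keys())
    for _ in range(len(nodes)):
        updated = False
        for u in nodes:
            if dist[u] == math.inf:
                continue
            for idx, (v, cap, cost, _) in enumerate(graph[u]):
                if cap > 0 and dist[u] + cost < dist[v]:
                    dist[v] = dist[u] + cost
                    prev[v] = (u, idx)
                    updated = True
        if not updated:
            break
    return dist, prev

def min_cost_tracks(n_frag, c_en, c_ex, c_obs, c_link):
    """Build the cost-flow network of Fig. 3 and augment along shortest
    source-to-sink paths while each new path (i.e., track) lowers the cost."""
    graph = defaultdict(list)
    S, T = "S", "T"
    for i in range(n_frag):
        add_edge(graph, S, ("u", i), c_en[i])         # entry edge
        add_edge(graph, ("u", i), ("v", i), c_obs[i])  # observation edge
        add_edge(graph, ("v", i), T, c_ex[i])          # exit edge
    for (i, j), cost in c_link.items():
        add_edge(graph, ("v", i), ("u", j), cost)      # transition edge
    total = 0.0
    while True:
        dist, prev = bellman_ford(graph, S)
        if dist[T] >= 0:          # no remaining cost-reducing track
            break
        total += dist[T]
        v = T
        while v != S:             # push one unit of flow along the path
            u, idx = prev[v]
            graph[u][idx][1] -= 1
            rev = graph[u][idx][3]
            graph[v][rev][1] += 1
            v = u
    return total

# Two hypothetical fragments that plausibly form one track.
print(min_cost_tracks(2, c_en=[2.0, 5.0], c_ex=[5.0, 2.0],
                      c_obs=[-4.0, -4.0], c_link={(0, 1): 1.0}))
```
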

Figure 4. Track beginning and ending heat maps: (A) starting heat map; (B) ending heat map.

The third stage is similar to the second stage, but we allow longer skipped-frame gaps than in the second stage. In addition to everything that stage two does, this stage also compares the motion of two candidate track fragments if they are sufficiently long. We complete two iterations of this stage, with each iteration having a longer allowed frame gap than the previous stage iteration, whether that is the second stage or the first iteration of the third stage.

A. Maximum a Posteriori Parameters

In previous sections we presented several parameters used to calculate the prior probability of each trajectory \(T_k\). We now proceed with giving these parameters meaningful values, specifically the initialization probability \(P_{entr}\), the termination probability \(P_{exit}\), and the transition probabilities \(P_{link}(o_j \mid o_i)\).

1) Enter and Exit Probabilities

In the work done by [6], it was assumed that tracks can only start and end at the edges of the viewing area, with the exception of the starting and ending frames. This, however, does not hold true for solar events, as an event can start or end at almost any area of the visible solar surface, as shown in Fig. 4. As such, we must determine the enter and exit probabilities based upon the location and size of the event detections of interest. We utilize heat maps of trajectories starting and ending, which were initially produced with results from stage 1 but have been refined with results from our tracking algorithm as the algorithm's accuracy improved, to calculate the probability of a trajectory starting or ending at any given pixel location. We let \(H\) be either the enter or the exit heat map, depending upon which probability we are concerned with. Then we let \(h\) be the value of the pixel with the greatest probability inside the detection of interest. Finally, we let \(N\) be the number of pixels inside the detection of interest. We then calculate the enter or exit probability \(P_{entr}\) from these quantities (15). Similarly for \(P_{exit}\), though the exit heat map is used as opposed to the enter heat map used for \(P_{entr}\).

2) Transition Probability

The transition probabilities \(P_{link}(o_j \mid o_i)\) are derived similarly to the work done in [6], where they represent the similarity of the linked detection responses. We also use three independent aspects of similarity: (1) appearance, (2) frame skip, and (3) motion similarity. In addition, the transition probability is formulated differently depending upon the stage we are in, with the two variants being

\[
P_{link}(o_j \mid o_i) = P_{appearance}(o_j \mid o_i)\, P_{skip}(o_j \mid o_i)
\]
\[
P_{link}(o_j \mid o_i) = P_{appearance}(o_j \mid o_i)\, P_{skip}(o_j \mid o_i)\, P_{motion}(o_j \mid o_i)
\]

for stage two and stage three, respectively. The following sections discuss each term in detail.

3) Appearance Model

The appearance model is the \(P_{appearance}(o_j \mid o_i)\) term in the transition probability. This model utilizes the image pixels in the minimum bounding rectangle of the last detection response of a track fragment to create one histogram, and the same pixels from the first detection response of a candidate track fragment to create a second histogram. To be a candidate track fragment, the second fragment must be a spatio-temporal neighbor of the end of the first fragment. The neighborhood is calculated as described in the beginning of Section IV, and varies in size depending upon the parameters set for the particular stage and iteration. When the neighbors are compared, the image parameters entropy, 4th moment (kurtosis), and uniformity are used to create histograms of the pixels that fall into the minimum


Figure 5. The distribution of Hellinger distances when comparing the histograms of pixels inside detections of the same event instances with those of pixels inside detections of different event instances, over all event types considered.

bounding rectangles. We chose these parameters for two reasons: the first is that they had been pre-calculated for work done in [13] and were readily available in [14]; the second is that, of the ten image parameters available in that set, we found this combination to have the greatest separation between the two distributions of Hellinger distances of the pixel histograms. The two distributions of Hellinger distances were computed by comparing the image parameter histograms of detections belonging to the same track fragment with the histograms of detections belonging to different track fragments. Once this was done, we used the normal probability density functions of the distances for detections belonging to the same track fragment and for detections belonging to different track fragments to calculate \(P_{appearance}(o_j \mid o_i)\):

\[
P_{appearance}(o_j \mid o_i) = \frac{\mathcal{N}(d;\, \mu_s, \sigma_s)}{\mathcal{N}(d;\, \mu_s, \sigma_s) + \mathcal{N}(d;\, \mu_d, \sigma_d)} \tag{16}
\]

Note that \(d\) is the Hellinger distance of the histograms of the detections of interest belonging to the two track fragments, similar to [6]. The terms \(\mu_s, \sigma_s, \mu_d,\) and \(\sigma_d\) were learned from the data and are depicted in Fig. 5, where \(\mathcal{N}(\mu_s, \sigma_s)\) is the "Same" curve and \(\mathcal{N}(\mu_d, \sigma_d)\) is the curve labeled "Different".

4) Frame Skip

The term \(P_{skip}(o_j \mid o_i)\) models the skipping of frames and is used to handle missed detections. In this, the gap is the number of skipped frames between the last detection of the current track fragment of interest and the first detection in the potential candidate track fragment for matching. We define this model as an exponential model in the gap (17). Here, the number of skipped frames is calculated using a duration parameter for each event type. We allow one duration from the start of a detection instance as the duration of that instance; we then search for detection instances that fall temporally between the end of the first duration and the end of one additional duration, and any detection found in this window of time is considered as having no skipped frames. For each duration after that, we count one more skipped frame, and we allow up to a maximum number of skipped frames that is dependent upon the stage and iteration.

5) Motion Model

In the third stage of the algorithm we allow one more term in the transition probability, which considers the motion similarity of the track fragment of interest and its candidate matches. Similar to [6], we use the normalized movement vectors of the two track fragments, calculated as the mean of the frame-wise movement of the detections within each track fragment, and call them \(v_i\) and \(v_j\) respectively. We then calculate \(P_{motion}(o_j \mid o_i)\) from the Euclidean norm of the difference \(\lVert v_i - v_j \rVert\) (18). By calculating the Euclidean norm of the difference of the movement of the two entities, we are comparing the similarity of the movement in the two track fragments. This comparison assumes that the motion of the detected object does not change direction abruptly, which looks to be a safe assumption for most event types.
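The appearance and motion comparisons can be sketched as follows; the distribution parameters are invented placeholders for the learned "Same" and "Different" curves, and the ratio in `p_appearance` is one plausible reading of (16), not necessarily the paper's exact form:

```python
import math

def hellinger(h1, h2):
    """Hellinger distance between two normalized histograms (range [0, 1])."""
    bc = sum(math.sqrt(a * b) for a, b in zip(h1, h2))  # Bhattacharyya coeff.
    return math.sqrt(max(0.0, 1.0 - bc))

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical stand-ins for the learned "Same"/"Different" curve parameters.
MU_S, SIG_S, MU_D, SIG_D = 0.15, 0.10, 0.55, 0.20

def p_appearance(h1, h2):
    """Relative likelihood that the Hellinger distance between the two
    fragment histograms came from the "Same" distribution."""
    d = hellinger(h1, h2)
    same = normal_pdf(d, MU_S, SIG_S)
    diff = normal_pdf(d, MU_D, SIG_D)
    return same / (same + diff)

def motion_difference(v1, v2):
    """Euclidean norm of the difference of two normalized movement vectors,
    as used by the stage-three motion comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

h_a = [0.5, 0.3, 0.2]
h_b = [0.45, 0.35, 0.2]
print(p_appearance(h_a, h_b))                      # close histograms score high
print(motion_difference((1.0, 0.0), (0.8, 0.6)))   # small for similar motion
```
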



V. EXPERIMENTAL RESULTS

We applied our approach to solar data retrieved from [3] in the range of January 2012 to December 2013. We retrieved the following event types: Active Region, Coronal Hole, Filament, Sigmoid, Sunspot, and Emerging Flux. However, since not all event types contain labels of tracked event instances, we were only able to extensively evaluate the accuracy of our tracking approach on Active Regions and Coronal Holes, as these two types have tracking data present in the HEK [3] dataset as determined by the Spatial Possibilistic Clustering Algorithm (SPoCA) [15].

Table 1. Definitions of Tracking Evaluation Metrics (according to [16] and [17])

Name MOTA

A. Implementation Details In order to show the increased speed and accuracy of our iterative approach over that of a single brute force step, we implemented two versions of the multiple hypothesis tracking algorithm. The first implementation is a single iteration of the second stage of our iterative algorithm. The second stage is used, as opposed to the third, due to the fact that each track fragment fed into this single iteration is a single detection in the data and hence does not have frame wise movement vectors to compare. We allowed the same frame gap of six skipped frames as we do in the last iteration of our iterative approach so as to allow the same number of missed detections.

In the second implementation, we use the iterative approach described in the previous sections: single detections are fed into the first stage, which conservatively links detections into track fragments with no skipped frames allowed. The results of the first stage are then fed into the second stage, where we allow two skipped frames. The results of the second stage are then used as input to two iterations of the third stage, the first allowing four skipped frames and the second allowing six.

B. Evaluation Metrics

The metrics we use to evaluate the accuracy of the tracking algorithms are listed in Table 1. We took the definition of Mostly Tracked (MT) from [16]: a ground-truth trajectory is mostly tracked when it is covered by one trajectory from the tracking algorithm for more than eighty percent of its length. This definition leaves some ambiguity as to how to handle the case where the tracker output switches its identity to another ground-truth trajectory for a number of frames and then switches back. In our evaluations, we consider a ground-truth trajectory mostly tracked if it is covered by the same algorithm trajectory for a total number of frames equal to at least eighty percent of its entire length.

Table 1. Evaluation metrics.

Name   Definition
MOTA   Multiple Object Tracking Accuracy: 1 - [the ratio of the sum of all
       errors (misses, false positives, mismatches) to the total number of
       objects in all frames]. Larger is better. Range [0%, 100%].
MT%    Mostly Tracked: percentage of the ground-truth trajectories that are
       covered by one tracker output for more than 80% of their length.
       Larger is better. Range [0%, 100%].

We believe that Multiple Object Tracking Accuracy (MOTA) [17] provides a more accurate description of the accuracy of the tracking algorithm. MOTA sums all the errors at each time step (misses, false positives, mismatches) and then computes the ratio of errors to the total number of objects over all time steps; this ratio is the total error rate, and one minus it is the resulting tracking accuracy. Because the error at every time step is counted, this metric gives a clearer picture of the algorithm's accuracy.

C. Tracking Performance

We used two years of data, from January 2012 to December 2013 [3]; the data was split into months and the tracking algorithms were run on each month. We ran both versions of the algorithm on a computer with a 4.01 GHz AMD FX-8350 eight-core processor and 16 GB of RAM running Ubuntu 14.04 LTS. We compared each month of both algorithms' output to the tracked results listed in the retrieved data; the results are shown in Figs. 6-8. In the graphs, the whiskers represent the min/max of the data and the boxes are the usual second- and third-quartile representation.

The results in Fig. 6 show that the iterative approach runs an average of 6.3 times faster than a single pass for Coronal Holes and 6.4 times faster for Active Regions. This speedup is mostly due to the parallel processing of individual event reports in the first stage of the iterative algorithm, which reduces the number of track fragments sent to the second and third stages, where a sequential algorithm solves the underlying maximum a posteriori problem.

We also see in Figs. 7 and 8 that the iterative approach is more accurate for both event types, with an average MOTA of 90.2% on Coronal Holes and 88.5% on Active Regions. Some of the improved accuracy can be attributed to the addition of the motion model for sufficiently long track fragments in the third stage of the iterative algorithm. It can also be attributed to the shorter frame gap for the majority of track-fragment associations: the iterative algorithm allows larger gaps only when no match is found within a close time frame, whereas the single-pass algorithm considers detections up to six frames away in every data-association step.

Figure 6. Distribution of execution times for 24 months of Coronal Hole and Active Region data run with the iterative algorithm and a single pass of the second stage.

Figure 7. Distribution of the percentage of mostly tracked trajectories for 24 months of Coronal Hole and Active Region data run with the iterative algorithm and a single pass of the second stage.

Figure 8. Distribution of Multiple Object Tracking Accuracy for 24 months of Coronal Hole and Active Region data run with the iterative algorithm and a single pass of the second stage.
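The two metrics in Table 1 can be sketched directly from their definitions. The function names and per-frame counts below are illustrative assumptions; in practice the miss, false-positive, and mismatch counts come from matching tracker output against ground truth.

```python
def mota(errors_per_frame, objects_per_frame):
    """MOTA = 1 - (sum of misses + false positives + mismatches)
                  / (total ground-truth objects over all frames)."""
    total_err = sum(miss + fp + mm for miss, fp, mm in errors_per_frame)
    return 1.0 - total_err / sum(objects_per_frame)

def mostly_tracked(covered_frames, track_length):
    """A ground-truth trajectory is Mostly Tracked when a single tracker
    trajectory covers at least 80% of its frames in total (the
    interpretation used in our evaluation, which also counts coverage
    interrupted by identity switches)."""
    return covered_frames / track_length >= 0.8

# Toy numbers, purely for illustration.
errors = [(1, 0, 0), (0, 1, 0), (0, 0, 0)]   # (miss, fp, mismatch) per frame
objects = [10, 10, 10]                        # ground-truth objects per frame
print(round(mota(errors, objects), 3))        # 0.933
print(mostly_tracked(covered_frames=17, track_length=20))  # True (85% covered)
```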

VI. CONCLUSION

We presented a method for tracking multiple event types in solar data retrieved from the HEK. The problem was first defined as a typical maximum a posteriori problem and then converted to a track graph, which provides an efficient representation and supports the development of polynomial-time algorithms, such as minimum-cost network flow, when the Markov property is satisfied.
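To make the data-association step concrete, the sketch below finds a minimum-cost one-to-one association between detections in two consecutive frames by brute force. This is a deliberately simplified stand-in for the minimum-cost network-flow solver: the detection values, the plain-distance cost, and all names are assumptions for this example, whereas the paper's costs come from the appearance and motion models over HEK event reports.

```python
from itertools import permutations

def associate(prev, curr):
    """Brute-force min-cost one-to-one association between two frames.
    prev, curr: dicts mapping detection id -> position (a scalar here)."""
    ids_prev = list(prev)
    best, best_cost = None, float("inf")
    # Enumerate every one-to-one pairing and keep the cheapest one.
    for perm in permutations(list(curr)):
        pairs = list(zip(ids_prev, perm))
        cost = sum(abs(prev[i] - curr[j]) for i, j in pairs)
        if cost < best_cost:
            best, best_cost = pairs, cost
    return best

frame_a = {1: 0.0, 2: 5.0}   # detection id -> position, frame t
frame_b = {3: 0.4, 4: 5.2}   # detection id -> position, frame t+1
print(associate(frame_a, frame_b))  # [(1, 3), (2, 4)]
```

The network-flow formulation solves this same assignment globally over all frames at once, in polynomial time rather than by enumeration, which is what makes it practical at scale.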

VII. FUTURE WORK

We plan to gather human-labeled data for comparison in future work, and to develop a parallel implementation of the sequential minimum-cost network-flow solution to the data association problem currently used. Additional experiments are also planned to determine whether the appearance model would benefit from different event types having different distributions, as they utilize different wavelengths in their image-parameter comparisons.

VIII. ACKNOWLEDGMENTS

This work was supported by National Aeronautics and Space Administration (NASA) grant award No. NNX11AM13A, and by National Science Foundation (NSF) grant award No. 1443061.

IX. REFERENCES

[1] W. Pesnell and P. Chamberlin, "The Solar Dynamics Observatory (SDO)," Solar Physics, vol. 275, no. 1-2, pp. 3-15, 2012.

[2] P. Martens et al., "Computer Vision for the Solar Dynamics Observatory (SDO)," Solar Physics, vol. 275, no. 1-2, pp. 79-113, 2012.

[3] "Heliophysics Events Registry," [Online]. Available: http://www.lmsal.com/isolsearch. [Accessed May 2014].

[4] C. Chong et al., "Efficient Multiple Hypothesis Tracking by Track Segment Graph," in 2009 12th Int. Conf. on Information Fusion, Seattle, WA, 6-9 July 2009, pp. 2177-2184.

[5] L. Zhang et al., "Global Data Association for Multi-Object Tracking Using Network Flows," in 2008 IEEE Conf. on Computer Vision and Pattern Recognition, Anchorage, AK, 23-28 June 2008, pp. 1-8.

[6] M. Hofmann et al., "Unified Hierarchical Multi-Object Tracking using Global Data Association," in 2013 IEEE Int. Workshop on Performance Evaluation of Tracking and Surveillance, Clearwater, FL, 15-17 Jan. 2013, pp. 22-28.

[7] C. Hong, "Graph Approaches for Data Association," in 2012 15th Int. Conf. on Information Fusion, Singapore, 9-12 July 2012, pp. 1578-1585.

[8] R.F. Howard et al., "Solar surface velocity fields determined for small magnetic features," Solar Physics, vol. 130, no. 1-2, pp. 295-311, Dec. 1990.

[9] G. Castanon and L. Finn, "Multi-Target Tracklet Stitching through Network Flows," in 2011 IEEE Aerospace Conf., Big Sky, MT, 5-12 March 2011, pp. 1-7.

[10] J. Meng et al., "Exploiting the forgiving nature of applications for scalable parallel execution," in 2010 IEEE Int. Symp. on Parallel & Distributed Processing (IPDPS), Atlanta, GA, 19-23 April 2010, pp. 1-12.

[11] H. Pirsiavash et al., "Globally-Optimal Greedy Algorithms for Tracking a Variable Number of Objects," in 2011 IEEE Conf. on Computer Vision and Pattern Recognition, Providence, RI, 20-25 June 2011, pp. 1201-1208.

[12] C. Caballero and M. Aranda, "Automatic Tracking of Active Regions and Detection of Solar Flares in Solar EUV Images," Solar Physics, vol. 289, no. 5, pp. 1643-1661, 2014.

[13] J.M. Banda et al., "Region-Based Querying of Solar Data Using Descriptor Signatures," in 2013 IEEE 13th Int. Conf. on Data Mining Workshops, Dallas, TX, 7-10 Dec. 2013, pp. 1-7.

[14] M.A. Schuh et al., "A large-scale solar image dataset with labeled event regions," in 2013 20th IEEE Int. Conf. on Image Processing, Melbourne, VIC, 15-18 Sept. 2013, pp. 4349-4353.

[15] C. Verbeeck et al., "The SPoCA-suite: Software for extraction, characterization, and tracking of active regions and coronal holes on EUV images," Astronomy & Astrophysics, vol. 561, Jan. 2014.

[16] Y. Li et al., "Learning to Associate: HybridBoosted Multi-Target Tracker for Crowded Scene," in 2009 IEEE Conf. on Computer Vision and Pattern Recognition, Miami, FL, 20-25 June 2009, pp. 2953-2960.

[17] K. Bernardin and R. Stiefelhagen, "Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics," EURASIP Journal on Image and Video Processing, vol. 2008, no. 3, pp. 1-10, 2008.

[18] H. Peters et al., "A novel sorting algorithm for many-core architectures based on adaptive bitonic sort," in 2012 IEEE 26th Int. Parallel & Distributed Processing Symp., Shanghai, 21-25 May 2012, pp. 227-237.