Detecting and counting objects in images

Frank Nielsen
[email protected]

2012

Assume we are given a digital photo showing a collection of coins (as in Figure 1) that we would like to count automatically. Our first task is to segment the coins (foreground) from the background, and then to detect and count the connected components (the coins).

1 Connected components

For now, let us postpone the foreground/background segmentation task and assume that we are given a binary (black and white) image. Figure 2(a) shows such a binary image filled with non-overlapping black disks that stand in for coins. Each disk is detected as a connected component and drawn with a distinct color in Figure 2(b). In this example, there are exactly 38 disks.

1.1 Basic classes: Color and Pixel

To manipulate pixels and colors in 2D array pictures, we consider the following Color and Pixel classes:

Listing 1: Color and Pixel classes

    class Color {
        int R, G, B; // Red, Green and Blue channels

        Color(int r, int g, int b) { R = r; G = g; B = b; }

        boolean sameColor(Color c) {
            return (R == c.R) && (G == c.G) && (B == c.B);
        }
    }

    class Pixel {
        int x, y;

        Pixel(int X, int Y) { this.x = X; this.y = Y; }

        void setColor(PPM img, Color c) {
            img.b[y][x] = c.B;
            img.r[y][x] = c.R;
            img.g[y][x] = c.G;
        }
    }
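Listing 1 refers to a PPM image class that is not shown in this excerpt. To try the two classes in isolation, one can use a minimal stand-in such as the sketch below. Everything about this PPM class is an assumption made for illustration: the per-channel arrays r, g, b indexed as [y][x] are inferred from setColor, while the constructor, the getColor helper, and the ColorPixelDemo class are hypothetical.

```java
// Hypothetical minimal image container; the real PPM class is not shown in the source.
class PPM {
    int width, height;
    int[][] r, g, b; // one array per channel, indexed as [y][x] (inferred from setColor)

    PPM(int width, int height) {
        this.width = width;
        this.height = height;
        r = new int[height][width];
        g = new int[height][width];
        b = new int[height][width];
    }

    // Helper assumed for this sketch only: read back the color stored at (x, y).
    Color getColor(int x, int y) {
        return new Color(r[y][x], g[y][x], b[y][x]);
    }
}

// Color and Pixel as in Listing 1.
class Color {
    int R, G, B;

    Color(int r, int g, int b) { R = r; G = g; B = b; }

    boolean sameColor(Color c) {
        return (R == c.R) && (G == c.G) && (B == c.B);
    }
}

class Pixel {
    int x, y;

    Pixel(int X, int Y) { this.x = X; this.y = Y; }

    void setColor(PPM img, Color c) {
        img.b[y][x] = c.B;
        img.r[y][x] = c.R;
        img.g[y][x] = c.G;
    }
}

class ColorPixelDemo {
    public static void main(String[] args) {
        PPM img = new PPM(4, 3);          // 4 pixels wide, 3 pixels high, all black
        Color red = new Color(255, 0, 0);
        new Pixel(2, 1).setColor(img, red);
        System.out.println(img.getColor(2, 1).sameColor(red)); // true
        System.out.println(img.getColor(0, 0).sameColor(red)); // false
    }
}
```

Note that the arrays are indexed row first ([y][x]), so a pixel at column 2 and row 1 is stored at r[1][2].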

1.2 Detecting C4-connected components

A pixel p(x, y) is said to be connected to a neighbor pixel p(x', y') if and only if both have the foreground color and are adjacent: |x − x'| + |y − y'| = 1. The last condition describes the 4-connectivity of image pixels (also called the C4-connectedness property).

Figure 1: How to automatically count the number of coins?

Figure 2: Object counting: Counting the number of black disks (a) by detecting the connected components (b).

The algorithm to detect the foreground connected components in a binary image proceeds as follows. At the beginning, each pixel p is associated with its own singleton region R(p) = {p}. We then raster scan the image, sweeping the pixels in left-to-right and bottom-to-top order. At the current pixel position p(x, y), we consider the two neighbor pixels p(x+1, y) and p(x, y+1) that have not yet been visited, and check whether they have the foreground color. Whenever the current pixel and such a neighbor are both foreground, we merge their corresponding regions. Since all regions are disjoint at any time, we need a so-called disjoint-set data structure that efficiently implements the merging operation with a method called Union. The raster-scan algorithm is reported below:

Listing 2: Detecting connected components

    // Connect foreground (= not background) pixels
    static DisjointSet connect(PPM img, Color background) {
        int n = img.height * img.width;
        DisjointSet ds = new DisjointSet(n);
        Pixel p = new Pixel(0, 0);
        // pixels are labeled using the raster scan
        for (int k = 0; k [...]

[...] null 4->null 5->null 6->null 7->null 8->null 9->null

0: 0->1->8->null
1: 1->8->null
2: 2->null
3: 3->null
4: 4->null
5: 5->8->null
6: 6->null
7: 7->null
8: 8->null
9: 9->null

0: 0->1->8->null
1: 1->8->null
2: 2->null
3: 3->4->9->null
4: 4->9->null
5: 5->8->null
6: 6->null
7: 7->null
8: 8->null
9: 9->null

0: 0->1->8->null
1: 1->8->null
2: 2->6->7->null
3: 3->4->9->null
4: 4->9->null
5: 5->8->null
6: 6->7->null
7: 7->null
8: 8->null
9: 9->null

For example, observe that for the set {0, 1, 5, 8}, the four linked lists all end with the same last element, 8, which is the leader (representative element) of the set. We describe next an optimally efficient disjoint-set data structure.
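The listings of the linked-list implementation are not part of this excerpt. As an illustration only, the following sketch assumes that each element stores a single successor link (encoded here as an int array rather than as node objects), that the last element of a list is the leader, and that a union links the leader of the first set to the leader of the second. The class name ListDisjointSet, its methods, and the sequence of unions are hypothetical; the unions were chosen so that the lists obtained for the set {0, 1, 5, 8} match the state displayed above.

```java
// Sketch only: not the author's linked-list code, which is missing from this excerpt.
class ListDisjointSet {
    // next[k] is the successor of k in its list, or -1 when k is the last element (the leader).
    int[] next;

    ListDisjointSet(int n) {
        next = new int[n];
        for (int k = 0; k < n; k++) next[k] = -1; // every element starts as its own leader
    }

    // Walk to the end of the list: the last element is the leader.
    int find(int k) {
        while (next[k] != -1) k = next[k];
        return k;
    }

    // Assumed merge rule: the leader of x's set is linked to the leader of y's set.
    void union(int x, int y) {
        x = find(x);
        y = find(y);
        if (x != y) next[x] = y;
    }

    // Render the list starting at k in the "a->b->null" style of the displayed states.
    String listOf(int k) {
        StringBuilder sb = new StringBuilder();
        for (int i = k; i != -1; i = next[i]) sb.append(i).append("->");
        return sb.append("null").toString();
    }

    public static void main(String[] args) {
        ListDisjointSet ds = new ListDisjointSet(10);
        // Hypothetical sequence of unions.
        ds.union(0, 1);
        ds.union(1, 8);
        ds.union(5, 8);
        for (int k = 0; k < 10; k++) System.out.println(k + ": " + ds.listOf(k));
    }
}
```

With this encoding, finding the leader costs time proportional to the length of the list, which can grow linearly with the number of merged elements.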

3 Disjoint-set data-structure using forests

The basic idea is to represent each set by a tree. The tree need not be binary and, for efficiency, is encoded implicitly in an array (int[] parent) in which each element points to its parent; a root points to itself. When we create the disjoint-set structure for n elements, we identify each element with its index (an integer), make it the root of its own singleton tree, and initialize the height of that tree to 0 in a second array, rank.
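To make the array encoding concrete, here is a small sketch on a hand-built forest. The parent array below and the helper names rootOf and depthOf are hypothetical and serve only to show how a tree is read off the array, under the convention that a root is its own parent.

```java
// Sketch only: reading trees out of a parent array (helper names are hypothetical).
class ParentArrayDemo {
    // Follow parent links until reaching a root, i.e. an element that is its own parent.
    static int rootOf(int[] parent, int k) {
        while (parent[k] != k) k = parent[k];
        return k;
    }

    // Number of parent links between k and the root of its tree.
    static int depthOf(int[] parent, int k) {
        int depth = 0;
        while (parent[k] != k) {
            k = parent[k];
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        // Hypothetical forest over {0, ..., 4}:
        //   tree rooted at 0: 2 -> 1 -> 0
        //   tree rooted at 3: 4 -> 3
        int[] parent = {0, 0, 1, 3, 3};
        for (int k = 0; k < parent.length; k++) {
            System.out.println(k + ": root " + rootOf(parent, k) + ", depth " + depthOf(parent, k));
        }
    }
}
```

Two elements belong to the same set exactly when rootOf returns the same index for both, so here 1 and 2 are in one set and 3 and 4 in another.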

3.1 Union by rank merging

When we merge two disjoint sets, we first retrieve the roots of both trees using a Find primitive, and then attach the shallower tree to the root of the other one in order to keep the trees as balanced as possible. This is the union by rank strategy. Overall, the union by rank disjoint-set data structure is implemented as follows:

Listing 8: A class for manipulating disjoint sets using forests

    class DisjointSet {
        int[] rank;
        int[] parent;

        // Create a disjoint set (DS) data-structure for n elements
        DisjointSet(int n) {
            int k;
            parent = new int[n];
            rank = new int[n];
            for (k = 0; k < n; k++) {
                parent[k] = k;
                rank[k] = 0;
            }
        }

        // Find the leader element
        int Find(int k) {
            while (parent[k] != k) {
                k = parent[k];
            }
            return k;
        }

        // Merge two disjoint sets indexed by x and y
        void Union(int x, int y) {
            // Find the representative elements
            x = Find(x);
            y = Find(y);
            if (x != y) {
                if (rank[x] > rank[y]) {
                    parent[y] = x;
                } else {
                    parent[x] = y;
                    if (rank[x] == rank[y]) rank[y]++;
                }
            }
        }
    }
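Since Listing 2 is cut off in this excerpt, the following sketch shows one way the raster scan of Section 1.2 can be combined with the DisjointSet class of Listing 8 to count C4-connected components. It is not the author's connect method: for self-containment it works on a boolean[][] foreground mask instead of a PPM image, and the ComponentCounter class, its parse and count methods, and the '#' convention for foreground pixels are all assumptions made for illustration.

```java
// Compact copy of the DisjointSet class of Listing 8 (union by rank).
class DisjointSet {
    int[] rank, parent;

    DisjointSet(int n) {
        parent = new int[n];
        rank = new int[n];
        for (int k = 0; k < n; k++) parent[k] = k; // each element is its own root
    }

    int Find(int k) {
        while (parent[k] != k) k = parent[k];
        return k;
    }

    void Union(int x, int y) {
        x = Find(x);
        y = Find(y);
        if (x == y) return;
        if (rank[x] > rank[y]) {
            parent[y] = x;
        } else {
            parent[x] = y;
            if (rank[x] == rank[y]) rank[y]++;
        }
    }
}

// Hypothetical helper class, not part of the source.
class ComponentCounter {
    // Assumption: '#' marks a foreground pixel; all rows have the same length.
    static boolean[][] parse(String[] rows) {
        boolean[][] fg = new boolean[rows.length][rows[0].length()];
        for (int y = 0; y < rows.length; y++)
            for (int x = 0; x < rows[y].length(); x++)
                fg[y][x] = rows[y].charAt(x) == '#';
        return fg;
    }

    // Count the C4-connected foreground components of a non-empty mask.
    static int count(boolean[][] fg) {
        int height = fg.length, width = fg[0].length;
        DisjointSet ds = new DisjointSet(height * width);
        // Raster scan: merge each foreground pixel with its foreground
        // neighbors at (x+1, y) and (x, y+1), which are not yet visited.
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (!fg[y][x]) continue;
                int k = y * width + x; // raster-scan label of pixel (x, y)
                if (x + 1 < width && fg[y][x + 1]) ds.Union(k, k + 1);
                if (y + 1 < height && fg[y + 1][x]) ds.Union(k, k + width);
            }
        }
        // Each component has exactly one foreground pixel that is its own leader.
        int components = 0;
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                if (fg[y][x] && ds.Find(y * width + x) == y * width + x) components++;
        return components;
    }

    public static void main(String[] args) {
        String[] image = {
            "##..#",
            "##..#",
            ".....",
            "#.##."
        };
        System.out.println(count(parse(image))); // 4
    }
}
```

Background pixels keep their singleton regions but are skipped when counting, so only foreground leaders contribute. Two pixels that touch only diagonally stay in separate components, as required by 4-connectivity.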

3.2 Displaying the forest state

To convert the forest into a String, we implement the toString() method as follows:

    public String toString() {
        String result = "";
        for (int i = 0; i [...]