Using Relative Spatial Relationships to Improve Individual Region Recognition
Christophe Millet (ENST-CEA), Isabelle Bloch (ENST), Pierre-Alain Moëllic (CEA), Patrick Hède (CEA) DTSI / Service Cognitique Robotique et Interaction
Introduction
A well-known difficulty in text processing is the existence of homographs: different concepts are written the same way (but can be represented by different images), for example:
- bank (building)
- bank (seat)
- river bank
The same issue arises in image processing: different concepts can be pictured the same (even though they are written differently)
example: sky / water
State of the art
- Individual object recognition has been well studied, and learning methods give good results, but these methods cannot deal with image polysemy.
- P. Carbonetto et al. [1] proposed learning the co-occurrences of objects using a database annotated at image level.
- F. Rossant [4] used relative spatial relationships and fuzzy logic rules to improve musical score recognition.
- We propose to use relative spatial relationships and knowledge of the studied world to disambiguate image regions.
Proposed approach
What we need is to:
- Take image polysemy into account: a region can be given several labels, each with a high probability
- Compute spatial relationships (left, right, top, bottom)
- Know the rules that apply to the studied field
- Find a solution that returns the best label for each region, under the constraint that all the labels in the image must be consistent with the rules
This system will be evaluated on background recognition in images (8 backgrounds: sky, water, snow, trees, grass, sand, ground, buildings)
Learning overlapping classes
Features:
- local edge pattern histogram (512 features) [3]
- 64-bin RGB color histogram
Binary probabilistic SVMs for each class:
- Positive examples = learned class
- Negative examples = all other images
- A greater weight is given to the positive class (3, chosen empirically; the effect of this value remains to be studied)
[Figure: two panels, "same weight" and "greater weight"]
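One common way to give the positive class a greater weight is a per-class penalty in the soft-margin SVM objective, as LIBSVM [2] allows. The formulation below is a sketch under that assumption; the slides do not state how the weight of 3 is applied, and the symbols C_+, C_- and \phi are introduced here for illustration only.

```latex
\min_{w,\,b,\,\xi}\quad
  \tfrac{1}{2}\lVert w\rVert^{2}
  + C_{+}\sum_{i:\,y_i=+1}\xi_i
  + C_{-}\sum_{i:\,y_i=-1}\xi_i
\qquad\text{s.t.}\quad
  y_i\bigl(w^{\top}\phi(x_i)+b\bigr)\ \ge\ 1-\xi_i,\quad \xi_i\ \ge\ 0,
\qquad\text{with } C_{+} = 3\,C_{-}.
```

With this choice, misclassifying a positive example costs three times as much as misclassifying a negative one, which compensates for the much larger negative set ("all other images").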
Computing spatial relationships
Angle histograms h(θ) between two regions are computed by considering pairs of points, one taken from each region.
The percentage with which a relation is satisfied is then derived from h(θ).
[Equation not recovered]
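The sketch below shows one possible way to build an angle histogram from pairs of points and to turn it into a degree of satisfaction for a direction. It is not the formula of the slides, which was not recovered: the cos² weighting, the number of bins, the y-axis pointing upwards, and the function names `angle_histogram` and `relation_degree` are all assumptions made for illustration.

```python
import math

def angle_histogram(region_a, region_b, bins=36):
    """Normalized histogram of the angles of the vectors going from
    each point of region_a to each point of region_b.
    Regions are lists of (x, y) points; y is assumed to point upwards."""
    hist = [0.0] * bins
    width = 2 * math.pi / bins
    n = 0
    for xa, ya in region_a:
        for xb, yb in region_b:
            if (xa, ya) == (xb, yb):
                continue  # the angle is undefined for identical points
            theta = math.atan2(yb - ya, xb - xa) % (2 * math.pi)
            hist[int(theta / width) % bins] += 1
            n += 1
    return [h / n for h in hist] if n else hist

def relation_degree(hist, direction):
    """Degree (0..1) to which region_b lies in `direction` from region_a.
    Assumption: each bin is weighted by cos^2 of its angular distance to
    `direction`, and bins more than 90 degrees away contribute nothing."""
    width = 2 * math.pi / len(hist)
    total = 0.0
    for k, h in enumerate(hist):
        c = math.cos((k + 0.5) * width - direction)
        if c > 0:
            total += h * c * c
    return total

RIGHT, TOP, LEFT, BOTTOM = 0.0, math.pi / 2, math.pi, 3 * math.pi / 2

a = [(0, 0), (0, 1)]
b = [(5, 0), (5, 1)]
h = angle_histogram(a, b)
print(relation_degree(h, RIGHT), relation_degree(h, LEFT))
```

In this toy example region b lies entirely to the right of region a, so the degree for RIGHT is close to 1 and the degree for LEFT is 0.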
Defining the rules
In our background recognition application, we have defined the following rules:
[Table of rules not recovered]
Legend:
- +1 = encouraged relationship
- -1 = unwanted relationship
- 0 = uninteresting relationship
Find the best solution
Here, we consider an image whose regions Ri are labeled with backgrounds Bi. The following consistency function is computed:
[Equation not recovered; it involves a term giving the spatial relationship between regions Ri and Rj]
Every possible solution (Bi)i=1..N is evaluated, and we keep the solution that maximizes this function.
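The exhaustive search can be sketched as follows. The consistency function used here is a stand-in, since the original equation was not recovered: it is assumed to sum, over pairs of regions, the product of the two label probabilities, the rule value (+1, -1 or 0) and the degree of the spatial relationship. The names `consistency` and `best_labelling`, the data layout, and the two example rules are hypothetical.

```python
from itertools import product

def consistency(labels, probs, relations, rules):
    """Assumed consistency score of one labelling.
    labels:    tuple with one label per region
    probs:     list of {label: probability}, one dict per region
    relations: {(i, j): {relation_name: degree in 0..1}}
    rules:     {(label_i, relation_name, label_j): +1, -1 or 0}"""
    score = 0.0
    for (i, j), degrees in relations.items():
        weight = probs[i][labels[i]] * probs[j][labels[j]]
        for rel, degree in degrees.items():
            # Pairs without an explicit rule count as "uninteresting" (0).
            score += weight * rules.get((labels[i], rel, labels[j]), 0) * degree
    return score

def best_labelling(probs, relations, rules):
    """Evaluate every combination of candidate labels and keep the best."""
    candidates = [sorted(p) for p in probs]
    return max(product(*candidates),
               key=lambda labels: consistency(labels, probs, relations, rules))

# Toy example: region 0 is fully above region 1.
probs = [{"sky": 0.6, "water": 0.5}, {"sky": 0.55, "water": 0.5}]
relations = {(0, 1): {"above": 1.0}}
rules = {("sky", "above", "water"): 1,    # hypothetical rule values
         ("water", "above", "sky"): -1}
print(best_labelling(probs, relations, rules))  # ('sky', 'water')
```

The search space grows as the product of the numbers of candidate labels per region, so this brute-force approach is only practical when each image has few regions and few candidate backgrounds.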
Analysis of the consistency function
The consistency function includes a term giving the spatial relationship between regions Ri and Rj.
- Backgrounds with higher probabilities are given more weight in the final score.
- A pair of inconsistent backgrounds gives a negative contribution, so it is better to discard one of them.
- If two backgrounds agree and a third one disagrees with both, the third one is discarded.
Additional tricks
- After the SVM classification, a list of possible backgrounds with probabilities is given. We add an "unknown region" label with a probability of 30% to each region, and backgrounds below that probability are discarded.
- If all backgrounds are inconsistent, the best consistency obtained is 0. In that case, we keep the background whose probability is the highest.
- With the proposed function, "grass is top right of grass" is more consistent than "tree is top right of grass". To overcome this issue, the relationships are modified when considering two elements that are not in the same group: the above and below relations are stretched by the same factor so that their sum equals 100%.
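These three tricks can be sketched as small helper functions. The function names and data layout are hypothetical, probabilities and relation degrees are expressed as fractions (0.30 for 30%, 1.0 for 100%), and deciding whether two elements belong to the same group is left to the caller, since the grouping is not detailed here.

```python
UNKNOWN = "unknown region"
UNKNOWN_PROB = 0.30  # probability given to the "unknown region" label

def candidate_labels(svm_probs, threshold=UNKNOWN_PROB):
    """Drop backgrounds below the threshold and add the unknown label."""
    kept = {label: p for label, p in svm_probs.items() if p >= threshold}
    kept[UNKNOWN] = threshold
    return kept

def fallback_label(svm_probs):
    """Used when the best consistency is 0: keep the most probable background."""
    return max(svm_probs, key=svm_probs.get)

def stretch_vertical(above, below):
    """Scale the above/below degrees by a common factor so they sum to 1.
    Intended for two elements that are not in the same group."""
    total = above + below
    if total == 0:
        return above, below  # nothing to stretch
    return above / total, below / total

print(candidate_labels({"sky": 0.7, "water": 0.4, "snow": 0.1}))
print(fallback_label({"sky": 0.2, "water": 0.25}))
print(stretch_vertical(0.3, 0.1))
```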
Examples of results
The segmentation algorithm is the waterfall algorithm (mathematical morphology) [5].
Statistics on results (1)
- The proposed algorithm has reduced the number of detected backgrounds.
- Removed backgrounds are mostly false positives, but some correct classifications have been removed too.
Statistics on results (2)
5 kinds of modifications:
  Incorrect background => not a background            83.3 %
  Correct background => not a background               9.2 %
  Correct background => incorrect background           5.4 %
  Incorrect background => correct background           1.8 %
  Correct background => another correct background     0.3 %
Summary:
- 85.1 % of the modifications are improvements
- 14.6 % should not have been made
- 0.3 % are neither good nor bad
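The three summary figures can be checked against the five categories, assuming the percentages are listed in the same order as the kinds of modification. The grouping below (which categories count as improvements, errors or neutral) is inferred from the totals rather than stated explicitly.

```python
# Percentages of each kind of modification (source label -> target label).
changes = {
    ("incorrect", "not a background"): 83.3,
    ("correct", "not a background"): 9.2,
    ("correct", "incorrect"): 5.4,
    ("incorrect", "correct"): 1.8,
    ("correct", "another correct"): 0.3,
}

# An incorrect label that is removed or fixed is an improvement.
improving = round(sum(v for (src, dst), v in changes.items()
                      if src == "incorrect"), 1)
# A correct label that is removed or made incorrect is an error.
harmful = round(sum(v for (src, dst), v in changes.items()
                    if src == "correct" and dst != "another correct"), 1)
neutral = changes[("correct", "another correct")]

print(improving, harmful, neutral)  # 85.1 14.6 0.3
```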
Perspectives and conclusions
- Improve the individual region recognition algorithm (using a larger database)
- Study the effect of the class weight in SVM learning, and possibly replace the weighted SVMs with one-class SVMs
- Apply spatial reasoning to a greater number of objects and find a way to learn the rules automatically
- Introduce new relative spatial relationships: surrounds, inside, between...
References
[1] P. Carbonetto, N. Freitas, K. Barnard. "A statistical model for general contextual object recognition". In ECCV 2004, (May 2004).
[2] C.-C. Chang, C.-J. Lin. LIBSVM: a library for support vector machines, (2001). Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[3] Y.-C. Cheng, S.-Y. Chen. "Image classification using color, texture and regions", Image Vision Comput., 21(9), pp. 759–776, (2003).
[4] F. Rossant, I. Bloch. "A fuzzy model for optical recognition of musical scores", Fuzzy Sets and Systems, 141, pp. 165–201, (2004).
[5] B. Marcotegui, S. Beucher. "Fast implementation of waterfall based on graphs". Volume 30 of Computational Imaging and Vision, pp. 177–186. Springer-Verlag, Dordrecht, (2005).
Local edge pattern histogram
- Edges: Sobel filter + binarization
- Edge orientation: Local Edge Patterns are computed
- Texture histogram (512 bins)
Colour histogram (64 bins)
- R, G and B are quantized into 4 values each
[Figure: R, G, B axes]
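As an illustration of the 64-bin colour histogram, the sketch below quantizes each 8-bit channel into 4 levels, giving 4 x 4 x 4 = 64 bins. The uniform quantization, the bin ordering, the normalization and the function name `rgb_histogram_64` are assumptions; the slides only state that R, G and B are quantized into 4 values each.

```python
def rgb_histogram_64(pixels):
    """Return a normalized 64-bin colour histogram.
    pixels: iterable of (r, g, b) tuples with 8-bit channels (0..255)."""
    hist = [0] * 64
    count = 0
    for r, g, b in pixels:
        # 256 / 4 = 64 intensity values per level, so each level is in 0..3.
        qr, qg, qb = r // 64, g // 64, b // 64
        hist[qr * 16 + qg * 4 + qb] += 1
        count += 1
    if count == 0:
        return [0.0] * 64
    return [h / count for h in hist]

region = [(0, 0, 0), (255, 255, 255), (255, 255, 255), (10, 200, 70)]
hist = rgb_histogram_64(region)
print(hist[0], hist[13], hist[63])  # 0.25 0.25 0.5
```

Concatenated with the 512-bin texture histogram, this would give a 576-dimensional feature vector per region.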
More examples