A Real-time Color Recognition Technique

Carol W. Wong
Lockheed Engineering and Sciences Company
2400 NASA Road 1, C-33, Houston, Texas 77058
Summary

This paper presents a real-time active color vision system designed for a mobile robot testbed. The color vision system is based on a color recognition technique that is insensitive to changes in viewing parameters such as viewing angle and distance. This color vision system is implemented entirely from non-custom hardware. In our experiment, the task of the vision system is to find four persons, who are wearing red, blue, purple, and green solid color sweaters, respectively. The vision system is able to consistently locate each person by discriminating a target color from the other three target colors and the environment. Our experiment shows that the vision system can perform color recognition successfully with target distances from 3 to 15 feet and a horizontal field of view of 58 degrees. In addition, this paper describes how a target color is specified as a color region in the color component space based on the density distribution function of each color component.
Introduction

In recent years there has been increasing interest in the use of color vision to enhance visual skills for mobile robots.1,2 In addition to pictorial information, color vision provides spatial information that can simplify problems related to image segmentation and scene interpretation. Most color matching or recognition techniques work well within a limited domain but deteriorate significantly with relatively small changes in parameters such as viewing angle, distance, and intensity of illumination.3-5 A mobile robot that interacts with a dynamic environment requires visual skills that are view invariant and insensitive to minor changes in lighting. Despite progress in color constancy algorithms and illumination invariant descriptions,6,7 there has been little discussion of their feasibility for real-time color recognition.

This paper presents a view invariant color recognition technique that performs consistently despite changes in viewing parameters. The technique describes the color of an object by approximating the chromaticity coordinates, which relate only to the chromaticness of a color perception and are, therefore, insensitive to changes in brightness introduced by variations in viewing distance or viewing angle. This color recognition technique can be used to develop visual skills such as tracking, image segmentation, landmark detection, and object recognition. A real-time active color vision system based on this robust color recognition technique is implemented entirely from non-custom hardware and is currently mounted on a mobile robot testbed. In our experiment, the task of the vision system is to find four persons, who are wearing red, blue, purple, and green solid color sweaters, respectively. The vision system is able to consistently locate each person by discriminating a target color from the other three target colors and the environment.

This paper is organized as follows. Section 2 provides an overview of the design of the color vision system. Section 3 discusses the view invariant color recognition technique. Section 4 illustrates how the view invariant color recognition technique is utilized in an active vision system to perform target searching and tracking in real-time. Finally, conclusions are presented in section 5.
System overview

The color vision system (Figure 1) is a VMEbus-based system consisting of four major modules: a host processor, a mass storage device, an image processing module, and a multi-axis motion controller. A 68040 MPU board is used as the host processor of the vision system. The mass storage device includes a 100-Mbyte hard disk and a 3.5" floppy disk drive. The image processing module consists of two Datacube boards: a Digicolor board for data acquisition and display of color video signals, and a MaxVideo 200 board for image processing. A color CCD camera is mounted on a pan/tilt head. The multi-axis motion controller provides azimuth and elevation control of the pan/tilt head. The color camera output video signal is in Y/C (or S-Video) format. The Y/C video signal is converted into RGB component signals and digitized by the Digicolor board. The size of each digitized component signal is 512(H) by 480(V). Each channel of digital video data is transmitted from the Digicolor board to the MaxVideo 200 image processor via a 10-MByte per second digital bus.
Figure 1: The block diagram of the real-time active vision system.
A view invariant color recognition technique

The design of the view invariant color recognition technique described in this paper draws on the Commission Internationale de l'Eclairage (CIE) colorimetric system,8,9 which provides an orderly description and specification of color, an essential part of solving color vision problems. In the CIE system, a color is described by chromaticity coordinates, which can be calculated from the spectral reflectance curve. While it is difficult to implement the CIE system in a real-time vision system, it is possible to approximate the chromaticity coordinates using the RGB component signals. Here, two normalized color components (NCC) are defined by approximating two chromaticity coordinates, respectively. The NCC values relate only to the chromaticness of a color perception and are, therefore, insensitive to changes in brightness or lightness introduced by variations in viewing distance or viewing angle. The NCC values of each pixel in an image frame are computed and compared to a pair of target NCC values. The color matching results of every pixel in an image frame are represented by a binary image, with 1 indicating a positive match and 0 indicating otherwise. The binary image is then convolved with a local averaging filter to eliminate any isolated pixel areas.
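The local averaging step can be sketched in software as follows. This is an illustrative stand-in for the hardware convolution on the MaxVideo 200; the 3-by-3 kernel size and the keep-fraction threshold are assumed values, not figures taken from this paper.

```python
import numpy as np

def remove_isolated(binary, k=3, min_fraction=0.5):
    """Suppress isolated positive pixels by local averaging: keep a
    matched pixel only if at least min_fraction of its k-by-k
    neighborhood also matched.  k and min_fraction are illustrative."""
    h, w = binary.shape
    pad = k // 2
    padded = np.pad(binary.astype(np.float64), pad)
    out = np.zeros_like(binary)
    for i in range(h):
        for j in range(w):
            # neighborhood of (i, j) in the padded image
            if binary[i, j] and padded[i:i + k, j:j + k].mean() >= min_fraction:
                out[i, j] = 1
    return out
```

A lone positive pixel has a neighborhood mean well below the threshold and is dropped, while pixels inside a solid blob survive.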
Normalized color components

The chromaticity of a color is definable by its chromaticity coordinates, defined as the ratios of each tristimulus value of the color to their sum, where the tristimulus values of a color are the amounts of the three primary colors required to give, by additive mixture, a match with the color being considered. The limitation on the choice of primary colors in color matching is that none of the primary colors can be matched by an additive mixture of the other two. The chromaticity coordinates, $r$, $g$, and $b$, are defined by the equations

$$r = \frac{R}{R+G+B}, \qquad g = \frac{G}{R+G+B}, \qquad b = \frac{B}{R+G+B}$$

where $R$, $G$, and $B$ are the tristimulus values corresponding to the three primary colors, red, green, and blue, respectively. Since the sum of the chromaticity coordinates is 1, only two of the three coordinates are needed to describe a color. The NCC values, $\hat{r}$ and $\hat{g}$, are defined by approximating the chromaticity coordinates $r$ and $g$ as follows:

$$\hat{r} = R \cdot H, \qquad \hat{g} = G \cdot H, \qquad H = Q\!\left(\frac{1}{R+G+B}\right)$$

where $Q(\cdot)$ is a linear quantizer that maps a floating point value to an $n$-bit value. If $n$ is large enough, there is a one-to-one correspondence between $(r, g)$ and $(\hat{r}, \hat{g})$. Thus, $\hat{r}$ and $\hat{g}$ are scaled approximations of the chromaticity coordinates $r$ and $g$, respectively.
For real-time implementation, the NCC values are computed for each pixel of each image frame at frame rate using the MaxVideo 200. $R$, $G$, and $B$ are 8-bit values; $H$ can be obtained by mapping $(R+G+B)$, a 10-bit value, to an 8-bit value using a lookup table (LUT).
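The same per-pixel computation can be sketched in software. The function below is an illustrative stand-in for the LUT-based hardware pipeline: it folds the quantization into one step, computing $Q(R/(R+G+B))$ directly rather than $R \cdot Q(1/(R+G+B))$, which is a slightly tighter approximation under the same idea.

```python
import numpy as np

def ncc(rgb, n_bits=8):
    """Compute normalized color components (r_hat, g_hat) per pixel.

    rgb: uint8 array of shape (H, W, 3).  The linear quantizer Q maps
    the chromaticity ratio into the n-bit range [0, 2**n - 1].
    """
    rgb = rgb.astype(np.float64)
    total = rgb.sum(axis=2)
    total[total == 0] = 1.0          # avoid division by zero on black pixels
    scale = (2 ** n_bits) - 1        # n-bit quantizer range
    r_hat = np.round(scale * rgb[..., 0] / total).astype(np.uint16)
    g_hat = np.round(scale * rgb[..., 1] / total).astype(np.uint16)
    return r_hat, g_hat
```

Note that a pure red pixel maps to the maximum r_hat, and any gray pixel maps to roughly a third of the scale in both channels regardless of its brightness, which is exactly the viewing-condition insensitivity the NCC definition provides.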
Color matching

The NCC values of each pixel in an image frame are compared to a pair of target NCC values, $r_o$ and $g_o$, using the following criteria:

$$p_{ij} = \begin{cases} 1 & \text{if } r_o(1-\delta_r) < \hat{r}_{ij} < r_o(1+\delta_r) \text{ and } g_o(1-\delta_g) < \hat{g}_{ij} < g_o(1+\delta_g) \\ 0 & \text{otherwise} \end{cases}$$

$p_{ij}$ is the pixel value at location $(i, j)$ of the resultant binary image. A value of 1 indicates a positive match and a value of 0 indicates otherwise. $\hat{r}_{ij}$ and $\hat{g}_{ij}$ are the NCC values of the pixel at location $(i, j)$ of an image frame. Ideally, $\delta_r$ and $\delta_g$ are chosen such that $r_o(1-\delta_r)$, $r_o(1+\delta_r)$, $g_o(1-\delta_g)$, and $g_o(1+\delta_g)$ together define a rectangular NCC region corresponding to a group of colors that are visually indistinguishable from the color defined by $(r_o, g_o)$. MacAdam's experiments on just-noticeable color differences10,11 show that a region of indistinguishable colors represented in chromaticity coordinates is not rectangular in shape; rather, it has the shape of an ellipse. Thus, $\delta_r$ and $\delta_g$ may be chosen such that the corresponding rectangular region is the best approximation of the MacAdam ellipse of the color being considered. In practice, $\delta_r$ and $\delta_g$ are chosen according to the requirements of an application. For example, in applications like image segmentation, where the goal is to separate an image into regions of a few different colors, choosing too small a value for $\delta_r$ or $\delta_g$ may result in over-segmentation of that image. The selection of $r_o$, $g_o$, $\delta_r$, and $\delta_g$ is further discussed in the next section.
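The rectangular-tolerance test maps directly to a vectorized comparison. A minimal sketch, with the tolerances written as d_r and d_g:

```python
import numpy as np

def match_color(r_hat, g_hat, r_o, g_o, d_r, d_g):
    """Binary match image: 1 where a pixel's NCC values fall inside the
    rectangular tolerance region around the target (r_o, g_o)."""
    in_r = (r_o * (1 - d_r) < r_hat) & (r_hat < r_o * (1 + d_r))
    in_g = (g_o * (1 - d_g) < g_hat) & (g_hat < g_o * (1 + d_g))
    return (in_r & in_g).astype(np.uint8)
```

Because the test is a pair of independent interval checks, it can be precomputed into a lookup table indexed by the quantized NCC values, which is how the frame-rate hardware implementation described later operates.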
Real-time color target tracking

A real-time color target tracking skill for a mobile robot, based on the color recognition technique described in section 3, has been developed. In our experiment, the goal is to track the position of a person who is wearing a specific color sweater. This task is accomplished in two steps: (1) color specific active searching, and (2) color specific active tracking, both of which require real-time coordination and control of the pan/tilt camera head.
How to specify a target color

In this experiment, four solid color sweaters of different colors are used as targets. For each target color, a color region is defined by a set of parameters, $\{r_o, g_o, \delta_r, \delta_g\}$, as described in the previous section. These parameters are obtained via direct measurements. Since NCC is insensitive to the viewing angle and distance between the camera and the target, careful alignment is not required. The NCC values of each pixel within a region of interest (ROI) in an image are measured (Figure 2); $r_o$ and $g_o$ of a target color are determined based on the distribution of NCC measurements. Figures 2b and 2c show
Figure 2: (a) shows an example image frame. The NCC values of each pixel within the ROI are measured. The density distribution histograms of $\hat{r}$ and $\hat{g}$ are shown in (b) and (c), respectively.
the NCC density distribution histograms of the ROI. The distribution histograms resemble the shape of a Gaussian function. Ideally, the NCC distribution histogram of a single color target should be a delta function. In practice, there are variations in the color related to the texture of the target surface, the impurity of the color, and the nonuniformity of illumination. The NCC value at which the peak density occurs is chosen as the target NCC value. The distance between the two NCC values with half the maximum density shown in each distribution histogram gives the upper limit, $l_r$ or $l_g$, for $\delta_r$ or $\delta_g$. Specifically,

$$r_o(1+\delta_r) - r_o(1-\delta_r) \le l_r, \qquad g_o(1+\delta_g) - g_o(1-\delta_g) \le l_g$$

Thus,

$$\delta_r \le \frac{l_r}{2 r_o}, \qquad \delta_g \le \frac{l_g}{2 g_o}$$

In order to discriminate between the four target colors, $\delta_r$ and $\delta_g$ are selected such that the corresponding color region does not overlap with that of any other target color. Figure 3 shows the four color regions in NCC space.
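The peak-and-half-maximum procedure for one NCC channel can be sketched as follows. The 8-bit histogram range and the assumption that the half-maximum region is contiguous around the peak are illustrative choices, not details specified in this paper.

```python
import numpy as np

def target_from_histogram(ncc_values, bins=256):
    """Derive (value_o, delta) for one NCC channel from ROI samples:
    the histogram peak gives the target value, and the full width at
    half maximum gives the upper limit l, with delta <= l / (2 * value_o)."""
    hist, edges = np.histogram(ncc_values, bins=bins, range=(0, 256))
    peak = int(np.argmax(hist))
    half = hist[peak] / 2.0
    # walk outward from the peak while density stays above half maximum
    lo = peak
    while lo > 0 and hist[lo - 1] >= half:
        lo -= 1
    hi = peak
    while hi < bins - 1 and hist[hi + 1] >= half:
        hi += 1
    value_o = (edges[peak] + edges[peak + 1]) / 2.0   # peak bin center
    l = edges[hi + 1] - edges[lo]                     # full width at half max
    delta = l / (2.0 * value_o)
    return value_o, delta
```

The returned delta is the upper limit implied by the inequality above; as the text notes, the tolerance actually used may be tightened further so that the four target color regions do not overlap.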
Figure 3: The color regions of four target colors in NCC space.

Color specific active searching

The specifications of each target color as described above are stored in a LUT such that color matching (section 3.2) can be performed on every pixel of an image in real-time (at frame rate). The MaxVideo 200 board provides the LUT with multiple banks that allow the storage of the specifications of up to four target colors. In the searching operation, a target color is specified by activating the corresponding bank of the LUT. The color matching process produces a binary image, in which a 1 indicates a positive match and a 0 indicates otherwise. Since the target is a solid color sweater, positively matched pixels should form a blob of a significant size. Arbitrarily isolated pixels are considered noise (or false detections) and are eliminated by local averaging filtering. Then, the color target is extracted from the filtered image via a blobbing process. The size (in pixels) of the blob is compared to a threshold to ensure the detected object is significant. In this experiment, the blob must consist of at least 10 pixels. Figure 4 shows an example input image and the color recognition result. The input image (4a) contains two persons, dressed in blue and purple. The specified target color is purple. The resultant image of the color recognition process (4b) shows a blob corresponding to the purple sweater.
Figure 4: (a) the input image, (b) color recognition result.
While the color matching process is underway, the vision system directs the camera head to pan at a constant velocity such that the whole environment can be searched. In the case of single color tracking, if the color target is found, the panning motion is suspended and the tracking operation is initiated. The vision system can be programmed to search for multiple target colors in a sequential manner. Switching target colors is achieved by activating a different bank of the LUT and can be easily accomplished during the vertical blanking period.
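The blobbing and size-threshold step can be sketched with a simple connected-component search. The system performs this with dedicated image-processing hardware; the breadth-first labeling below is a software stand-in for illustration, using the paper's 10-pixel minimum size.

```python
import numpy as np
from collections import deque

def largest_blob(binary, min_size=10):
    """Return (pixel_count, (cx, cy)) for the largest 4-connected blob of
    positive matches, or None if no blob reaches min_size."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    best = None
    for i in range(h):
        for j in range(w):
            if binary[i, j] and not seen[i, j]:
                # breadth-first search over one connected component
                pixels = []
                queue = deque([(i, j)])
                seen[i, j] = True
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
                if len(pixels) >= min_size and (best is None or len(pixels) > best[0]):
                    ys, xs = zip(*pixels)
                    best = (len(pixels), (sum(xs) / len(xs), sum(ys) / len(ys)))
    return best
```

The centroid returned here is the quantity the tracking step compares against the dead-band window at the image center.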
Active tracking: head and eye coordination

When a target color blob is identified, the approximate location of the centroid $(\bar{x}, \bar{y})$ of that blob is computed. The goal is to keep the target within the camera field of view at all times. This is accomplished by coordinating the pan/tilt movement of the camera head with the target centroid location in real-time. The $(\bar{x}, \bar{y})$ coordinates are compared to the boundaries corresponding to a 128(H) by 120(V) window located at the center of the 512(H) by 480(V) image for each image frame. The camera head remains idle if $(\bar{x}, \bar{y})$ is inside the specified window; otherwise, an appropriate pan or tilt movement of the camera head, which would cause the centroid location to move towards the center of the image, is initiated. Occasionally, the vision system can lose track of the target if the target moves faster than the camera head can move. If that happens, the vision system is programmed to repeat the search operation as described earlier.
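The dead-band decision can be sketched as a per-frame comparison against the centered window. The window and image dimensions are the paper's; the command names and sign conventions are illustrative assumptions.

```python
def head_command(cx, cy, img_w=512, img_h=480, win_w=128, win_h=120):
    """Decide pan/tilt action from the blob centroid (cx, cy): stay idle
    while the centroid lies inside a centered dead-band window, otherwise
    move the head to drive the centroid back toward the image center."""
    left = (img_w - win_w) // 2      # 192 for the paper's geometry
    right = left + win_w             # 320
    top = (img_h - win_h) // 2       # 180
    bottom = top + win_h             # 300
    pan = 'pan_right' if cx >= right else 'pan_left' if cx < left else None
    tilt = 'tilt_down' if cy >= bottom else 'tilt_up' if cy < top else None
    return pan, tilt
```

A centroid near the image center produces no motion, which is what keeps the head idle while the target stays roughly centered.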
Results

In our experiment, the vision system is mounted on a mobile robot. The task of the vision system, in conjunction with the mobile robot, is to seek out four persons, who are wearing red, blue, purple, and green solid color sweaters, respectively. The visual skills provided by the vision system are integrated with the robot motion via an intelligent software architecture12 that can reconfigure the behavior of the robot in response to environmental changes. The experiment environment is a laboratory with typical office furniture, workstations, mobile robots, and other laboratory equipment. A 6mm lens is used in the experiment. The horizontal and vertical fields of view of the camera are approximately 58 and 45 degrees, respectively.

The vision system is consistently able to locate each person by discriminating a target color from the other three target colors and the environment. This experiment shows that the vision system can perform color recognition successfully with target distances from 3 to 15 feet. The color recognition technique is effective for the complete field of view of the camera. The vision system is able to detect a color target as soon as it enters the camera field of view. The vision system can be configured to recognize any color and extract the corresponding color target from the background, provided the color is distinctive from the environment. If the tracking target moves to an area where the target color is indistinctive from the background, the intelligent software is responsible for invoking new strategies to accomplish the task.

There is a limitation in this color recognition technique due to the definition of the NCC. Since the NCC values relate only to the chromaticness of a perceived color and are insensitive to changes in intensity, NCC is not effective in discriminating colors with wide spectra. For example, a white target and a black target may have similar NCC values and, therefore, cannot be distinguished without using additional features such as their intensity values.
Conclusion

This paper presented a real-time active color vision system designed for a mobile robot. In particular, a robust view invariant color recognition technique was described. This color recognition technique is based on the NCC measurements of each pixel in an image. NCC values relate only to the chromaticness of color perception and are, therefore, insensitive to changes in brightness or lightness introduced by variations in viewing distance or viewing angle. A real-time color target tracking skill for a mobile robot based on this technique has been implemented and tested. Experimental results have demonstrated that, using NCC measurements alone, the vision system can consistently distinguish single color targets of various colors from the background, provided the color is distinctive from the environment. The color recognition technique based on NCC measurements provides an essential component for developing color visual skills. Other real-time visual skills that are under development include multiple color tracking, active color stereo vision, and landmark detection.
References

1. C. Thorpe, M. H. Hebert, T. Kanade, and S. A. Shafer, "Vision and navigation for the Carnegie-Mellon Navlab," IEEE Transactions on Pattern Analysis and Machine Intelligence, 10, 362-372 (1988).

2. H. Mori and M. Sano, "A guide dog robot Harunobu-5: following a person," in Proceedings, IEEE/RSJ International Workshop on Intelligent Robots and Systems, 397-402 (1991).

3. M. Nagao, T. Matsuyama, and Y. Ikeda, "Region extraction and shape analysis in aerial photographs," Computer Graphics and Image Processing, 10, 195-223 (1979).

4. T. Binford, "Survey of model-based image analysis systems," The International Journal of Robotics Research, 1, 18-64 (1982).

5. M. Swain and D. Ballard, "Color indexing," International Journal of Computer Vision, 7, 11-32 (1991).

6. G. Healey and D. Slater, "Using illumination invariant color histogram descriptors for recognition," in Proceedings, 1994 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 355-360 (1994).

7. G. Healey, S. Shafer, and L. Wolff, Physics-Based Vision: Principles and Practice, Color, Jones and Bartlett, Boston, 1992.

8. G. Wyszecki, Color Science: Concepts and Methods, Quantitative Data and Formulae, Wiley, New York, 1982.

9. F. W. Billmeyer and M. Saltzman, Principles of Color Technology, Wiley, New York, 1981.

10. D. L. MacAdam, "Visual sensitivities to color differences in daylight," Journal of the Optical Society of America, 32, 237 (1942).

11. D. L. MacAdam, "The graphical representation of small color differences," Journal of the Optical Society of America, 33, 675 (1943).

12. R. P. Bonasso and D. Kortenkamp, "An intelligent agent architecture in which to pursue robot learning," in Proceedings of the MLC-COLT '94 Robot Learning Workshop, 21-28 (1994).