SUPPORT TECHNOLOGIES FOR ROBOTICS VISION SYSTEMS
By
Christopher C. Jobes, Ph.D.
Electrical & Electronic Systems
U.S. Bureau of Mines, P.O. Box 18070, Pittsburgh, PA 15236
CONTENTS

Abstract
Introduction
Technology review
    Relation to human vision
    Approaches to computer vision
        Basic paradigms for computer vision
            Hierarchical bottom-up approach
            Hierarchical top-down approach
            Heterarchical approach
            Blackboard approach
        Levels of representation
    Current state of the art
        Low level vision
        High level vision
Trade study
    Players
    Research trends in vision systems
    Future directions
        Techniques
        Hardware and architecture
        Modeling and programming
        Knowledge acquisition
        Sensing
        Industrial vision systems
        Future applications
Application to mining automation
    Appropriate vision systems
        Dust
        Lighting
        Feature extraction
        Operation speed
    Utilization in mining automation
        Pose determination
        Inspection
References
Appendix A. -- Examples of applications of computer vision now under way
ILLUSTRATIONS
1. Framework for early and intermediate steps in a theory of visual information processing.
2. Truck (a) image and (b) edges detected by the Nevatia-Babu edge detector.
3. Example of a 2-1/2 D sketch.
4. Model-based interpretation of images.
5. Hierarchical bottom-up approach.
6. Hierarchical top-down approach.
7. Heterarchical approach.
8. Blackboard approach.
9. Organization of a visual system.
SUPPORT TECHNOLOGIES FOR ROBOTICS VISION SYSTEMS

By

Christopher C. Jobes1

ABSTRACT
This letter report presents a study of vision systems as a support technology for the Fundamental Investigation of Robotics in Mining effort. It looks at the technology of vision itself, the current vision systems trade, and the possible application of this technology to mining automation. Current vision technology was explored to determine current efforts to define a relationship to human vision. It was also searched to determine the different approaches to computer vision. The current state of the art was then examined for both high and low level vision. A trade study was also performed to determine the players in the computer vision arena, the current research trends in computer vision, and the future direction of research into computer vision systems. The application of these vision systems to mining automation was then considered. Appropriate vision systems for the mining environment were sought and an attempt was made to determine their utilization in mining automation.

INTRODUCTION

Computer vision - visual perception employing computers - shares with expert systems the role of being one of the most popular topics in artificial intelligence today. The computer vision field is multifaceted, having many participants with diverse viewpoints. However, the field is still in the early stages of development: organizing principles have not fully crystallized, and the associated technology has not yet been completely rationalized. Nevertheless, commercial vision systems for inspection and guidance tasks, and other systems (see Appendix A), are beginning to be employed in military, cartographic, and image interpretation applications (1).2

Exploring vision systems as a support technology for robotics in mining automation requires that several issues be examined. A technology review to determine the current systems' relationship to human vision, the various approaches to computer vision, and the current state of the art in computer vision should be performed to provide an overview of current vision system technology. A trade study to determine the players in research and development of vision systems, research trends in model-based vision systems, and future directions would present a good idea of what is currently available for use. Finally, the application of vision systems to mining automation should be examined for its relevance and its possible utilization.

1Mechanical Engineer, Pittsburgh Research Center, Bureau of Mines, Pittsburgh, PA.

2Numbers underscored in parentheses are references at the end of this report.
TECHNOLOGY REVIEW

Computer (computational or machine) vision can be defined as perception by a computer based on visual sensory input (1). Barrow and Tenenbaum (2) state:

    Vision is an information-processing task with well-defined input and output. The input consists of arrays of brightness values, representing projections of a three-dimensional scene recorded by a camera or comparable imaging device. Several input arrays may provide information in several spectral bands (color) or from multiple viewpoints (stereo or time sequence). The desired output is a concise description of the three-dimensional scene depicted in the image, the exact nature of which depends upon the goals and expectations of the observer. It generally involves a description of objects and their interrelationships, but may also include such information as the three-dimensional structures of surfaces, their physical characteristics (shape, texture, color, material), and the locations of light and shadow sources.

Relation to Human Vision

Some researchers take the view that "artificial intelligence is (or ought to be) the study of information processing problems that characteristically have their roots in some aspects of biological information processing (3)." They developed a computational theory of vision based on their study of human vision. Figure 1 represents the transition from the raw image through the primal sketch (see figure 2) to the 2-1/2 dimensional sketch (shown in figure 3), which contains information on local surface orientations, boundaries, and depths.

There are strong indications that the interpretive planning areas of the human brain set up a context for processing the input data (4, 5). The brain then uses visual and other cues from the environment to draw in past knowledge to generate an internal representation and interpretation of the scene. This knowledge-based, expectation-guided approach to vision is now appearing in advanced AI computer vision systems.

Approaches to Computer Vision

In going from a scene to an image (an array of brightness values), the image encodes much information about the scene, but the information is obscured in the single brightness value at each point. In projecting
onto the two-dimensional image, information about the three-dimensional structure of the scene is lost. To decode the brightness values and recover a scene description, it is necessary to employ a priori knowledge embodied in models of the scene domain, the illumination, and the imaging process (6).
As shown in figure 4, computer vision is an active process that uses these models to interpret the sensory data. To accommodate the diversity of appearance found in real imagery, a high-performance, general-purpose system must embody a great deal of knowledge in its models.

Basic Paradigms for Computer Vision

In broad terms, an image-understanding system starts with the array of picture-element amplitudes that define the computer image and, using stored models (either specific or generic), determines the content of a scene. Typically, various symbolic features such as lines and areas are first determined from the image. These are then compared with similar features associated with stored models to find a match when specific objects are being sought. In more generic cases, it is necessary to determine various characteristics of the scene and, using generic models, determine from geometric shapes and other factors (such as allowable relationships between objects) the nature of the scene content.

A variety of paradigms have been proposed to accomplish these tasks in image-understanding systems. These paradigms are based on a common set of broadly defined processing and manipulation elements: feature extraction, symbolic representation, and semantic interpretation. The paradigms differ primarily in how these elements are organized and controlled, and in the degree of artificial intelligence and knowledge employed (1).

Hierarchical Bottom-Up Approach

Figure 5 is a block diagram of a hierarchical paradigm of an image-understanding system that employs a bottom-up approach. First, primitive features are extracted from the array of picture-element intensities that constitute the observed image. Next, this set of features is passed on to the semantic interpretation stage, where the features are grouped into symbolic representations. The resultant symbol set of lines, regions, and so on, in combination with a priori stored models, is then operated on (i.e., semantically interpreted) to produce an application-dependent scene description. Bottom-up refers to the sequential processing and control operation of the system starting with the input image.

The key to success in this approach lies in a sequential reduction in dimensionality from stage to stage. This is vital because the relative processing complexity is generally greater at each succeeding stage. The hierarchical bottom-up approach can be developed successfully for domains with simple scenes made up of only a limited number of previously known objects.
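To make the data flow concrete, the following Python sketch traces the three stages named above - feature extraction, symbolic representation, and semantic interpretation - over a small intensity array. It is an illustrative sketch only, not part of the original report: the gradient threshold, the crude region descriptor, and the two-entry model library are all assumptions made for the example.

    import numpy as np

    def extract_edges(image, threshold=0.25):
        """Feature extraction: mark pixels with a strong local intensity gradient."""
        gy, gx = np.gradient(image.astype(float))
        magnitude = np.hypot(gx, gy)
        return magnitude > threshold * magnitude.max()

    def label_regions(mask):
        """Symbolic representation: group 4-connected foreground pixels into regions."""
        labels = np.zeros(mask.shape, dtype=int)
        count = 0
        for seed in zip(*np.nonzero(mask)):
            if labels[seed]:
                continue
            count += 1
            stack = [seed]
            while stack:
                r, c = stack.pop()
                if not (0 <= r < mask.shape[0] and 0 <= c < mask.shape[1]):
                    continue
                if not mask[r, c] or labels[r, c]:
                    continue
                labels[r, c] = count
                stack += [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
        return labels, count

    def describe(labels, region):
        """A crude symbolic descriptor for a region: pixel area and aspect ratio."""
        rows, cols = np.nonzero(labels == region)
        return np.array([rows.size, (cols.ptp() + 1) / (rows.ptp() + 1)])

    def interpret(image, models):
        """Semantic interpretation: match each region descriptor to the nearest stored model."""
        labels, count = label_regions(extract_edges(image))
        scene = []
        for region in range(1, count + 1):
            d = describe(labels, region)
            name = min(models, key=lambda m: np.linalg.norm(models[m] - d))
            scene.append((region, name))
        return scene

    # Hypothetical model library of two previously known objects (area, aspect ratio).
    MODELS = {"roof bolt": np.array([40.0, 0.2]), "rail car": np.array([400.0, 3.0])}
    # Example use on a synthetic image: interpret(np.random.rand(64, 64), MODELS)

The point of the sketch is the reduction in dimensionality at each stage: a 64 x 64 intensity array becomes a handful of regions, each summarized by a two-number descriptor before any model comparison takes place.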
Hierarchical Top-Down Approach

This approach (usually called hypothesize and test), shown in figure 6, is goal directed, the interpretation stage being guided in its analysis by trial or test descriptions of a scene. An example would be using template matching (matched filtering) to search for a specific object or structure within the scene. Matched filtering is normally performed at the pixel level by cross-correlation of an object template with an observed image field. It is often computationally advantageous, because of the reduced dimensionality, to perform the interpretation at a higher level in the chain by correlating image features or symbols rather than pixels. (A small correlation sketch follows this group of paradigms.)

Heterarchical Approach

Hierarchical image-understanding systems are normally designed for specific applications. Thus they tend to lack adaptability. A large amount of processing is also usually required. Often much of this processing is wasted in the generation of features and symbols not required for the analysis of a particular scene (1). A technique to avoid this problem is to establish a central monitor to observe the overall performance of the image-understanding system and then issue commands to the various system elements to modify their operation to maximize system performance and efficiency. Figure 7 is a block diagram of an image-understanding system that achieves heterarchical operation by distributed feedback control.

Blackboard Approach

Another image-understanding system configuration, called the blackboard model, has been proposed (8). Figure 8 is a simplified representation of this approach, in which the various system elements communicate with each other via a common working data storage area called the blackboard. Whenever an element performs a task, its output is put into the common data storage area, which is independently accessible by all other elements. The individual elements can be designed to act autonomously to further the common system goal as required. The blackboard system is particularly attractive in cases where several hypotheses must be considered simultaneously and their components need to be kept track of at various levels of representation.
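As a concrete illustration of the matched-filtering step in the hypothesize-and-test approach above, the Python sketch below scores every placement of an object template against an image by normalized cross-correlation and accepts the best placement only if it clears a threshold. It is a minimal sketch under assumed parameters (the 0.8 acceptance threshold and the synthetic test pattern are arbitrary), not a description of any particular system.

    import numpy as np

    def normalized_cross_correlation(image, template):
        """Slide the template over the image; score each placement by normalized
        cross-correlation, where 1.0 means the hypothesis matches perfectly."""
        image = image.astype(float)
        t = template.astype(float) - template.mean()
        t_norm = np.sqrt((t * t).sum())
        th, tw = template.shape
        scores = np.full((image.shape[0] - th + 1, image.shape[1] - tw + 1), -1.0)
        for r in range(scores.shape[0]):
            for c in range(scores.shape[1]):
                window = image[r:r + th, c:c + tw]
                w = window - window.mean()
                w_norm = np.sqrt((w * w).sum())
                if w_norm > 0 and t_norm > 0:
                    scores[r, c] = (w * t).sum() / (w_norm * t_norm)
        return scores

    def test_hypothesis(image, template, accept=0.8):
        """Hypothesize that the object appears in the image, then test: return the
        best location only if its correlation score clears the acceptance threshold."""
        scores = normalized_cross_correlation(image, template)
        r, c = np.unravel_index(np.argmax(scores), scores.shape)
        return (r, c) if scores[r, c] >= accept else None

    # Synthetic test: a 4 x 4 bright block hidden in a 32 x 32 image, and a template
    # containing the same block.  The hypothesis is accepted at offset (5, 9).
    img = np.zeros((32, 32)); img[7:11, 11:15] = 1.0
    tmpl = np.zeros((8, 8)); tmpl[2:6, 2:6] = 1.0
    print(test_hypothesis(img, tmpl))   # -> (5, 9)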
Levels of Representation

A computer vision system, like human vision, is commonly considered to be naturally structured as a succession of levels of representation. A way in which to view the organization of a general-purpose vision system (shown in figure 9) is to divide the figure into two parts (9). The first is image oriented (iconic), domain independent, and based on the image data (data driven). The second part of the figure is symbolic, dependent on the domain and the particular goal of the vision process.
The first portion takes the image, which consists of an intensity array of picture elements (pixels, e.g., 1000 x 1000), and converts it into image features such as edges and regions. These are then converted into a set of parallel intrinsic images, one each for distance (range), surface orientation, reflectance, and so on. The second part of the system segments these into volumes and surfaces dependent on the system's knowledge of the domain and the goal of the computation. Using domain knowledge and the constraints associated with the relations among objects in this domain, objects are identified and the scene analyzed consistent with the system's goal.

Current State of the Art

Human vision is the only available example of a general-purpose vision system. However, thus far, not many AI researchers have taken an interest in the computations performed by natural vision systems, but this situation is changing. Many researchers believed that, to a first approximation, the human visual system is subdivided into modules specializing in visual tasks (10). There is also evidence that people do global processing first and use it to constrain local processing. Considerable information now exists about lower-level visual processing in human beings. However, as we progress up the human visual computing hierarchy, the exact nature of the appropriate representations becomes subject to dispute. Thus overall human visual perception is still very far from being understood.

Low Level Vision

Although methods for powerful high-level visual analysis are still in the process of being determined, insights into low-level vision are emerging. The basic physics of imaging, and the nature of constraints in vision and their use in computation, are fairly well understood. Detailed programs for vision modules, such as shape from shading (a method of estimating relative distance and surface orientation from a single image using heuristics based on the rate of change of brightness across the image) and shape from optical flow in an image (associated with object motion), have begun to appear. Also, representation issues are now better understood.

For low-level processing, many recent algorithms take the form of parallel computations involving local interactions. One popular approach having this characteristic is relaxation, in which local computations are iteratively propagated to try to extract global features (11). These locally parallel architectures are well suited to the rapid parallel processing techniques now available.
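The following Python fragment gives a feel for the relaxation idea: a handful of known range readings are propagated across an image grid purely by repeated local averaging, each cell interacting only with its four neighbours. It is an illustrative sketch (the grid size, the two readings, and the iteration count are arbitrary assumptions), not a reproduction of any algorithm cited above.

    import numpy as np

    def relax_depth(shape, known, iterations=500):
        """Propagate sparse depth measurements across a grid by simple relaxation:
        each unknown cell is repeatedly replaced by the mean of its four
        neighbours, while the measured cells stay fixed."""
        depth = np.zeros(shape)
        fixed = np.zeros(shape, dtype=bool)
        for (r, c), z in known.items():
            depth[r, c] = z
            fixed[r, c] = True
        for _ in range(iterations):
            padded = np.pad(depth, 1, mode="edge")
            neighbours = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                          padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
            depth = np.where(fixed, depth, neighbours)
        return depth

    # Illustrative use: two range readings interpolated over a 16 x 16 patch.
    surface = relax_depth((16, 16), {(2, 2): 1.0, (13, 13): 3.0})

Every update step uses only local information, so each cell could be assigned to its own processor; the global surface emerges from the repeated local exchanges, which is exactly the property that makes relaxation attractive for parallel hardware.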
High Level Vision

Although the concept of intelligence is somewhat vague, particularly when referring to a machine, it is not difficult to conceptualize behavior characterized as intelligent. Several characteristics are: the ability to extract pertinent information from a background of irrelevant details; the capability to learn from examples and to generalize this knowledge so that it will apply in new and different circumstances; the ability to infer facts from incomplete information; and the capability to generate self-motivated goals and to formulate plans for meeting these goals.

While it is possible to design and implement a vision system with these characteristics in a limited environment, it is not yet known how to endow it with a range and depth of adaptive performance that comes even close to emulating human vision. Although research in biological systems is continually uncovering new and promising concepts, the state of the art in machine vision is for the most part based on analytical formulations tailored to fit specific tasks (12).

TRADE STUDY

Although special purpose systems have thus far been the most effective, successful vision applications are now becoming commonplace and are expanding. Vision manufacturers are now beginning to provide easier user programming, friendlier user interfaces, and systems engineering support to prospective users. Many firms are now entering the industrial vision field, with technical leapfrogging being common due to rapidly changing technology.
Players

Rosenfeld, at the University of Maryland, issues a yearly bibliography, arranged by subject matter, related to the computer processing of pictorial information. The typical issue now covers over 1,000 references. These references include: research-oriented universities funded under DARPA Image-Understanding Programs, other active universities, nonprofit organizations, the U.S. Government, and commercial vision system developers, including industrial vision companies, large diversified manufacturers, and robot manufacturers.

Research Trends in Vision Systems

Most research efforts in vision have been directed at exploring various aspects of vision, or toward generating particular processing modules for a step in the vision process, rather than at devising general-purpose systems. However, there have been two major U.S.
efforts in general-purpose systems: the ACRONYM3 system at Stanford University under the leadership of T. Binford, and the VISIONS system at the University of Massachusetts at Amherst under A. Hanson and E. Riseman. Other research efforts in model-based vision systems have been summarized (13).

All the research computer vision systems are individually crafted by their developers, reflecting the developers' backgrounds, interests, and domain requirements. All, except ACRONYM (and to an extent, 3-D MOSAIC), use image (two-dimensional) models and are viewpoint dependent. Models are described mostly by semantic networks, although feature vectors are also utilized. The systems, capitalizing on their choice to limit their observations to only a few objects, use predominantly the top-down interpretation-of-images approach, relying heavily on prediction.
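The semantic-network style of model mentioned above describes an object as named parts with attributes and relations between them. The toy Python sketch below conveys that flavour only; the part names, attribute ranges, and the single "above" relation are invented for illustration and are not taken from ACRONYM, VISIONS, or 3-D MOSAIC.

    # A toy semantic-network-style model: parts with attribute ranges plus one
    # relation between parts.  Entirely illustrative; real research systems use
    # far richer volumetric models and prediction machinery.
    MODEL = {
        "parts": {
            "canopy": {"area": (150, 400)},
            "post":   {"area": (20, 80)},
        },
        "relations": [("canopy", "above", "post")],
    }

    def fits(region, constraints):
        """Does a detected region satisfy a part's attribute ranges?"""
        return all(lo <= region[attr] <= hi for attr, (lo, hi) in constraints.items())

    def interpret(regions, model):
        """Assign scene regions to model parts, then check the stated relations."""
        assignment = {}
        for part, constraints in model["parts"].items():
            candidates = [name for name, reg in regions.items() if fits(reg, constraints)]
            if not candidates:
                return None          # a required part is missing
            assignment[part] = candidates[0]
        for a, relation, b in model["relations"]:
            if relation == "above":
                if regions[assignment[a]]["row"] >= regions[assignment[b]]["row"]:
                    return None      # relation violated, reject the interpretation
        return assignment

    # Hypothetical detected regions described by centroid row and pixel area.
    regions = {"r1": {"row": 10, "area": 300}, "r2": {"row": 40, "area": 50}}
    print(interpret(regions, MODEL))   # -> {'canopy': 'r1', 'post': 'r2'}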
Future Directions

As the field of computer vision unfolds, the following future trends are expected (1).

Techniques

Although most industrial vision systems have used binary representations, the increased use of gray scales can be expected because of their potential for handling scenes with cluttered backgrounds and uncontrolled lighting, and because the memory required for this more sophisticated approach is getting much less expensive. Recent theoretical work on monocular shape interpretation from images (shape from shading, texture, etc.) makes it appear more promising that general mechanisms for generating spatial observations from images will be available by the 1990s to support general vision systems. Successful techniques (such as stereo and motion parallax) for deriving shape and/or motion from multiple images should also be available in the early 1990s. The mathematics of image understanding will continue to become more sophisticated. Enlargement will continue of the links now growing between image understanding and theories of human vision.

Hardware and Architecture

Hardware and software are now available that enable real-time operation in simple situations (such as inspection of parts). By the early 1990s, hardware and software should be available for real-time operations for robots and other activities requiring recognition, and position and orientation information.

3References to specific products do not imply endorsement by the Bureau of Mines.
Fast raster-based pipeline preprocessing hardware to compute low-level features in local regions of an entire scene is now becoming available and should find general use in commercial vision systems within the next several years. As processing at virtually all visual levels seems inherently parallel, parallel processing is a wave of the future (but not the entire answer). Relaxation and constraint analysis techniques are on the increase and will be increasingly reflected in future architectures.

Modeling and Programming

Three-dimensional modeling is now emerging, arising largely from computer aided design/computer aided manufacturing (CAD/CAM) technology. Three-dimensional CAD/CAM data bases will be integrated with industrial vision systems to realistically generate synthesized images for matching with visual inputs. Illumination models, shading, and surface property models will be increasingly incorporated into visual systems. Volumetric models that allow prediction and interpretation at the level of volumes, rather than images, will see greater utilization. High-level vision programming languages (such as Automatix's RAIL) that can be integrated with robot and industrial manufacturing languages are now beginning to appear and will become commonplace in the early 1990s. Generic representations for amorphous objects (such as trees) have been utilized experimentally and should become generally available by the end of the 1980s.

Knowledge Acquisition

Strategies for indexing into a large data base of models should be available by the end of the 1980s. "Training by being told" will supplement "training by example" as computer graphics techniques and vision programming languages become more common.

Sensing
An important area of development is three-dimensional sensing. Several current industrial vision systems are already employing structured light for three-dimensional sensing. A number of new innovative techniques in this area are expected to appear by the early 1990s. More active vision sensors, such as scanning laser radars, are now being explored and should find substantial applications by the early 1990s.

Industrial Vision Systems

Increased use of advanced vision techniques in industrial vision systems will be seen, including gray scale imagery. There is now a shortening time lag between research advances and their applications in industry. It is anticipated that in the future this time lag may be as little as two years. Advanced electronics hardware at reduced cost is increasing the capabilities and speed of industrial vision systems while also reducing their cost. It is anticipated that special lighting and active sensing will play an increasing role in industrial vision. Common programming languages and improved interface standards will, within the next several years, enable easier integration of vision into robots and into the industrial environment.

Future Applications

It is anticipated that about one-fourth of all industrial robots will be equipped with some form of vision system by the early 1990s. It is likely that within the next decade on the order of half of all industrial inspection activities requiring vision will be done with computer vision systems. New vision system applications in a wide variety of areas, as yet unexplored, will begin to appear. Computer vision will also play a large role in future military applications.

APPLICATION TO MINING AUTOMATION

In the industrial manufacturing arena, sensory systems can be employed to circumvent the requirement that the workpiece be in a prescribed position and orientation (pose) for the robot to operate upon it. Vision provides perhaps the most flexible approach to avoiding all the fixturing that would be required to achieve a fixed pose (1). In the mining automation scenario, vision could play a large part in the gathering of sensory information. There are many jobs that could be automated easily if an appropriate vision system were available. However, reality dictates that one look at the conditions in the mine that would adversely affect a vision system and then choose the best system based on this information.

Appropriate Vision Systems

When selecting a vision system for an underground mine, one should take into consideration the effects of dust, lighting, the number and orientations of features to be recognized, and the speed of operation required for the given task.
Dust

It is sad but true that, even in a laboratory with near clean room conditions, present-day vision systems are still not what a general purpose vision system should be. Thus, one could hypothesize that if a vision system does not perform as well as is needed to
complete a given task in a clean air environment, it is even less likely to be able to perform that task in a dusty and moist environment. The refraction, reflection, and other properties of coal dust, rock dust, and water vapor would certainly have an obscuring effect on any vision system. It is difficult enough for a human operator, who arguably has the best general purpose vision system known to exist, to operate a machine via teleoperation in this type of environment.

Lighting

There are several things to consider when discussing the lighting for a vision system: the light source(s) and the optical quality of the reflecting surface. The general purpose vision system frequently requires a structured light source to simplify its task. In a mine, the absence of other light sources would be a benefit if it were always true. If any other light source were to intrude upon the workplace, some of the features that the vision system requires might be washed out due to a change in the contrast of a portion of the workspace.

The optical qualities of the average piece of equipment or mine surface could be described as nonhomogeneous at best. Peeling paint, coal dust, rock dust, hydraulic fluid, grease, and a host of other contaminants do their best to obscure even the most significant features of an object. The average surface in a mine may be coal, shale, fireclay, or other types of rock strata and partings covered with rock dust, mushrooms, mud, gob, water, piles of coal sloughed off the ribs, piles of roof rock from minor falls, etc. To say the least, feature extraction in this type of environment is frequently difficult enough for humans, and therefore much more so for vision systems.

Feature Extraction

In a mine, the number and orientation of objects to be identified vary widely. The average mine has over a hundred pieces of equipment of different designs, many different types of roof support schemes (e.g., cribs, posts, etc.), and any of these may appear in any orientation at any time if the workplace is a mobile one. By nature of the paradigms that vision systems use, the fewer the features and objects to identify, the faster the computation speed. The laboratory systems that characterize general purpose vision still are not capable of identifying more than a dozen or so objects in real time. Thus the paradigms that may have the best chance for success in this type of environment are the heterarchical approach and the blackboard approach.

Operation Speed

The type of vision system to be chosen also depends on the speed of the process being monitored. If the scene changes rapidly, then a real-time processing system is required. If the scene changes slowly, a system that is slower might be more appropriate and cost effective.
Some operations in the faster category would be the use of vision for mobile equipment in motion, observing a continuous miner or roof bolter in operation (depending on what purpose the observation is to serve), and the monitoring of haulage equipment such as conveyors and rail cars. If the purpose of the vision system is inspection, maintenance, or the determination of part position and orientation in a bin for a manipulator, then a slower vision system may be of use.

Utilization in Mining Automation

Vision systems would be very useful in a mining environment provided that they become fast enough and robust enough. Any place that a miner is used today could conceivably utilize a vision system. The most common task of a vision system is to determine the pose of an object after recognizing that object in a scene. Another task would be the inspection of mine conditions.

Pose Determination

The determination of pose requires ranging and orientation recognition. Range can be determined in four principal ways: stereo, triangulation, active ranging (e.g., time of flight of light or sound), and optical focusing. Orientation can be determined by observing the relationship of three (or more) known object points that are not collinear in the viewing field, provided that the relative ranges connecting these points are known; by deriving surface normals using the intrinsic image concept (2); or by deducing orientation based on the response of lighting on the object utilizing known characteristics of the object (an approach simplified if structured lighting, such as sheets of light, is used). A small ranging sketch follows at the end of this section.

Inspection

The inspection of mine conditions is unlike anything tried to date, but it stands to reason that if humans can do it, the application of vision systems and an expert system in mine conditions and geology could approximate the required level of expertise. It would be the task of such a system to inspect roof and rib conditions, determine local geology for the mine plan, and gather information on floor conditions for the section controller to schedule cleanup.
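As a worked example of the first two ranging methods listed under Pose Determination, the Python sketch below computes range from stereo disparity (depth = focal length x baseline / disparity for a rectified camera pair) and from active triangulation with a sheet of light. The camera and geometry values are illustrative assumptions only, not parameters of any Bureau of Mines system.

    import math

    def range_from_stereo(focal_length_px, baseline_m, disparity_px):
        """Classic stereo relation for a rectified pair: depth = f * B / d."""
        if disparity_px <= 0:
            raise ValueError("zero disparity: object at infinity or mismatched features")
        return focal_length_px * baseline_m / disparity_px

    def range_from_sheet_of_light(baseline_m, projector_angle_deg, camera_angle_deg):
        """Active triangulation with a light sheet: the projector ray and the camera
        ray, separated by a known baseline, intersect at the illuminated surface
        point; the law of sines gives its perpendicular distance from the baseline."""
        a = math.radians(projector_angle_deg)   # angle of the light plane from the baseline
        b = math.radians(camera_angle_deg)      # angle of the imaged stripe point from the baseline
        return baseline_m * math.sin(a) * math.sin(b) / math.sin(a + b)

    # Illustrative numbers only: 800-pixel focal length, 0.3 m baseline, 12-pixel disparity,
    # and a light sheet rig with a 0.5 m baseline.
    print(round(range_from_stereo(800, 0.3, 12), 2), "m")          # -> 20.0 m
    print(round(range_from_sheet_of_light(0.5, 60, 70), 2), "m")   # -> 0.53 m

Active triangulation of this kind is the geometry behind the structured ("sheet of light") sensing mentioned above, which is one reason it is attractive where dust and poor contrast make passive stereo matching unreliable.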
REFERENCES

1. Gevarter, W. B. Intelligent Machines: An Introductory Perspective of Artificial Intelligence and Robotics. Prentice-Hall, Inc., 1985.

2. Barrow, H. G. and J. M. Tenenbaum. Computational Vision. Proceedings of the IEEE, Vol. 65, No. 5, May 1981, pp. 573-595.

3. Marr, D. and H. Nishihara. Visual Information Processing: Artificial Intelligence and the Sensorium of Sight. Technology Review, Oct. 1978, pp. 28-47.

4. Gevarter, W. B. A Wiring Diagram of the Human Brain as a Model for Artificial Intelligence. Proceedings of the IEEE International Conference on Cybernetics and Society, Washington, DC, Sept. 1977, pp. 694-698.

5. Minsky, M. L. A Framework for Representing Knowledge. The Psychology of Computer Vision, P. H. Winston (Ed.), McGraw-Hill, 1975, pp. 211-277.

6. Barrow, H. G. and J. M. Tenenbaum. Recovering Intrinsic Scene Characteristics from Images. Hanson and Riseman (Eds.), 1978, pp. 3-26.

7. Pratt, W. K. Digital Image Processing. Wiley, 1978, pp. 568-587.

8. Reddy, R. and A. Newell. Image Understanding: Potential Approaches. ARPA Image Understanding Workshop, Washington, DC, 1975.

9. Tenenbaum, J. M., et al. Prospects for Industrial Vision. Computer Vision and Sensor-Based Robots, G. G. Dodd and L. Rossol (Eds.), Plenum Press, 1979, pp. 239-258.

10. Hubel, D. H. and T. N. Wiesel. Brain Mechanisms of Vision. Scientific American, Vol. 241, No. 3, Sept. 1979, pp. 150-163.

11. Cohen, P. R. and E. A. Feigenbaum. Vision. Handbook of Artificial Intelligence, Vol. 2, W. Kaufmann, 1982, pp. 125-321.

12. Fu, K. S., et al. Robotics: Control, Sensing, Vision and Intelligence. McGraw-Hill, 1987.

13. Gevarter, W. B. An Overview of Computer Vision. NBSIR 82-2582, National Bureau of Standards, Washington, DC, Sept. 1982.
Figure 1. Framework for early and intermediate steps in a theory of visual information processing. (Block labels: raw image; primal sketch; disparities in a stereo pair; disparities in successive images; stereopsis; structure from motion; local intensity gradients; surface orientation from shading; constraints on local orientation; local depth; 2-1/2 D sketch; intensity representations; visible surface representations.)
Figure 2. Truck (a) image and (b) edges detected by the Nevatia-Babu edge detector.
Figure 3. Example of a 2-1/2 D sketch.
Figure 4. Model-based interpretation of images. (Block labels: world, physics, image, scene description.)
Figure 5. Hierarchical bottom-up approach. (Block labels: image, feature extraction, features, symbolic representation, description.)
Figure 6. Hierarchical top-down approach. (Block labels: image, feature extraction, image features, description-to-symbol mapping, symbolic representation, image symbols, semantic interpretation, test symbols, trial description, description.)

Figure 7. Heterarchical approach. (Block labels: image, features, symbols, description, visual models, semantic interpretation, feature control, symbol control.)
Figure 8. Blackboard approach.
Figure 9. Organization of a visual system. (Block labels: sensor; low level stage - iconic, domain independent, data driven; intrinsic images - distance, orientation, reflectance, and so on; high level stage - symbolic, domain specific, goal driven.)
APPENDIX A EXAMPLES OF APPLICATIONS OF COMPUTER VISION NOW UNDER WAY
Automation of industrial processes
    Object acquisition by robot arms (e.g., sorting or packing items arriving on conveyor belts).
    Automatic guidance of seam welders and cutting tools.
    VLSI (very large scale integration)-related processes, such as lead bonding, chip alignment, and packaging.
    Monitoring, filtering, and thereby containing the flood of data from oil drill sites or from seismographs.
    Providing visual feedback for automatic assembly and repair.

Inspection tasks
    Inspection of printed circuit boards for spurs, shorts, and bad connections.
    Checking the results of casting processes for impurities and fractures.
    Screening medical images such as chromosome slides, cancer smears, x-ray and ultrasound images, and tomography.
    Routine screening of plant samples.
    Inspection of alphanumerics on labels and manufactured items.
    Checking packaging and contents in pharmaceutical and food industries.
    Inspection of glass items for cracks, bubbles, etc.

Remote sensing
    Cartography, automatic generation of hill-shaded maps, and registration of satellite images with terrain maps.
    Monitoring traffic along roads, docks, and at airfields.
    Management of land resources such as water, forestry, soil erosion, and crop growth.
    Detecting mineral ore deposits.

Making computer power more accessible
    Management information systems that have a communication channel considerably wider than current systems that are addressed by typing or pointing.
    Document readers (for those who still use paper).
    Design aids for architects and mechanical engineers.

Military applications
    Tracking moving objects.
    Automatic navigation based on passive sensing.
    Target acquisition and range finding.

Aids for the partially sighted
    Systems that read a document and speak what they read.
    Automatic "guide dog" navigation systems.