Three Dimensional Model Building in Computer Vision with Orthodontic Applications

Elsayed E. Hemayed, Sameh M. Yamany, Aly A. Farag

TR-CVIP 96 Nov. 1996

Contents

1 Introduction
2 Stereo Vision
  2.1 The Perspective Projection Process
      2.1.1 The Pinhole Camera Model
      2.1.2 The Perspective Transformation Matrix
  2.2 Camera Calibration
      2.2.1 Estimating Matrix T
      2.2.2 Estimating Camera Parameters from T
      2.2.3 Practical Approach
  2.3 3D Model From Stereo
      2.3.1 MPG
      2.3.2 The Rule-Based Stereo Vision System
      2.3.3 The Stereo Vision Integrated Approach
3 3D Model from Laser Scanned Data
4 Orthodontic Applications
  4.1 3D model building of a jaw
  4.2 3D Tooth Separation and Recognition

List of Figures

1  Overview of the 3D model builder.
2  A schematic diagram for the pinhole camera model.
3  A simple case of the perspective projection process.
4  The perspective projection in a general world coordinates system.
5  The perspective projection in a general coordinates system; world and image.
6  A general case of the perspective projection process.
7  Camera calibration (practical approach).
8  Block diagram of the MPG algorithm.
9  Block diagram of building the 3D model from a scanner output.
10 The current method of casting the jaw impression.
11 An overview of the dentist station.
12 Building the 3D model of a jaw from a scanner output.
13 Model of the jaw; (left) wireframe, (right) rendered model.
14 Examples of normal groups.
15 S_2 and the corresponding probability distributions.
16 S_jaw and S_jaw^*.
17 S_3^* and the corresponding probability distributions.
18 The processed jaw after segmentation.
19 The different normal groups of S_3^*.
20 Tooth separation using the probability distribution of S_5.
21 3D tooth separation and recognition.

1 Introduction

The 3D model builder (Figure 1) consists of three phases: data acquisition, data preprocessing, and surface reconstruction. The data acquisition phase provides the computer with information about the physical object. The input to this phase can come from four different sources: stereo vision, shape from shading, a 3D laser digitizer, or Computerized Tomography (CT). The data preprocessing phase is specific to each technique and facilitates the subsequent surface reconstruction. In a stereo vision system, features are extracted from a sequence of images and used in the surface fitting phase. Shape from shading estimates the depth of the image pixels based on their grey levels. The data obtained from the laser digitizer contain redundant information that has to be eliminated. The CT slices are segmented to mark the object that needs to be reconstructed. The third phase of the 3D model builder is to fit a surface to the processed data. This phase is known in the computer vision field as triangulation or surface fitting. In the shape from shading technique, one may use multiple views of the same object to obtain a complete description of the surface. In order to get a 3D model of the whole object, the different views of the object are registered. This registration can be applied either to the images or to the 3D model of each view.

[Figure 1 shows the 3D model builder pipeline: the physical object is captured through one of four data acquisition routes (stereo vision system, shape from shading, 3D laser digitizer, CT); each route has its own data preprocessing step (feature extraction and matching, depth estimation, data reduction, image segmentation); surface reconstruction (surface fitting, triangulation, contour extraction, registration of multiple views) then produces the 3D model.]

Figure 1: Overview of the 3D model builder.

In this report, we present two approaches to model building: stereo vision and the 3D laser digitizer (scanner). Two sections of the report are dedicated to these two approaches, followed by a section on orthodontic applications. Finally, references are listed.

2 Stereo Vision

One way in which humans perceive depth is through a process called binocular stereopsis, or stereo vision. Stereo vision uses the images viewed by each eye to recover depth information in a scene. A point in the scene is projected onto different locations in each eye; the difference between the two locations is called the disparity. Using the geometric relationship between the eyes and the computed disparity value, the depth of the scene point can be calculated. Stereo vision, as used in computer systems, is similar. Before discussing stereo vision itself (the 3D model from stereo), we describe how the imaging process works in a camera, which is referred to as perspective projection. Camera calibration, which yields the parameters involved in the perspective projection, is also described. Finally, the 3D model from stereo is presented.
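As a concrete illustration of this relationship (my own sketch, assuming the textbook case of two identical cameras with parallel optical axes and a known baseline B, none of which is specified in this report), depth follows from disparity by a single division:

    def depth_from_disparity(d, f, B):
        """Depth of a scene point seen by two parallel cameras.

        d : disparity (same units as f, e.g. pixels)
        f : focal length in pixels
        B : baseline between the optical centers (e.g. mm)
        """
        if d == 0:
            raise ValueError("zero disparity corresponds to a point at infinity")
        return f * B / d

    # Example: f = 800 px, B = 65 mm, d = 20 px  ->  depth = 2600 mm
    print(depth_from_disparity(20.0, 800.0, 65.0))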

2.1 The Perspective Projection Process

2.1.1 The Pinhole Camera Model

This model treats the camera lens as a pinhole located at a distance f, the focal length, along the optical axis from the center of the image plane. The line drawn through the image plane's center and perpendicular to the image plane is referred to as the optical axis A. The camera's 3D location is denoted by the vector C, which represents the world coordinates of the camera's focal point (in Figure 2, C is the origin O of the world coordinates). The three world coordinate axes are denoted by X, Y, and Z, while the image plane has two axes denoted by U and V. Figure 2 shows a schematic diagram of the pinhole camera.

[Figure 2 sketches the pinhole model: an object point M = (x, y, z) projects through the optical center C onto the image point m = (u, v) on the image (sensor) plane, whose typical size is 5.0 x 3.75 cm; the focal length determines the view angle (i.e., controls the zooming), and the camera lens is treated as a pinhole.]

Figure 2: A schematic diagram for the pinhole camera model.

2.1.2 The Perspective Transformation Matrix

The perspective projection matrix is the matrix T that transforms a point M with world coordinates (x, y, z) to the corresponding point m in the image plane (u, v), where

    \tilde{m} = \tilde{T} \tilde{M}                                   (1)

In order to simplify the perspective projection process, we start with a simple case where the optical center C coincides with the origin O of the world coordinates, and the optical axis A lies along the Z-axis, as shown in Figure 3.



Figure 3: A simple case of the perspective projection process.

From Figure 3,

    \vec{r} = Cm = f \sec\theta                                        (2)
    \vec{R} = CM = z \sec\theta                                        (3)

and therefore

    \frac{\vec{r}}{f} = \frac{\vec{R}}{z}                              (4)

In component form:

    u = \frac{f x}{z}                                                  (5)
    v = \frac{f y}{z}                                                  (6)

The previous equations can be rewritten in matrix form as follows:

    \begin{pmatrix} u' \\ v' \\ w \end{pmatrix} =
    \begin{pmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}
    \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}                     (7)

where u = u'/w and v = v'/w. In matrix notation:

    \tilde{m} = \tilde{P} \tilde{M}                                    (8)

Case 1: Changing the world coordinates system
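As a quick check of Eqs. (7) and (8), the short Python sketch below (my own illustration, not code from the report) builds the simple-case matrix P for an assumed focal length and projects a world point by dividing the homogeneous result by w.

    import numpy as np

    def project_simple(point_xyz, f):
        """Project a 3D point with the simple-case perspective matrix of Eq. (7)."""
        P = np.array([[f, 0.0, 0.0, 0.0],
                      [0.0, f, 0.0, 0.0],
                      [0.0, 0.0, 1.0, 0.0]])
        M = np.append(np.asarray(point_xyz, dtype=float), 1.0)  # homogeneous coordinates
        u_p, v_p, w = P @ M
        return u_p / w, v_p / w          # u = u'/w, v = v'/w

    # A point at (10, 5, 100) with f = 50 projects to (5.0, 2.5),
    # matching u = f*x/z and v = f*y/z from Eqs. (5)-(6).
    print(project_simple((10.0, 5.0, 100.0), f=50.0))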

In the simple case we assumed a special location and orientation of the world coordinates, which is not the general situation. Therefore, we now assume general world coordinates, as follows:

1. The optical center C is not at the origin of the world coordinates but at C = (x_c, y_c, z_c).
2. The world coordinates are not parallel to the image coordinates.

To get the perspective projection matrix, we transform the general world coordinates back to the simple case and then perform the perspective projection as before. This process can be stated as:

1. Rotate the world coordinates to be parallel to the image coordinates,
2. Translate the world coordinates to the optical center C,
3. Perform the perspective projection.

In matrix notation:

    \tilde{m} = \tilde{P} \tilde{T}_3 \tilde{R} \tilde{M}                                                     (9)

where T_3 depends on the optical center C, and R depends on the relative orientation of the image and world coordinates:

    \tilde{T}_3 = \begin{pmatrix} 1 & 0 & 0 & -x_c \\ 0 & 1 & 0 & -y_c \\ 0 & 0 & 1 & -z_c \\ 0 & 0 & 0 & 1 \end{pmatrix}          (10)

    \tilde{R} = \begin{pmatrix} U_x & U_y & U_z & 0 \\ V_x & V_y & V_z & 0 \\ A_x & A_y & A_z & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}   (11)

    \tilde{P}_{new} = \tilde{P}\tilde{T}_3\tilde{R} =
    \begin{pmatrix} fU_x & fU_y & fU_z & -fx_c \\ fV_x & fV_y & fV_z & -fy_c \\ A_x & A_y & A_z & -z_c \end{pmatrix}               (12)

Figure 4: The perspective projection in a general world coordinates system.

Case 2: Changing the image coordinates system

In the simple case we assumed a special location of the image coordinates, which is not the general situation. In addition, the image coordinates should be expressed in pixels. Therefore, we now assume general image coordinates, as follows:

1. The projection of the optical center C is not at the origin of the image coordinates but at c = (u_c, v_c).
2. The image coordinates have different units on the two axes (1/k_u, 1/k_v).

To get the perspective projection matrix, we again transform to the simple case and then perform the perspective projection. This process can be stated as:

1. Perform the perspective projection.
2. Translate the image coordinates to c.
3. Scale the point coordinates.

In matrix notation:

    \tilde{m} = \tilde{S}\tilde{T}_2\tilde{P}\tilde{T}_3\tilde{R}\tilde{M}                                    (13)

where T_2 depends on the image plane center c, and S depends on the sampling intervals (1/k_u, 1/k_v):

    \tilde{S} = \begin{pmatrix} 1/k_u & 0 & 0 \\ 0 & 1/k_v & 0 \\ 0 & 0 & 1 \end{pmatrix}                     (14)

    \tilde{T}_2 = \begin{pmatrix} 1 & 0 & u_c \\ 0 & 1 & v_c \\ 0 & 0 & 1 \end{pmatrix}                       (15)

    \tilde{P}_{new} = \tilde{S}\tilde{T}_2\tilde{P}\tilde{T}_3\tilde{R} =
    \begin{pmatrix}
      fU_x/k_u + u_c A_x & fU_y/k_u + u_c A_y & fU_z/k_u + u_c A_z & -fx_c/k_u - u_c z_c \\
      fV_x/k_v + v_c A_x & fV_y/k_v + v_c A_y & fV_z/k_v + v_c A_z & -fy_c/k_v - v_c z_c \\
      A_x & A_y & A_z & -z_c
    \end{pmatrix}                                                                                             (16)
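The sketch below (an illustration I added, with made-up numeric values for f, k_u, k_v, the image center, and the camera pose) composes the matrices of Eqs. (10), (11), (14), and (15) exactly in the order of Eq. (13) and projects a world point; it is a sanity check of the construction, not code from the report.

    import numpy as np

    def projection_matrix(f, ku, kv, uc, vc, C, U, V, A):
        """Compose S*T2*P*T3*R per Eq. (13); C is the optical center, U, V, A the camera axes."""
        S = np.diag([1.0 / ku, 1.0 / kv, 1.0])                                  # Eq. (14)
        T2 = np.array([[1.0, 0.0, uc], [0.0, 1.0, vc], [0.0, 0.0, 1.0]])        # Eq. (15)
        P = np.array([[f, 0, 0, 0], [0, f, 0, 0], [0, 0, 1, 0]], dtype=float)   # Eq. (7)
        T3 = np.eye(4); T3[:3, 3] = -np.asarray(C, dtype=float)                 # Eq. (10)
        R = np.eye(4); R[0, :3], R[1, :3], R[2, :3] = U, V, A                   # Eq. (11)
        return S @ T2 @ P @ T3 @ R                                              # 3x4 matrix

    def project(T, point_xyz):
        ip, jp, w = T @ np.append(np.asarray(point_xyz, dtype=float), 1.0)
        return ip / w, jp / w

    # Assumed values: camera at (0, 0, -500) looking down +Z, square pixels, image center (256, 256).
    T = projection_matrix(f=50.0, ku=0.1, kv=0.1, uc=256.0, vc=256.0,
                          C=(0.0, 0.0, -500.0),
                          U=(1.0, 0.0, 0.0), V=(0.0, 1.0, 0.0), A=(0.0, 0.0, 1.0))
    print(project(T, (10.0, 5.0, 100.0)))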

Figure 5: The perspective projection in a general coordinates system; world and image.

The perspective projection matrix (general case)

Figure 6 shows the general setup of the perspective projection process, where both the image and world coordinates are general. The perspective projection matrix T can therefore be written in a general form as follows:

    \begin{pmatrix} i' \\ j' \\ w \end{pmatrix} =
    \begin{pmatrix} T_{11} & T_{12} & T_{13} & T_{14} \\ T_{21} & T_{22} & T_{23} & T_{24} \\ T_{31} & T_{32} & T_{33} & T_{34} \end{pmatrix}
    \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}                     (17)

In matrix notation:

    \tilde{m} = \tilde{T} \tilde{M}                                    (18)

where

1. i = i'/w and j = j'/w.
2. \tilde{M} = (x, y, z) are the world coordinates of an object point M.
3. \tilde{m} = (i, j) is the pixel location of the projected point m.
4. \tilde{T} is the 3x4 matrix called the perspective projection matrix.

Figure 6: A general case of the perspective projection process.

2.2 Camera Calibration

The camera calibration process consists mainly of two stages:

1. Estimating the matrix T (12 unknowns).
2. Estimating the intrinsic parameters (S, T_2) and the extrinsic parameters (R, T_3) from T.

2.2.1 Estimating Matrix T

Since there are 12 unknowns (the T_{ij}), we need at least six scene points whose 3D world coordinates are known and whose corresponding image coordinates can be found; each point contributes two equations, so six points generate a total of 12 equations. The equations have the form

    i_m = \frac{T_{11} x_m + T_{12} y_m + T_{13} z_m + T_{14}}{T_{31} x_m + T_{32} y_m + T_{33} z_m + T_{34}}          (19)

    j_m = \frac{T_{21} x_m + T_{22} y_m + T_{23} z_m + T_{24}}{T_{31} x_m + T_{32} y_m + T_{33} z_m + T_{34}}          (20)

To obtain a solution from this set of equations, we arbitrarily set T_{34} = 1 (this can be thought of as merely a scaling factor). Therefore, we can rewrite Eqs. (19) and (20) as:

    T_{11} x_m + T_{12} y_m + T_{13} z_m + T_{14} + T_{31}(-i_m x_m) + T_{32}(-i_m y_m) + T_{33}(-i_m z_m) = i_m       (21)

    T_{21} x_m + T_{22} y_m + T_{23} z_m + T_{24} + T_{31}(-j_m x_m) + T_{32}(-j_m y_m) + T_{33}(-j_m z_m) = j_m       (22)

Thus, given N correspondences we can write the constraints produced by Eqs. (21) and (22) in matrix form:

    \begin{pmatrix}
      x_1 & y_1 & z_1 & 1 & 0 & 0 & 0 & 0 & -i_1 x_1 & -i_1 y_1 & -i_1 z_1 \\
      0 & 0 & 0 & 0 & x_1 & y_1 & z_1 & 1 & -j_1 x_1 & -j_1 y_1 & -j_1 z_1 \\
      \vdots & & & & & & & & & & \vdots \\
      x_N & y_N & z_N & 1 & 0 & 0 & 0 & 0 & -i_N x_N & -i_N y_N & -i_N z_N \\
      0 & 0 & 0 & 0 & x_N & y_N & z_N & 1 & -j_N x_N & -j_N y_N & -j_N z_N
    \end{pmatrix}
    \begin{pmatrix} T_{11} \\ T_{12} \\ T_{13} \\ T_{14} \\ T_{21} \\ T_{22} \\ T_{23} \\ T_{24} \\ T_{31} \\ T_{32} \\ T_{33} \end{pmatrix}
    =
    \begin{pmatrix} i_1 \\ j_1 \\ \vdots \\ i_N \\ j_N \end{pmatrix}                     (23)

A shorter form is

    A U = B                                                                              (24)

where A is the 2N x 11 matrix shown above, U is the vector of unknowns, and B is the vector of known pixel coordinates. If N is greater than or equal to six, a solution can be determined; in general the system is overdetermined, and the solution that minimizes the squared error is used. This solution has the form

    U = (A^T A)^{-1} A^T B                                                               (25)

The solution exists only if A^T A is invertible, which is the case when A has linearly independent columns. Consequently, the world points (x_i, y_i, z_i) used in estimating T should not all lie on a single line.
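The estimation of T thus reduces to an ordinary linear least-squares problem. The following Python sketch (my own illustration; variable names and the use of numpy.linalg.lstsq, which is numerically preferable to forming (A^T A)^{-1} explicitly, are my choices, not the report's) builds A and B from point correspondences and solves Eq. (24).

    import numpy as np

    def estimate_T(world_pts, image_pts):
        """Estimate the 3x4 perspective projection matrix with T34 fixed to 1.

        world_pts : (N, 3) array of known 3D points (x, y, z), N >= 6
        image_pts : (N, 2) array of corresponding pixel coordinates (i, j)
        """
        world_pts = np.asarray(world_pts, dtype=float)
        image_pts = np.asarray(image_pts, dtype=float)
        rows, rhs = [], []
        for (x, y, z), (i, j) in zip(world_pts, image_pts):
            # One row per Eq. (21) and one per Eq. (22)
            rows.append([x, y, z, 1, 0, 0, 0, 0, -i * x, -i * y, -i * z])
            rows.append([0, 0, 0, 0, x, y, z, 1, -j * x, -j * y, -j * z])
            rhs.extend([i, j])
        A, B = np.array(rows), np.array(rhs)
        U, *_ = np.linalg.lstsq(A, B, rcond=None)   # least-squares solution, Eq. (25)
        return np.append(U, 1.0).reshape(3, 4)      # reinsert T34 = 1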

2.2.2 Estimating Camera Parameters from T

Using the matrix T, the camera parameters can be calculated. The relationships between the elements of T and the parameters can be shown to be:

    T_{11} = \frac{(f/k_u) U_x + i_0 A_x}{-C \cdot A}
    T_{12} = \frac{(f/k_u) U_y + i_0 A_y}{-C \cdot A}
    T_{13} = \frac{(f/k_u) U_z + i_0 A_z}{-C \cdot A}
    T_{14} = -\frac{(f/k_u)\, C \cdot U + i_0\, C \cdot A}{-C \cdot A}

    T_{21} = \frac{(f/k_v) V_x + j_0 A_x}{-C \cdot A}
    T_{22} = \frac{(f/k_v) V_y + j_0 A_y}{-C \cdot A}
    T_{23} = \frac{(f/k_v) V_z + j_0 A_z}{-C \cdot A}
    T_{24} = -\frac{(f/k_v)\, C \cdot V + j_0\, C \cdot A}{-C \cdot A}

    T_{31} = \frac{A_x}{-C \cdot A}
    T_{32} = \frac{A_y}{-C \cdot A}
    T_{33} = \frac{A_z}{-C \cdot A}
    T_{34} = 1

These equations have all been scaled by T_{34}. To recover the parameters from the above equations, the method developed by Ganapathy [2] can be used.

2.2.3 Practical Approach

The practical approach to camera calibration can be summarized as follows:

1. The camera to be calibrated acquires a picture of the calibration pattern.
2. A polygonal approximation of the contours is computed, the corners are automatically extracted from this approximation, and their pixel coordinates are computed.
3. Solve U = (A^T A)^{-1} A^T B for the T_{ij}.

Figure 7 shows four images of the camera calibration. The first image shows the setup used in calibrating two cameras. The other three images show the calibration pattern, the approximated polygons, and some of the extracted corners that can be used in calibrating the camera.
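The report extracts corners from a polygonal approximation of the pattern contours; as a rough modern stand-in (not the report's method), the sketch below uses OpenCV's checkerboard corner detector, assuming the calibration pattern is a planar checkerboard whose inner-corner grid size is known.

    import cv2

    def find_pattern_corners(image_path, grid=(7, 5)):
        """Detect and refine checkerboard inner corners; returns an (N, 2) array of pixel coords."""
        gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, grid)
        if not found:
            raise RuntimeError("calibration pattern not found")
        # Refine to sub-pixel accuracy before feeding the points to estimate_T()
        criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
        corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
        return corners.reshape(-1, 2)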

2.3 3D Model From Stereo

The idea behind the stereo vision process is that if two objects are separated in depth from a viewer, then the relative positions of their images will differ in the two eyes. The process of stereo vision, in essence, measures this difference in relative positions, called the disparity, and uses it to compute depth information for surfaces in the scene. Furthermore, a disparity map can be obtained by applying the stereo vision process over the whole images. This disparity map can be used as a representation of the surfaces in the scene; a 3D model representation is then obtained by fitting a surface to the disparity map. Having the 3D model, shaded images can be generated for different views, and a rapid prototype can be produced for better visualization. In general, stereo algorithms include the following steps:

1. Features are located in each of the two images independently.
2. Features from one image are matched with features from the other image.
3. The disparity between matched features is computed and used as a representation of the depth, which has to be computed at every point in the scene. The resulting depth points are interpolated to obtain a surface.

Many algorithms have been implemented based on this common approach. These algorithms differ in two essential issues. The first issue is the type of features they use: high-level, low-level, or hybrid features. The second issue is the disambiguation of multiple matches: researchers have used different types of constraints to select the correct match of a feature from among the candidate matches. In this report, three different stereo systems are chosen (other stereo systems will be considered later): MPG [3], Tanaka and Kak [4], and Hoff and Ahuja [6]. These systems are considered milestones in the stereo field and are chosen as representative of different techniques in stereo algorithms. The three techniques are discussed in the following subsections.


Figure 7: Camera calibration (practical approach). The four panels show the setup, the calibration pattern, the approximated polygons, and the extracted corners.

2.3.1 MPG

In 1981, W. Grimson presented an implementation of the human stereo vision theory that was developed by Marr and Poggio (1979). This implementation is known as the Marr-Poggio-Grimson (MPG) algorithm. The features used by the MPG algorithm are individual edge points (known as zero-crossings), which are low-level features. The MPG algorithm (Fig. 8) includes five steps:

1. LOG Filtering: The left and right images are each filtered with masks of four sizes that increase with eccentricity; the shape of these masks is given by \nabla^2 G, the Laplacian of a Gaussian function.
2. Extracting Zero-Crossings: Zero-crossings in the filtered images are found along horizontal scan lines.
3. Matching Zero-Crossings: For each mask size, matching takes place between zero-crossings of the same sign and roughly the same orientation in the two images, for a range of disparities up to about the width of the mask's central region.
4. Vergence Control: The output of the wide masks is used to bring the small masks into correspondence.
5. The 2.5D Sketch: The disparities of the finest channel are stored in a dynamic buffer, called the 2.5D sketch.

[Figure 8 diagrams the MPG pipeline: each eye's image is processed by four channels (mask widths w = 35, 17, 9, 4), zero-crossings and their attributes are extracted per channel, the channels are matched from coarse to fine with vergence control between them, and the result is the disparity map.]

Figure 8: Block diagram of the MPG algorithm.

More details about the MPG algorithm are provided in the following paragraphs.

Step 1: LOG Filtering:

The LOG is a smoothed second derivative of the image signal. The LOG operator assumes the following form:

    \nabla^2 G(x, y) = \frac{1}{\sigma^2}\left(\frac{x^2 + y^2}{\sigma^2} - 2\right) \exp\left(\frac{-(x^2 + y^2)}{2\sigma^2}\right)          (26)

where \nabla^2 is the Laplacian, \nabla^2 = (\partial^2/\partial x^2) + (\partial^2/\partial y^2), and G(x, y) is the Gaussian function, which acts as a low-pass filter on the image:

    G(x, y) = \exp\left(\frac{-(x^2 + y^2)}{2\sigma^2}\right)                                                                                 (27)

Step 2: Extracting Zero Crossing:

The detection of zero-crossings can be accomplished by scanning the convolved image horizontally for adjacent elements of opposite sign, or for three horizontally adjacent elements where the middle one is zero and the other two contain convolution values of opposite sign. This gives the position of a zero-crossing to within one image element. The sign is considered positive if the convolution values change from negative to positive from left to right, and negative if they change from positive to negative. In addition, a rough estimate of the local orientation of each zero-crossing is recorded, in increments of 30 degrees. The orientation is computed as the direction of the gradient of the convolution values.
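A compact way to reproduce steps 1 and 2 is sketched below in Python (my own illustration; it uses scipy's Gaussian-Laplace filter rather than the explicit masks of Eq. (26), and records only the sign of each horizontal zero-crossing, not its orientation).

    import numpy as np
    from scipy.ndimage import gaussian_laplace

    def log_zero_crossings(image, sigma):
        """Return (rows, cols, signs) of horizontal zero-crossings in the LoG-filtered image."""
        filtered = gaussian_laplace(image.astype(float), sigma)   # step 1: LoG filtering
        left, right = filtered[:, :-1], filtered[:, 1:]
        crossing = np.sign(left) * np.sign(right) < 0             # adjacent elements of opposite sign
        rows, cols = np.nonzero(crossing)
        # Positive sign: values go from negative to positive, left to right
        signs = np.where(right[rows, cols] > left[rows, cols], 1, -1)
        return rows, cols, signs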

Step 3: Matching Zero Crossing:


Assuming that the average disparity d_av in the image is known, the following steps are performed to find matches for a left-image zero-crossing point p = (x_l, y_l):

1. Search along the y_l scan line of the right image, in a 1D window of width W_{2D} centered at the point (x_l + d_av, y_l), for candidate zero-crossing matches.
2. If a zero-crossing in the window is of the same sign and orientation as the left-image zero-crossing in question, then it produces a match. The difference between the two x locations, (x_l - x_r), which is defined as the disparity, is stored in a dynamic buffer.
3. Based on the matching process, the left-image zero-crossing is marked as: a unique match, if only one right-image zero-crossing is matched with the left one; multiple matches, if more than one match is found; or no match, if no match is found (vergence control will handle this case later).

The case of multiple matches is disambiguated by scanning a neighborhood about the point in question and recording the sign of the disparity of the unambiguous (unique) matches within that neighborhood. If the ambiguous point has a potential match of the same sign as the dominant type within the neighborhood, that candidate is chosen as the match. In Grimson's implementation, for the W_{2D} = 9 channel a neighborhood of size 25x25 is examined; if less than 70% of the points in this region have matches, the region is considered to be out of the range of fusion for this channel and no disparity values are accepted for that region.
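The matching rule of steps 1-3 can be written down directly; the sketch below (illustrative Python I added, operating on zero-crossing lists like those produced above and ignoring vergence control) marks each left-image zero-crossing as unique, multiple, or unmatched.

    def match_row(left_zc, right_zc, d_av, window):
        """Match zero-crossings on one scan line.

        left_zc, right_zc : lists of (x, sign, orientation) tuples on the same row
        d_av              : assumed average disparity for this neighborhood
        window            : search window width (about the mask's central width)
        Returns a dict mapping left x -> list of candidate disparities (x_l - x_r).
        """
        matches = {}
        for xl, sign_l, orient_l in left_zc:
            center = xl + d_av
            candidates = [xl - xr for xr, sign_r, orient_r in right_zc
                          if abs(xr - center) <= window / 2
                          and sign_r == sign_l and orient_r == orient_l]
            matches[xl] = candidates   # len 1: unique, >1: multiple, 0: no match
        return matches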

Step 4: Vergence Control:

As a result of performing more low-pass filtering in the coarser channels than in the finer channels, small variations in the intensity disappear in the coarser channels, so very accurate disparity calculations are not possible there. In a finer channel, the positions of the zero-crossings are more accurate since the intensity variations are not smeared as much. However, a correct value of d_av for the point in question is more critical for finer channels because the search window is smaller than the one used by the coarser channels. This leads to the concept of vergence control: the disparities from the coarser channels are used to calculate the average disparity in a neighborhood of the left-image point in question, and this average is then used to shift the search window to the appropriate position in the right image for matching to occur.

Step 5: The 2.5D Sketch:

The disparity of the finest channel, calculated for each matched pair of zero-crossings, is stored in a buffer commonly called a disparity map, or the 2.5D sketch. If there exists more than one match at a zero-crossing point, the average of the corresponding disparities is stored. Results of the MPG system are provided in the appendix.

2.3.2 The Rule-Based Stereo Vision System

Tanaka and Kak [4] present a hierarchical stereo vision algorithm that produces a dense disparity map. The rule-based algorithm combines low-level processing (e.g., zero-crossings) and high-level processing (e.g., straight line segments and planar patches). An overview of the algorithm is as follows:

1. Straight Line Matching: Straight line segments of length S pixels are extracted from each image and represented by a chain code. The lines from the left and right images are matched based on their initial points and the similarity between their chain codes.
2. Application of Geometric Constraints: A 16-pixel-wide region around the matched lines is fitted to a plane of orientation \theta, where \theta belongs to a finite set of assumed orientations. The planar strip (\theta) that leads to the largest number of zero-crossing matches is regarded as the best fit, and the corresponding disparity values are stored.
3. Curve Segment Matching: Zero-crossing contours of length Z are extracted from each image and represented by a chain code. These contours are matched in a similar way to the straight lines. Another matching process is then applied to the individual zero-crossing points that did not match.
4. Zero-Crossing Point Matching: Finally, the full MPG algorithm is applied to those zero-crossing points that still did not match.

An implementation of the rule-based system can be found in [7].

2.3.3 The Stereo Vision Integrated Approach

Hoff and Ahuja [6] describe a system that integrates low-level feature-based matching with surface reconstruction at every point in the image. An overview of the algorithm involves five steps:

1. Feature Extraction: The left and right images are each convolved with the \nabla^2 G operator. Zero-crossings are then detected and labeled with the orientation of the gradient.
2. Fitting Planar Patches: Planar patches are fitted in circular image regions centered at each point of a regular grid, with a radius in the range of W_{2D} to 2W_{2D}. The spacing of this grid is W_{2D}, the filter size. The two best-fit planes are kept.
3. Fitting Quadratic Patches: For each grid point, a quadratic patch is fitted to the points of the planar patches at this grid point and the planar patches of neighboring grid points. The quadratic patch with the minimum squared error is taken as the best-fitting quadratic patch.
4. Locating Contours: The occluding and ridge contours in the scene are detected by fitting bipartite circular planar patches at each grid point. A bipartite patch is a circular region with two independent smoothed (planar) halves, separated by a depth discontinuity at the center.
5. Generating a Surface Map: To interpolate the depth at each point P on the surface, the closest patch or patches to P that do not lie across any occluding or ridge contour are used.

An implementation of the integrated approach is under development. An investigation of the three systems described above can be found in [8].


3 3D Model from Laser Scanned Data

The idea of the 3D laser digitizer (scanner) is to sweep the whole object with a laser beam. The reflectance of the laser beam depends on the depth and orientation of the object surface. A CCD array is used to convert this reflectance into x, y, z coordinates. The scanner has many degrees of freedom so that it can capture all of the object's details. The scanning process is repeated for different orientations, and a registration technique [11], [12] is used to register these different scans and obtain a single surface description of the scanned object. This surface description is a huge file that describes the object as points (x, y, z) and is therefore referred to as a 'cloud of data'. The cloud of data obtained from the scanner contains many redundancies, so the data are processed to eliminate them and obtain a small but sufficient description of the object to work with. Given this reduced data set, a triangle mesh is fitted to obtain the 3D model. Different techniques have been developed to fit a surface to a cloud of data, among them [9] and [10]. A block diagram (Fig. 9) describes the process of building the 3D model from the scanner output.
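The report does not specify how the redundancy elimination is done; one simple illustrative possibility (my own sketch, not the report's method) is voxel-grid thinning, which keeps a single representative point per cubic cell of a chosen size.

    import numpy as np

    def thin_cloud(points, cell_size):
        """Reduce a cloud of (x, y, z) points by keeping the centroid of each occupied voxel.

        points    : (N, 3) array read from the scanner output file
        cell_size : edge length of the cubic cells, in the same units as the points
        """
        points = np.asarray(points, dtype=float)
        keys = np.floor(points / cell_size).astype(np.int64)      # voxel index of every point
        _, inverse = np.unique(keys, axis=0, return_inverse=True)
        thinned = np.zeros((inverse.max() + 1, 3))
        counts = np.bincount(inverse).astype(float)
        for dim in range(3):                                       # centroid per voxel
            thinned[:, dim] = np.bincount(inverse, weights=points[:, dim]) / counts
        return thinned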

Y

Z

428.001404 839.168579 -469.638611 -199.836731 900.001526 -370.589996 103.332520 997.833252 -291.755402 -606.498718 874.832153 -409.445801 318.168640 550.334961 -4.492000 158.332825 256.832123 -293.327606 460.998535 555.000305 -46.042999

Data Thinned and Redundancies Eliminated

3D Model

3D Solid Model A 3D surface is rendered

Physical Model

To The Rapid Prototyping Machine

Figure 9: Block diagram of building the 3D model from a scanner output.

14

A triangle mesh is fitted to the cloud of data

HUGE CLOUD of DATA

4 Orthodontic Applications

In this section we present results of applying the above scanning approaches in the orthodontic field, specifically to the process of casting the jaw impression and to separating individual teeth. Currently, the jaw impression is cast manually in an unpleasant way (Fig. 10). A 3D model building approach can be used to perform this process in a cleaner and more comfortable way. Once the 3D model is obtained, dental procedures such as teeth alignment can be simulated using computer vision tools. The dentist station can be considered a complete orthodontic system built on computer vision tools; an overview is shown in Figure 11. The first phase of the dentist station is the 3D model builder. Its main task is to capture the physical object information using data acquisition hardware and convert this information into a 3D model that can be manipulated by computer vision software tools. The second phase of the dentist station is the dental surgery simulator. This simulator enables the dentist to perform a dental surgery on the 3D model of the jaw and visualize the results. The output of the simulator is the sequence of the simulated operation, such as a tooth movement or tooth alignment. For better visualization, the modified model of the jaw can be transferred to a rapid prototyping machine to build a physical object.

Figure 10: The current method of casting the jaw impression.

4.1 3D model building of a jaw

Preliminary results for the 3D model built from laser scanned data are shown in Figure 12.

[Figure 11 sketches the dentist station: the physical object goes through the 3D model builder to produce a 3D model; data processing (supervised and unsupervised) feeds the dental surgery simulator (tooth movement, implant, ...), with visualization of the results and, optionally, a rapid prototyping machine producing a physical object.]

Figure 11: An overview of the dentist station.

[Figure 12 shows three stages: the cloud of data (digitizer output), the wireframe model, and the solid model.]

Figure 12: Building the 3D model of a jaw from a scanner output.

4.2 3D Tooth Separation and Recognition

In this section we describe a novel method for capturing the position of each tooth and extracting it from a 3D model of a jaw. Figure 13 shows the model obtained for the lower part of the jaw. All the analysis described in this section is carried out on that model; similar arguments can be made for the upper part of the jaw.

Figure 13: Model of the jaw; (left) wireframe, (right) rendered model.

The wireframe model in Figure 13 is formed of N small connected triangles; each triangle is defined by its three points p_1, p_2, p_3 and the normal to the triangle plane n. We therefore denote each triangle as s_i(p_1, p_2, p_3, n), where i \in [1, N]. Using this description of the wireframe triangles, we can define the 3D model of the jaw as

    S_{jaw} = \bigcup_{i=1}^{N} s_i(p_1, p_2, p_3, n)                                       (28)

Now we can divide the triangles forming the jaw into k groups according to the direction of the normal n:

    S_{jaw} = (S_1, S_2, S_3, \ldots, S_k)                                                  (29)

where

    S_k = \bigcup_{n \in N_k} s_i(p_1, p_2, p_3, n)                                         (30)

Figure 14 shows some of these groups. Note that N_k may contain normals with opposite directions. The idea behind this is that we are only interested in the changes between the normals of adjacent triangles in the model, and two adjacent triangle planes are unlikely to have opposite normals. Note also that S_k depends on the orientation of the mold during scanning. However, due to the wide range of normal directions in each N_k, a small deviation in the orientation of the mold during scanning is compensated for; large deviations could still cause problems for our algorithm, as discussed later. The next step in our algorithm is to consider S_2 (N_2 is the group of normals pointing upwards). We obtain the probability density distributions of the points forming the triangles of S_2 in the three directions x, y, and z, and denote them P_x(S_2), P_y(S_2), and P_z(S_2), respectively. Figure 15 shows the three probability distributions.
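A minimal sketch of this grouping and histogram step is given below (my own Python illustration; the triangle representation, the set of axis-aligned group directions, the angular threshold, and the histogram bin width are assumptions, since the report does not state how the groups N_k or the distributions are computed).

    import numpy as np

    AXES = {"up": (0, 0, 1), "front": (0, 1, 0), "side": (1, 0, 0)}   # assumed group directions

    def group_triangles(triangles, axis, cos_thresh=0.7):
        """Select triangles whose normal is roughly parallel (or anti-parallel) to `axis`.

        triangles : list of (p1, p2, p3, n) with 3-vectors; n need not be unit length
        """
        a = np.asarray(AXES[axis], dtype=float)
        group = []
        for p1, p2, p3, n in triangles:
            n = np.asarray(n, dtype=float)
            c = np.dot(n, a) / (np.linalg.norm(n) * np.linalg.norm(a))
            if abs(c) >= cos_thresh:          # opposite normals land in the same group
                group.append((p1, p2, p3, n))
        return group

    def coordinate_histogram(group, dim, bin_width=1.0):
        """P_x, P_y, or P_z of a group: histogram of triangle-vertex coordinates along `dim`."""
        coords = np.array([p[dim] for p1, p2, p3, _ in group for p in (p1, p2, p3)])
        bins = np.arange(coords.min(), coords.max() + bin_width, bin_width)
        counts, edges = np.histogram(coords, bins=bins)
        return counts, edges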

Figure 14: Examples of normal groups.

As one can observe, significant information can be extracted from each of the three distributions using standard signal analysis tools, but P_z(S_2) is the one we need to investigate further. We can notice three important lobes in this distribution. To understand the importance of each one, we need to understand what S_2 actually represents. S_2 is the part of the jaw model whose triangle normals all point upward. These triangles are concentrated mainly in three regions of the jaw:

Region 1. The upper surfaces of the teeth.
Region 2. The upper surfaces of the lower roof of the mouth.
Region 3. The upper part of the gum near the insertion of each tooth.

The latter is the most important region for us, and the three regions can be seen clearly in P_z(S_2). Using averaging filters we can detect the end of region 3 and determine the corresponding z coordinate, denoted z^*. Using z^* we can now define

    S_{jaw}^* = S_{jaw} \setminus \{\, s_i(p_1, p_2, p_3, n) \mid p_{1z}, p_{2z}, p_{3z} < z^* \,\}                      (32)

where p_{nz} is the z coordinate of the point p_n. Figure 16 shows the new model S_jaw^*.

Figure 15: S_2 and the corresponding probability distributions.

Now let us look at S_3 (N_3 is the group of normals pointing to the front). Using z^* again, we obtain

    S_3^* = S_3 \setminus \{\, s_i(p_1, p_2, p_3, n) \mid p_{1z}, p_{2z}, p_{3z} < z^* \,\}                               (33)

We then obtain the three probability distributions P_x(S_3^*), P_y(S_3^*), and P_z(S_3^*) (see Fig. 17). Again, using P_x(S_3^*) we can identify three regions of the density distribution; let us denote by x_R and x_L the resolution points obtained from this distribution. Using x_R and x_L,

we define (Fig. 18)

    S_R^* = \{\, s_i(p_1, p_2, p_3, n) \mid s_i \in S_{jaw}^*,\; p_{1x}, p_{2x}, p_{3x} > x_R \,\}                        (34)
    S_L^* = \{\, s_i(p_1, p_2, p_3, n) \mid s_i \in S_{jaw}^*,\; p_{1x}, p_{2x}, p_{3x} < x_L \,\}                        (35)
    S_M^* = \{\, s_i(p_1, p_2, p_3, n) \mid s_i \in S_{jaw}^*,\; x_L < p_{1x}, p_{2x}, p_{3x} < x_R \,\}                  (36)

Figure 16: S_jaw and S_jaw^*.

We can also define (Fig. 19)

    S_{3R}^* = \{\, s_i(p_1, p_2, p_3, n) \mid s_i \in S_3^*,\; p_{1x}, p_{2x}, p_{3x} > x_R \,\}                         (37)
    S_{3L}^* = \{\, s_i(p_1, p_2, p_3, n) \mid s_i \in S_3^*,\; p_{1x}, p_{2x}, p_{3x} < x_L \,\}                         (38)
    S_{3M}^* = \{\, s_i(p_1, p_2, p_3, n) \mid s_i \in S_3^*,\; x_L < p_{1x}, p_{2x}, p_{3x} < x_R \,\}                   (39)

Proceeding in the same manner with S_5 (N_5 is the group of normals pointing to the sides), we can clearly separate the teeth; see Fig. 20. Results of the tooth separation are shown in Figure 21.
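The cuts of Eqs. (32)-(36) are simple coordinate thresholds on the triangle vertices; the sketch below (my own illustration, reusing the triangle representation from the earlier sketch; the threshold values themselves come from analyzing the histograms, which is not automated here) applies them.

    import numpy as np

    def coords(tri, dim):
        """The three vertex coordinates of triangle (p1, p2, p3, n) along axis `dim`."""
        p1, p2, p3, _ = tri
        return np.array([p1[dim], p2[dim], p3[dim]])

    def remove_below(triangles, z_star):
        """Eq. (32): drop triangles whose vertices all lie below z* (the gum region)."""
        return [t for t in triangles if not np.all(coords(t, 2) < z_star)]

    def split_left_middle_right(triangles, x_left, x_right):
        """Eqs. (34)-(36): split the remaining model at the cut points x_L and x_R."""
        left = [t for t in triangles if np.all(coords(t, 0) < x_left)]
        right = [t for t in triangles if np.all(coords(t, 0) > x_right)]
        middle = [t for t in triangles
                  if np.all((coords(t, 0) > x_left) & (coords(t, 0) < x_right))]
        return left, middle, right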

Figure 17: S_3^* and the corresponding probability distributions.

Figure 18: The processed jaw after segmentation


Figure 19: The different normal groups of S_3^*.


[Figure 20 plots the y-coordinate histogram of S_5; its valleys separate teeth 1 through 5.]

Figure 20: Tooth separation using the probability distribution of S_5.

[Figure 21 shows four panels: the 3D model of a jaw, the 3D model after removing the gum, the different clusters in the model, and the 3D models of 5 different teeth.]

Figure 21: 3D tooth separation and recognition.


References

[1] L. L. Grewe and A. C. Kak, "Stereo Vision," in Handbook of Pattern Recognition and Image Processing: Computer Vision, chapter 8, Academic Press, 1994.
[2] S. Ganapathy, "Decomposition of transformation matrices for robot vision," Proc. IEEE Int. Conf. on Robotics and Automation, pp. 130-139, 1984.
[3] W. E. L. Grimson, "Computational experiments with a feature based stereo algorithm," IEEE Trans. Pattern Analysis and Machine Intelligence, PAMI-7(1):17-34, 1985.
[4] S. Tanaka and A. C. Kak, "A rule-based approach to binocular stereopsis," Technical Report TR-EE-88-33, School of Electrical Engineering, Purdue University, 1988.
[5] S. Tanaka and A. C. Kak, "A rule-based approach to binocular stereopsis," in Analysis and Interpretation of Range Images (R. C. Jain and A. K. Jain, eds.), chapter 2, Springer-Verlag, Berlin, 1990.
[6] W. Hoff and N. Ahuja, "Extracting surfaces from stereo images: An integrated approach," Technical Report UILU-ENG-87-2204, University of Illinois, 1987.
[7] A. Sandbek, "3D model reconstruction from stereo using hybrid features," M.Eng. thesis, University of Louisville, Louisville, Kentucky, 1996.
[8] E. E. Hemayed, A. Sandbek, and A. A. Farag, "An investigation study of three different stereo systems," to appear in Proc. SPIE, 3D Image Capture, San Jose, Feb. 1997.
[9] M. Agishtein and A. Migdal, "Smooth surface reconstruction from scattered data points," Computers and Graphics, vol. 15, no. 1, pp. 29-39, 1991.
[10] T.-P. Fang and L. A. Piegl, "Delaunay triangulation in 3D," IEEE Computer Graphics and Applications, pp. 62-69, 1995.
[11] A. C. Arbaugh, "Intelligent manufacturing processes including model-based reverse engineering," M.Eng. thesis, University of Louisville, Louisville, Kentucky, 1996.
[12] S. W. Roberts, "An empirical evaluation of two iterative three-dimensional surface registration algorithms applied to an orthodontic application," M.Eng. thesis, University of Louisville, Louisville, Kentucky, 1996.
