Multiple View Geometry in Computer Vision

Prasanna Sahoo, Department of Mathematics, University of Louisville

Camera Models Lecture 8


In this lecture, we will show that a camera is a mapping from the 3D world $\mathbb{R}^3$ to a 2D image plane $\mathbb{R}^2$. This mapping can be represented by a 3 × 4 matrix P. We will examine the model for the following cameras:

• Pinhole camera

• CCD camera

• Finite projective camera

• General projective camera

Pinhole Camera

A pinhole camera is a box in which one of the walls has been pierced to make a small hole. Assuming that the hole is indeed just a point, exactly one ray from each point in the scene passes through the pinhole and hits the wall opposite to it. This results in an inverted image of the scene.

Pinhole camera

The word camera has its origins in the Latin camera and the Greek kamara, both of which refer to a room or a chamber.

The inversion of the image is an annoyance.

• However, it can be corrected by considering a virtual image of the scene on a virtual plane parallel to the imaging plane but on the opposite side of the pinhole.


Basic pinhole camera model

Let the center of projection be the origin of a Euclidean coordinate system. The plane Z = f is called the focal plane or image plane.


A point in space, $\mathbb{R}^3$, with coordinates $X = (x, y, z)^T$ is mapped to the point on the image plane where the line joining the point X to the center of projection meets the image plane. It can be easily shown that the point $(x, y, z)^T$ is mapped to the point

\[
\left( \frac{f x}{z},\; \frac{f y}{z},\; f \right)^T
\]

on the image plane. Ignoring the final image coordinate, we see that the mapping

\[
(x, y, z)^T \mapsto \left( \frac{f x}{z},\; \frac{f y}{z} \right)^T
\]

describes the central projection mapping from world to image coordinates.

• This is a mapping from $\mathbb{R}^3$ to $\mathbb{R}^2$.
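The central projection above can be sketched in a few lines of plain Python. This is only an illustration: the function name `project_pinhole` and the sample values are assumptions, not part of the lecture, and the guard for z = 0 is added because the formula divides by z.

```python
def project_pinhole(point, f):
    """Central projection (x, y, z) -> (f*x/z, f*y/z) onto the plane Z = f."""
    x, y, z = point
    if z == 0:
        # Points in the plane through the camera center have no finite image.
        raise ValueError("z must be non-zero for central projection")
    return (f * x / z, f * y / z)


# Hypothetical example: focal length 2, scene point at depth 4.
image_point = project_pinhole((1.0, 2.0, 4.0), f=2.0)
print(image_point)
```

Scaling the scene point by any non-zero factor leaves the result unchanged, which is what motivates the homogeneous representation used on the following slides.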

Some Terminology

• The center of projection is called the camera center or the optical center.

• The plane Z = f is called the focal plane or image plane.


• The line from the camera center perpendicular to the image plane is called the principal axis or principal ray of the camera.

• The point where the principal axis meets the image plane is called the principal point.

• The plane through the camera center parallel to the image plane is called the principal plane of the camera.

Central Projection Mapping

Using homogeneous coordinates, the central projection map $(x, y, z)^T \mapsto (fx/z, fy/z)^T$ can be described as

\[
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
\mapsto
\begin{pmatrix} f x \\ f y \\ z \end{pmatrix}
=
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
= \operatorname{diag}(f, f, 1)\,[\, I \mid 0 \,]\, X,
\]

where diag(f, f, 1) is a diagonal matrix and [ I | 0 ] is a matrix divided up into a 3 × 3 block (the identity matrix) plus a column vector made up of zeros.

The equation on the last slide can be written as

\[
x = P X
\]

where X denotes the world point represented by the homogeneous 4-vector $(x, y, z, 1)^T$, x the image point represented by the homogeneous 3-vector $(fx, fy, z)^T$, and P the 3 × 4 homogeneous camera projection matrix. Hence

\[
P = \operatorname{diag}(f, f, 1)\,[\, I \mid 0 \,].
\]
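As a minimal sketch of x = PX in plain Python, the snippet below builds P = diag(f, f, 1) [ I | 0 ] from its two factors and applies it to a homogeneous 4-vector. The helper names (`matmul`, `basic_camera_matrix`, `apply_camera`, `dehomogenize`) and the sample values are illustrative assumptions.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def basic_camera_matrix(f):
    """P = diag(f, f, 1) [I | 0] as a 3x4 nested list."""
    D = [[f, 0, 0], [0, f, 0], [0, 0, 1]]
    I0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
    return matmul(D, I0)


def apply_camera(P, X):
    """Multiply the 3x4 matrix P by a homogeneous 4-vector X."""
    return [sum(p * x for p, x in zip(row, X)) for row in P]


def dehomogenize(x):
    """Convert a homogeneous 3-vector to inhomogeneous image coordinates."""
    return (x[0] / x[2], x[1] / x[2])


P = basic_camera_matrix(2.0)
x = apply_camera(P, [1.0, 2.0, 4.0, 1.0])  # homogeneous image point (fx, fy, z)
print(x, dehomogenize(x))
```

Dividing by the third coordinate recovers the inhomogeneous image point (fx/z, fy/z), so the matrix form agrees with the central projection formula.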

In deriving the central projection mapping $(x, y, z)^T \mapsto (fx/z, fy/z)^T$ it was assumed that the origin of coordinates in the image plane was at the principal point. If the origin is not at the principal point, then we have

\[
(x, y, z)^T \mapsto \left( \frac{f x}{z} + p_x,\; \frac{f y}{z} + p_y \right)^T
\]

where $(p_x, p_y)^T$ are the coordinates of the principal point.

Image and camera coordinate systems


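Using homogeneous coordinates, the central projection mapping can be described as

\[
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
\mapsto
\begin{pmatrix} f x + z p_x \\ f y + z p_y \\ z \end{pmatrix}
=
\begin{bmatrix} f & 0 & p_x & 0 \\ 0 & f & p_y & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
= K\,[\, I \mid 0 \,]\, X_{\mathrm{cam}},
\]

where K is the 3 × 3 matrix

\[
K = \begin{bmatrix} f & 0 & p_x \\ 0 & f & p_y \\ 0 & 0 & 1 \end{bmatrix}
\]

and [ I | 0 ] is a matrix divided up into a 3 × 3 block (the identity matrix) plus a column vector made up of zeros.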

The equation on the last slide can be written concisely as

x = K [ I | 0 ] Xcam.

• The matrix K is called the camera calibration matrix.

• The homogeneous 4-vector $(x, y, z, 1)^T$ is written as $X_{\mathrm{cam}}$ to emphasize that the camera is assumed to be located at the origin of a Euclidean coordinate system with the principal axis of the camera pointing straight down the z-axis.
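The role of the calibration matrix can be illustrated with a short plain-Python sketch. It assumes a pinhole K with focal length f and principal point (px, py); the function names and sample values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def calibration_matrix(f, px, py):
    """K for a pinhole camera with focal length f and principal point (px, py)."""
    return [[f, 0.0, px],
            [0.0, f, py],
            [0.0, 0.0, 1.0]]


def project_camera_frame(K, X_cam):
    """Compute x = K [I | 0] X_cam and return inhomogeneous image coordinates.

    X_cam is a homogeneous 4-vector (x, y, z, 1). Multiplying by [I | 0]
    simply drops its last entry, so only the first three components meet K.
    """
    xyz = X_cam[:3]
    u, v, w = (sum(k * c for k, c in zip(row, xyz)) for row in K)
    if w == 0:
        raise ValueError("point has z = 0 and no finite image")
    return (u / w, v / w)


# Hypothetical values: f = 2, principal point (3, 1), point at depth 4.
K = calibration_matrix(f=2.0, px=3.0, py=1.0)
image_point = project_camera_frame(K, [1.0, 2.0, 4.0, 1.0])
print(image_point)
```

The result equals (fx/z + px, fy/z + py), that is, the central projection shifted by the principal point offset.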

Camera Location

The camera is assumed to be located at the origin of a Euclidean coordinate system with the principal axis of the camera pointing straight down the z-axis.

Camera Rotation and Translation

In general, points in $\mathbb{R}^3$ will be expressed in terms of a different Euclidean coordinate frame, known as the world coordinate frame. The two coordinate frames are related through a rotation and a translation.


World and Camera Coordinate Frames

The two coordinate frames are related through a rotation and a translation.

If $\tilde{X}$ is an inhomogeneous 3-vector representing the coordinates of a point in the world coordinate frame, and $\tilde{X}_{\mathrm{cam}}$ represents the same point in the camera coordinate frame, then

\[
\tilde{X}_{\mathrm{cam}} = R\,(\tilde{X} - \tilde{C}),
\]

where $\tilde{C}$ represents the coordinates of the camera center in the world coordinate frame, and R is a 3 × 3 rotation matrix representing the orientation of the camera coordinate frame.

The equation $\tilde{X}_{\mathrm{cam}} = R\,(\tilde{X} - \tilde{C})$ can be written in homogeneous coordinates as

\[
X_{\mathrm{cam}}
= \begin{bmatrix} R & -R\tilde{C} \\ 0 & 1 \end{bmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
= \begin{bmatrix} R & -R\tilde{C} \\ 0 & 1 \end{bmatrix} X.
\]

This leads to the following concise formula

\[
x = K R\,[\, I \mid -\tilde{C} \,]\, X
\]

where X is now in a world coordinate frame.

The mapping $X \mapsto x$ defined by the formula

\[
x = K R\,[\, I \mid -\tilde{C} \,]\, X
\]

is the general mapping given by a pinhole camera. A general pinhole camera

\[
P = K R\,[\, I \mid -\tilde{C} \,]
\]

has 9 degrees of freedom (3 DOF for K, 3 DOF for R, and 3 DOF for $\tilde{C}$).

It is often convenient not to make the camera center explicit in the world to image transformation. Instead it is represented as

\[
\tilde{X}_{\mathrm{cam}} = R\,\tilde{X} + t.
\]

Hence the camera matrix becomes

\[
P = K R\,[\, I \mid -\tilde{C} \,] = K\,[\, R \mid t \,]
\]

where $t = -R\tilde{C}$.
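The equivalence of the two forms of the camera matrix can be checked numerically. The plain-Python sketch below is an illustration under assumed values: K, R (a 90 degree rotation about the z-axis, written out exactly) and the camera center C are hypothetical, as are the helper names.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def camera_from_center(K, R, C):
    """P = K R [I | -C], with C the camera center in world coordinates."""
    I_minus_C = [[1, 0, 0, -C[0]],
                 [0, 1, 0, -C[1]],
                 [0, 0, 1, -C[2]]]
    return matmul(matmul(K, R), I_minus_C)


def camera_from_translation(K, R, t):
    """P = K [R | t]."""
    R_t = [row + [ti] for row, ti in zip(R, t)]
    return matmul(K, R_t)


# Hypothetical camera: integer entries keep the arithmetic exact.
K = [[2, 0, 3], [0, 2, 1], [0, 0, 1]]
R = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # 90 degree rotation about the z-axis
C = [1, 2, 3]
t = [-sum(r * c for r, c in zip(row, C)) for row in R]  # t = -R C

P_center = camera_from_center(K, R, C)
P_translation = camera_from_translation(K, R, t)

# The camera center itself has no image: P (C, 1)^T is the zero vector.
center_image = [sum(p * x for p, x in zip(row, C + [1])) for row in P_center]
print(P_center == P_translation, center_image)
```

The last check follows directly from the factor [ I | -C ], which sends the homogeneous vector (C, 1) to zero.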

CCD Cameras

In the pinhole camera model it is assumed that the image coordinates have equal scales in both axial directions. In the case of CCD cameras, the scale factors can differ between the two directions.


Figure: CCD camera. Labels: image plane; 3D point (X, Y, Z); center of projection; charge-coupled device; m = (x, y, 1); p = (u, v, 1). Image coordinates: p = Km. (Source: EURON Summer School on Visual Servoing, Camera Geometry, p. 9/43.)

Suppose the scale factors in the directions x and y are $m_x$ and $m_y$, respectively. Hence the calibration matrix K for the CCD camera is given by

\[
K =
\begin{bmatrix} m_x & 0 & 0 \\ 0 & m_y & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} f & 0 & p_x \\ 0 & f & p_y \\ 0 & 0 & 1 \end{bmatrix}
=
\begin{bmatrix} f m_x & 0 & p_x m_x \\ 0 & f m_y & p_y m_y \\ 0 & 0 & 1 \end{bmatrix}.
\]

Hence a CCD camera

\[
P = K R\,[\, I \mid -\tilde{C} \,]
\]

has 10 degrees of freedom (that is, 4 + 3 + 3 = 10).

Note the calibration matrix for the CCD camera can be written as

\[
K = \begin{bmatrix} \alpha_x & 0 & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix}
\]

where $\alpha_x = f m_x$, $\alpha_y = f m_y$, $x_0 = p_x m_x$, and $y_0 = p_y m_y$.
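A small plain-Python sketch can confirm how the four CCD parameters arise from the product of the scaling matrix and the pinhole calibration matrix. The function name and the sample values (f = 2, mx = 100, my = 120, principal point (0.5, 0.25)) are assumptions made for illustration only.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def ccd_calibration_matrix(f, mx, my, px, py):
    """K = diag(mx, my, 1) times the pinhole calibration matrix."""
    scale = [[mx, 0, 0], [0, my, 0], [0, 0, 1]]
    pinhole = [[f, 0, px], [0, f, py], [0, 0, 1]]
    return matmul(scale, pinhole)


# Hypothetical values for illustration.
K = ccd_calibration_matrix(f=2.0, mx=100, my=120, px=0.5, py=0.25)
alpha_x, x0 = K[0][0], K[0][2]
alpha_y, y0 = K[1][1], K[1][2]
print(alpha_x, alpha_y, x0, y0)
```

Only the four products f·mx, f·my, px·mx and py·my appear in K, which is why the CCD calibration matrix contributes 4 degrees of freedom rather than 5.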

Finite Projective Cameras

A camera P is called a finite projective camera if the calibration matrix K is of the form

\[
K = \begin{bmatrix} \alpha_x & s & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix} \tag{1}
\]

where s is a parameter known as the skew parameter. Hence a finite projective camera P is given by

\[
P = K R\,[\, I \mid -\tilde{C} \,].
\]

• The 3 × 3 submatrix KR of $P = K R\,[\, I \mid -\tilde{C} \,]$ is non-singular.

• If P is any 3 × 4 matrix for which the left-hand 3 × 3 submatrix, say M, is non-singular, then M can be decomposed as M = KR, where K is an upper-triangular matrix of the form (1) and R is a rotation matrix.

Therefore, if the left-hand 3 × 3 submatrix M of P is non-singular, then the 3 × 4 matrix P can be written as

\[
P = M\,[\, I \mid M^{-1} p_4 \,] = K R\,[\, I \mid -\tilde{C} \,]
\]

where $p_4$ is the last column of P. Thus we have:

Result 5.1. The set of camera matrices of finite projective cameras is identical with the set of homogeneous 3 × 4 matrices for which the left-hand 3 × 3 submatrix is non-singular.
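The lecture does not say how the factorization M = KR is computed. One possible approach, sketched here in plain Python as an assumption rather than the author's method, is to orthonormalize the rows of M from the bottom up (Gram-Schmidt), which is an RQ-style decomposition. The function name `decompose_finite_camera` and the test matrix are hypothetical. K is returned with a positive diagonal but is not rescaled so that its last entry is 1.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    """Return (length, v / length); fails if v is the zero vector."""
    n = math.sqrt(dot(v, v))
    return n, [a / n for a in v]


def decompose_finite_camera(P):
    """Split P = [M | p4] into K, R, C with K R [I | -C] equal to P up to sign.

    Assumes M is non-singular; a singular M leads to a division by zero.
    """
    m1, m2, m3 = [row[:3] for row in P]
    p4 = [row[3] for row in P]

    # Gram-Schmidt on the rows of M, starting from the last row, gives
    # M = K R with K upper triangular and positive on the diagonal.
    k33, r3 = unit(m3)
    k23 = dot(m2, r3)
    k22, r2 = unit([a - k23 * b for a, b in zip(m2, r3)])
    k12, k13 = dot(m1, r2), dot(m1, r3)
    k11, r1 = unit([a - k12 * b - k13 * c for a, b, c in zip(m1, r2, r3)])
    K = [[k11, k12, k13], [0.0, k22, k23], [0.0, 0.0, k33]]
    R = [r1, r2, r3]

    # Camera center: M C = -p4, so C = -R^T K^{-1} p4 (back substitution in K).
    y3 = p4[2] / k33
    y2 = (p4[1] - k23 * y3) / k22
    y1 = (p4[0] - k12 * y2 - k13 * y3) / k11
    y = [y1, y2, y3]
    C = [-sum(R[i][j] * y[i] for i in range(3)) for j in range(3)]

    # P is homogeneous, so P and -P describe the same camera. If R came out
    # as a reflection (determinant -1), negate it to obtain a rotation.
    det = (r1[0] * (r2[1] * r3[2] - r2[2] * r3[1])
           - r1[1] * (r2[0] * r3[2] - r2[2] * r3[0])
           + r1[2] * (r2[0] * r3[1] - r2[1] * r3[0]))
    if det < 0:
        R = [[-a for a in row] for row in R]
    return K, R, C


# Hypothetical camera built from K = [[200, 0, 50], [0, 240, 30], [0, 0, 1]],
# a 90 degree rotation about the z-axis, and camera center (1, 2, 3).
P = [[0, -200, 50, 250],
     [240, 0, 30, -330],
     [0, 0, 1, -3]]
K, R, C = decompose_finite_camera(P)
print(K, R, C)
```

In practice a numerical library's RQ or QR routine would usually be preferred; the hand-written version is shown only to make the steps explicit.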

General Projective Cameras

A camera is called a general projective camera if it can be represented by an arbitrary homogeneous 3 × 4 matrix of rank 3. The rank 3 requirement is needed because if the rank is less than 3, then the range of the matrix mapping will be a line or a point but not the whole plane.
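The effect of a rank deficiency can be seen in a small plain-Python sketch. The matrix below is a hypothetical rank-2 example whose third row is the sum of the first two, so every image point x satisfies x1 + x2 - x3 = 0 and therefore lies on the line with homogeneous coordinates (1, 1, -1). The names and sample points are illustrative assumptions.

```python
def apply_camera(P, X):
    """Multiply the 3x4 matrix P by a homogeneous 4-vector X."""
    return [sum(p * x for p, x in zip(row, X)) for row in P]


# Hypothetical rank-2 matrix: row 3 = row 1 + row 2.
P_deficient = [[1, 0, 0, 0],
               [0, 1, 0, 0],
               [1, 1, 0, 0]]

# Homogeneous coordinates of the image line x1 + x2 - x3 = 0.
line = [1, 1, -1]

world_points = [[1, 0, 5, 1], [0, 2, -3, 1], [4, 7, 1, 1], [-2, 3, 9, 1]]
images = [apply_camera(P_deficient, X) for X in world_points]

# A point x lies on the line l exactly when l . x = 0.
incidences = [sum(l * c for l, c in zip(line, x)) for x in images]
print(images, incidences)
```

Every incidence value is zero, so the images of these scene points are collinear and the mapping cannot cover the whole image plane.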