
3D CAMERA USING STEREO MATCHING

CHEN XIAO YANG

UNIVERSITI TEKNOLOGI MALAYSIA


A project report submitted in partial fulfilment of the requirements for the award of the degree of Bachelor of Electrical Engineering (Microelectronic)

Faculty of Electrical Engineering
Universiti Teknologi Malaysia

JUNE 2013


ACKNOWLEDGEMENT

First of all, I would like to express my sincere appreciation to my final year project (FYP) supervisor, Dr. Shaikh Nasir bin Shaikh Husin, for his guidance and support in completing this project. His useful advice and patience in guiding me through the process have led to the successful modelling of the hybrid system and the achievement of all the intended objectives of the project.

I also appreciate the advice and assistance of my beloved father and mother. Whenever I faced difficulties in carrying out my project, I discussed them with my parents. These discussions helped me to calm down and gave me new ideas for my project.

Last but not least, I would like to thank all my family members and friends for their continual support while I was doing my project. I would also like to thank my parents for supporting me throughout my studies at Universiti Teknologi Malaysia. With their encouragement and support, I am motivated and determined to perform better in my academics.


ABSTRACT

3D modelling is a popular topic today. 3D models are used in many areas such as engineering, architecture, town planning and entertainment. Usually a 3D model is created from scratch, at the design stage before a product is manufactured. But what if a digitized 3D model is needed of an object that already exists? It would be useful to have a device that can scan an object and produce a digitized 3D model of it. In this project, I build a camera from two webcams and use it to digitize a real object. The method used is stereo matching. The webcam model used is the DM-W6651. A program is written in the C++ programming language with the help of the OpenCV library. The main purpose of the program is to determine the distance of the object at each pixel. A computer acts as a server that connects the two webcams and obtains the image input from them. The two webcams are placed side by side. The images captured by the webcams are sent to the computer to be analyzed. Using the stereo matching method, the program calculates the difference in position between the coordinates of the same point in the two images. The resulting structure is then recorded in a 3D matrix. The output is displayed on the monitor through a Graphic User Interface (GUI). After development on a computer, the executable is transferred to an Intel Atom development kit for portable deployment.


ABSTRAK

3D models are one of the popular topics in today's society. Knowledge of 3D models has been used in many fields such as engineering, architecture, town planning and entertainment. Usually people produce a 3D model before manufacturing the product. But what happens if people want to obtain a digitized 3D model of an object that already exists? Something that can scan an object and provide a digital 3D model of it would be a convenience. In this project, I combine two webcams into a camera and produce a digital model of a real object. The method used is to compare two points from different images. The webcam model that I use is the DM-W6651. A program in the C++ programming language is written with the help of the OpenCV library. The main purpose of this program is to analyze the distance of a pixel from the object. A computer acts as the hub that connects the two webcams and obtains the images from them. The two webcams are placed side by side. The images captured by the webcams are sent to the computer for analysis. By comparing points from the different images, a program has been written to calculate the difference in distance between 2 coordinates from 2 images. This process is written using the C++ programming language and the OpenCV library. The structure is recorded in a 3D matrix. The result is displayed on the monitor using a Graphic User Interface (GUI). After that, the program is transferred to an Intel Atom development kit for portable deployment.


TABLE OF CONTENTS

CHAPTER  TITLE

THESIS STATUS CONFIRMATION FORM
SUPERVISOR CONFIRMATION
TITLE COVER
DECLARATION
DEDICATION
ACKNOWLEDGEMENT
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS
LIST OF APPENDICES

1  PROJECT REVIEW
   1.1 Introduction
   1.2 Problem Statement
   1.3 Objectives
   1.4 Scope
   1.5 Project Flow

2  LITERATURE REVIEW
   2.1 Introduction
   2.2 Stereo Matching
   2.3 GUI (Graphic User Interface)
   2.4 C++ Programming Language
   2.5 OpenCV

3  METHODOLOGY
   3.1 Introduction
   3.2 Hardware implementation
       3.2.1 The webcams
       3.2.2 Computer
   3.3 Software implementation
       3.3.1 Initialize the library
       3.3.2 Get the image from the webcam
       3.3.3 Image Adjustment
       3.3.4 Matching two points
       3.3.5 Showing the cross section of the model
   3.4 The overall flow of data

4  RESULT AND DISCUSSION
   4.1 Introduction
   4.2 XY-plate
   4.3 YZ-plate
   4.4 XZ-plate

5  CONCLUSION
   5.1 Conclusion
   5.2 Problems
   5.3 Recommendation

REFERENCES
APPENDIX A


LIST OF FIGURES

FIGURE  TITLE

1.1  Project flow
3.1  The original look of the webcam
3.2  The webcams that have been modified and combined
3.3  The Intel Atom Board which the model is Innovation kit 3
3.4  Flow chart of the software implementation
3.5  The overall flow of the project from the hardware until software
4.1  Image capture by the webcams
4.2  Comparison between the XY cross-section and the original image
4.3  YZ cross section
4.4  XZ cross section
5.1  Example of image taken by the camera


LIST OF SYMBOLS

Ax,y   -  Result image pixel at coordinate (x,y)
Bx,y   -  Original image pixel at coordinate (x,y)
Y1x,y  -  Light intensity of 1st image
R1x,y  -  Red intensity of 1st image
G1x,y  -  Green intensity of 1st image
B1x,y  -  Blue intensity of 1st image
Y2x,y  -  Light intensity of 2nd image
R2x,y  -  Red intensity of 2nd image
G2x,y  -  Green intensity of 2nd image
B2x,y  -  Blue intensity of 2nd image


LIST OF APPENDICES

APPENDIX  TITLE

A1  The overall code compiled with Microsoft Visual Studio


Chapter 1

INTRODUCTION

1.1 Background of Study

Nowadays, 3D technology has become part of our lives. It is used in many areas, from work to entertainment.

In civil engineering, the design process of a building requires the civil engineer to design a 3D model before construction begins. Furthermore, the engineer needs to analyze whether the structure is safe to build by applying physical theory to the 3D model.

In mechanical engineering, engineers usually design their projects in 3D software such as AutoCAD and SolidWorks. They can run simulations on their designs to determine whether the structure will face problems when it is put to use.

In the field of architecture, architects need to design their work as 3D models so that they can tell whether their ideas are workable. They use the 3D model to analyze the use of space and to check whether the design meets the standards.

In the field of entertainment, many games use 3D technology, such as Counter-Strike and some RPG games. Furthermore, 3D technology has already been used in movies such as Transformers.

1.2 Problem Statement

Digital 3D models are now used in many fields, such as entertainment and engineering, and are widely used in design. Many fields, such as architecture, engineering and entertainment, need to analyze solid objects as digital 3D models.

We need to analyze the shape of a real object virtually, using software. Industry therefore requires reverse engineering to produce a virtual 3D model from a real object.

1.3 Objective of Project

The main objective of this project is to develop a camera that can provide a 3D model as input for further analysis. In this project, the object needs to be identified, and the images captured by the webcams need to be analyzed by the algorithm. The object analyzed by the computer is then recorded as a 3D model in a 3D matrix.

1.4 Scope of Project

This project is limited to processing webcam images with an algorithm and analyzing the image input from the webcams to produce a 3D model.

1.5 Project Flow

Figure 1.1 shows the project flow involved in constructing the 3D camera. At the beginning, the OpenCV library is studied. After that, a model of the camera is built and data are collected using the camera.


Figure 1.1: Project flow


Chapter 2

LITERATURE REVIEW

2.1 Introduction

This chapter discusses the theory and reviews the literature, based on published journal and conference papers. It explains the 3D camera and its applications.

2.2 Stereo Matching

Stereo matching is used to compute the distance from the camera to the object. In other words, stereo matching is a method of performing depth analysis on the given images.

By finding the same point in two different images and calculating the parallax difference between them, we can deduce how far the object, and that point on it, is from the webcams.

This technique is widely used in many areas, such as unmanned vehicles and 3D movies.

2.3 GUI (Graphic User Interface)

In computing, a GUI is a method that helps the user to easily understand the available choices and to give commands to the computer. A GUI lets humans and computers communicate through images (graphics).

GUIs are used in many areas, such as mobile phones, ATMs and many computer applications.

A GUI is required here because much of the data needs to be shown in graphical form. The GUI is important so that the resulting images can be analyzed further.


2.4 C++ Programming Language

There are many programming languages, such as Visual Basic, Python and Java. C++ is a general-purpose programming language. Like most imperative languages in the ALGOL tradition, C++ has facilities for structured programming and allows lexical variable scope and recursion, while a static type system prevents many unintended operations. Its design provides constructs that map efficiently to typical machine instructions, and therefore it has found lasting use in applications that had formerly been coded in assembly language, most notably system software like the Unix computer operating system.

A C++ program expresses a process through control flow. The language provides statements such as for, if/else, while, switch and do/while for looping and branching. A compiler translates the written code into an executable program.

The C language also exhibits the following characteristics [6]:
• There are a large number of arithmetical and logical operators, such as +, +=, ++, &, ~, etc.
• More than one assignment may be performed in a single statement.
• Function return values can be ignored when not needed.
• Typing is static, but weakly enforced: all data has a type, but implicit conversions can be performed; for instance, characters can be used as integers.
• Declaration syntax mimics usage context. C has no "define" keyword; instead, a statement beginning with the name of a type is taken as a declaration. There is no "function" keyword; instead, a function is indicated by the parentheses of an argument list.
• User-defined (typedef) and compound types are possible.
• Heterogeneous aggregate data types (struct) allow related data elements to be accessed and assigned as a unit.
• Array indexing is a secondary notion, defined in terms of pointer arithmetic. Unlike structs, arrays are not first-class objects; they cannot be assigned or compared using single built-in operators. There is no "array" keyword, in use or definition; instead, square brackets indicate arrays syntactically, e.g. month.
• Enumerated types are possible with the enum keyword. They are not tagged, and are freely interconvertible with integers.
• Strings are not a separate data type, but are conventionally implemented as null-terminated arrays of characters.
• Low-level access to computer memory is possible by converting machine addresses to typed pointers.
• Procedures (subroutines not returning values) are a special case of function, with an untyped return type void.
• Functions may not be defined within the lexical scope of other functions.
• Function and data pointers permit ad hoc run-time polymorphism.
• A preprocessor performs macro definition, source code file inclusion, and conditional compilation.
• There is a basic form of modularity: files can be compiled separately and linked together, with control over which functions and data objects are visible to other files via static and extern attributes.
• Complex functionality such as I/O, string manipulation, and mathematical functions are consistently delegated to library routines.

C is one of the most widely used programming languages of all time, and C compilers are available for the majority of available computer architectures and operating systems.

2.5 OpenCV

OpenCV (Open Source Computer Vision) is a library of programming functions developed by Intel for real-time computer vision. It is free for commercial and research use under a BSD license. The library is cross-platform and runs on Windows, Linux, Mac OS X, Android and iOS. OpenCV is written in C++ and its primary interface is in C++.

OpenCV provides functions and facilities such as [7]:
• 2D and 3D feature toolkits
• Egomotion estimation
• Facial recognition system
• Gesture recognition
• Human–computer interaction (HCI)
• Mobile robotics
• Motion understanding
• Object identification
• Segmentation and recognition
• Stereopsis (stereo vision): depth perception from 2 cameras
• Structure from motion (SFM)
• Motion tracking
• Augmented reality


Chapter 3

METHODOLOGY

3.1 Introduction

This project consists of two parts, the software and the hardware, which work together to make the project function.

The hardware part consists of two webcams, which capture images of the object to be analyzed. The webcam model is the DM-W6651. The software part is a system that saves and analyzes the 3D model, which is a 3D matrix. I implemented this part with the help of the OpenCV library integrated into Visual Studio C++.

The theory behind the project is the use of stereo matching to analyze the images captured by the webcams. From this we obtain the distance of the object and record it in a 3D matrix. The result of this project is a 3D matrix that represents the surface captured in the images.

The overall data flow is as follows. A computer acts as a server that connects the two webcams and obtains the image input from them. The two webcams are placed as a pair, side by side. The images captured by the webcams are sent to the computer to be analyzed. Using the stereo matching method, a program calculates the difference in position between the coordinates of the same point in the 2 images. The program is written in C++ with the OpenCV library. The resulting structure is then recorded in a 3D matrix. The output is displayed on the monitor through a Graphic User Interface (GUI).

3.2 Hardware implementation

This section discusses the hardware used. The hardware I chose is two DM-W6651 webcams and an Intel Atom Board Innovation kit 3. The covers of the webcams were removed and the webcams were placed together with their focus points 2.5 cm apart.
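As a worked note on the 2.5 cm spacing, the standard relation for a rectified stereo pair links the depth $Z$ of a point to its disparity $d$, the focal length $f$ (in pixels) and the baseline $B$. The focal length of the webcams is not given in this report, so $f = 500$ pixels below is purely an assumed value for illustration.

```latex
Z = \frac{f B}{d}, \qquad B = 2.5\ \text{cm} = 0.025\ \text{m}

\left| \frac{\partial Z}{\partial d} \right| = \frac{f B}{d^{2}} = \frac{Z^{2}}{f B}

% Assumed values: f = 500 px, Z = 1 m
\left| \Delta Z \right| \approx \frac{(1\ \text{m})^{2}}{500\ \text{px} \times 0.025\ \text{m}} \, \left| \Delta d \right|
  = 0.08\ \text{m per pixel of disparity}
```

Under these assumptions, a one-pixel matching error at a distance of 1 m corresponds to roughly 8 cm of depth error, which suggests that a short baseline is better suited to objects close to the camera.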


3.2.1 The webcams

The first part of the project is the webcams, model DM-W6651. In order to place the webcams side by side, their covers were removed, as shown in Figure 3.1 and Figure 3.2.

The two webcams are tied together with string and held steady with chopsticks. They are then placed on a box for stability.

Figure 3.1: The original look of the webcam

Figure 3.2: The webcams that have been modified and combined


3.2.2 Computer

The computer used is an Intel Atom Board, model Innovation kit 3. It runs Windows 7 and has Microsoft Visual Studio 2010 installed, with OpenCV 2.4.5 integrated.

Figure 3.3: The Intel Atom Board which the model is Innovation kit 3

3.3 Software implementation

The software implementation is the algorithm that analyzes the images captured by the webcams. It is implemented in the C++ programming language with the OpenCV library. Figure 3.4 shows the flow of the program.


Figure 3.4: Flow chart of the software implementation

3.3.1 Initialize the library

At the beginning of the code, the libraries must first be initialized. Before writing the code, we need to determine which libraries have to be included.

In this project, the following libraries are included.

#include
#include
#include
#include
#include
#include "3D.h"

These headers cover the basic OpenCV library, which enables analysis of the captured images; the window operation functions, which enable handling of input such as a key press on the mouse or keyboard; and the functions used to view the result through a Graphic User Interface (GUI). "3D.h" is the header file for the code written for this project. Its code is as follows.

#include #include #include #include #include

void accu(cv::Mat &image,cv::Mat image1) { int a; for (int i=1; i