Stereoscopic camera system with creator-friendly functions

Shinsuke Kishi (a,*), Nobuaki Abe (a), Takashi Shibata (a), Takashi Kawai (a), Makoto Maeda (b), Kouichi Hoshi (b)

(a) Graduate School of Global Information and Telecommunication Studies, Waseda University, 1011 Okuboyama, Nishi-Tomida, Honjo-shi, Saitama 367-0035, Japan
(b) FLOVEL Co., Ltd, Tokyo 190-0003, Japan

ABSTRACT

Stereoscopic filming is roughly divided into two types: toed-in and parallel camera configurations. Both types have disadvantages: toed-in cameras cause keystone distortions, and parallel cameras cause image loss by shifting. In addition, with both types it is difficult for inexperienced creators to determine the optimal camera settings, such as cross points and inter-camera distance, and the post-processing procedures. These factors hinder the creation of stereoscopic images. Therefore, the authors focused on improving usability in stereoscopic filming, constructed an experimental camera system, and examined a semi-automatic camera configuration function in terms of viewing safety.

Keywords: stereoscopic images, stereoscopic camera, stereoscopic filming, ergonomics, human factors, safety, usability

1. INTRODUCTION

Stereoscopic movies have recently become popular among the general public. The number of theaters presenting stereoscopic content has increased in Japan since December 2007. In addition, broadcasting of stereoscopic TV programs for the commercial market by Nippon BS Broadcasting Corporation (BS11) also began in December 2007 [1]. International attention is thought to have turned to various market possibilities, ranging from the theater to home or ubiquitous use of stereoscopic images.

Stereoscopic images have long been regarded as the next generation of image information media. However, they are not yet widespread. The reasons can be divided roughly into three kinds: those related to stereoscopic displays, visual fatigue in the observer, and lack of suitable content. Problems with stereoscopic displays have largely been resolved by the spread of high-definition TV. As for the equipment for content creation, especially the camera, the burden on creators, described later, is still considerable, and this makes production of stereoscopic content difficult, especially with medium to small budgets.

Further, there is increasing social concern about the safety of image content for human eyes. Work on the safety of stereoscopic image content and on formulating guidelines is advancing rapidly, in parallel with industrial activities aimed at broadening the market. For example, the “3DC Safety Guidelines” [2], developed by the 3D Consortium, and the ISO standard “IWA3: Image Safety” [3] are well known internationally. Therefore, simple methodologies for creating stereoscopic images that can be viewed safely and comfortably can improve usability for creators.

* [email protected]; phone +81-495-24-6076; fax +81-495-24-6645

Stereoscopic Displays and Applications XX, edited by Andrew J. Woods, Nicolas S. Holliman, John O. Merritt, Proceedings of SPIE-IS&T Electronic Imaging, SPIE Vol. 7237, 72371M © 2009 SPIE-IS&T · CCC code: 0277-786X/09/$18 · doi: 10.1117/12.807245
SPIE-IS&T/ Vol. 7237 72371M-1

2. PURPOSE

When viewing stereoscopic images, depth is perceived as a result of the slight differences between the right and left images, and the convergence distance changes along the depth axis. The represented position of the stereoscopic image is mainly determined by this right–left offset. A mismatch within the visual system (between convergence and accommodation) can be identified as the main cause of visual fatigue from viewing stereoscopic images. This mismatch is due to the conflict of depth information between convergence and accommodation, which occurs because the convergence distance depends on the represented position, while accommodation is fixed near the screen. Consequently, various factors, such as excessive or excessively fast changes in the amount of parallax and long presentation times, become critical when creating stereoscopic content [4] [5] [6] [7].

Further, in addition to the right–left offset, the parallax and depth sensation of stereoscopic content vary with viewing distance and screen size. Specifically, the amount of parallax and the visual load can increase with a shorter viewing distance or a larger display size, even if the right–left offset is fixed. Conversely, with a longer viewing distance or a smaller display size, the viewing experience becomes closer to that of 2D images because binocular parallax is reduced. Thus, the advantages of stereoscopic images might be lost.

Therefore, it is necessary to correct stereoscopic content for different viewing environments, taking the human experience into account. Such considerations are particularly important for multipurposing, which helps to address the lack of stereoscopic content. The methods of correction vary depending on the combination of filming and viewing environments, and there is a limit to the range of corrections that can be made to adapt to a specific viewing and filming environment. Further, it is necessary to evaluate the parallax and the represented position of stereoscopic images in terms of viewing safety and comfort. As a result, creators with less experience find it difficult to gauge the optimum conditions. For these reasons, an experienced specialist who can determine the proper values is needed to evaluate the amount of parallax that is actually reproduced and to verify the content. Thus, considerable additional work is required for stereoscopic content creation.

Experimental evaluation of every segment of stereoscopic content during production and presentation would be burdensome and impractical. Thus, to reflect the guidelines on the safety of stereoscopic images, it is necessary to incorporate them into the production flow and presentation system so as to reduce the load on the producer or distributor. For this reason, the authors have been developing a stereoscopic camera system with creator-friendly functions that assists in the production and presentation of safe and comfortable content by evaluating and optimizing it for various viewing environments. This paper outlines the experimental development.
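
The dependence of parallax on the viewing environment described in this section can be illustrated numerically. The following is a minimal sketch and not part of the authors' system: the function name, the pixel-based offset, and the approximation of angular parallax as 2·atan(p / 2V), where p is the on-screen offset and V the viewing distance, are assumptions introduced here for illustration.

```python
import math

def angular_parallax_deg(offset_px, image_width_px, screen_width_m, viewing_distance_m):
    """Approximate angular parallax (degrees) for a right-left offset given in pixels.

    Hypothetical helper: the formula is an illustrative approximation,
    not the one used in the paper's software.
    """
    # The same pixel offset becomes a larger physical offset on a larger screen.
    offset_m = offset_px * screen_width_m / image_width_px
    # Angle subtended by that offset at the viewer's position.
    return math.degrees(2.0 * math.atan(offset_m / (2.0 * viewing_distance_m)))

# Same content (20 px offset in a 1920 px wide image) in three viewing environments.
reference = angular_parallax_deg(20, 1920, screen_width_m=1.0, viewing_distance_m=1.5)
larger_screen = angular_parallax_deg(20, 1920, screen_width_m=3.0, viewing_distance_m=1.5)
farther_viewer = angular_parallax_deg(20, 1920, screen_width_m=1.0, viewing_distance_m=4.0)
print(reference, larger_screen, farther_viewer)
```

With a fixed right–left offset in the image, the larger screen yields a larger angular parallax and the longer viewing distance a smaller one, which is why the same content may need correction for each viewing environment.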

3. DESIGN OF STEREOSCOPIC CAMERA SYSTEM

3.1 Parallax amount correction

Two methods are commonly used for changing the amount of parallax in stereoscopic content. The first is to correct the right–left offset of the content after filming. The authors have developed nonlinear editing software for stereoscopic content [8]. It also provides a function to evaluate the viewing safety and comfort of content using image processing [9]. This function supports creators in correcting content during the post-production process. With this software, an editing process optimized for stereoscopic content can be achieved by simple operations. In addition, StereoPhotoMaker [10], freeware for making stereoscopic content, is equipped with a function to correct undesirable right–left offset (including vertical offset) using an image processing technique, the SIFT keypoint detector algorithm [11]. Furthermore, the authors investigated a scalable 3D conversion algorithm that corrects the right–left offset to reduce the visual load, based on the results of ergonomic evaluations [12].

The second method is to correct the camera parameter settings before filming. Pockett et al. examined camera parameter settings and considered the effect of image and display characteristics on observers [13]. By changing the distance between the right and left cameras, the amount of parallax is optimized using measurement data for the furthest and nearest points.

In this study, the authors designed a trial stereo camera system to consider usability for creators. One of the distinctive features of the system is a function that sets the camera parameters automatically before filming, based on the specific viewing and filming environments, by image processing and mechanical control. The system also has functions to save the camera parameters representing the filming environment as metadata, so that the stereoscopic content can be used for various purposes.

3.2 Optimization of camera configuration

Stereoscopic filming is of two types, toed-in and parallel camera configurations, both of which have disadvantages: the former causes keystone distortions, while the latter causes image loss by shifting. In the toed-in configuration, because the optical axes intersect, excessive right–left offset becomes a problem for objects at considerably long or short distances. In actual filming, the best filming conditions for the filming and viewing environments usually have to be calculated each time. Meanwhile, after filming, keystone distortion must be corrected according to the intersection angle of the optical axes. Therefore, stereoscopic filming involves several additional considerations compared with 2D filming. Consequently, it is difficult for creators who have little experience to obtain the intended stereoscopic images, and it takes significant time even for experienced creators.

On the other hand, the original concern of creators is expression itself, such as the object or composition. If the creator wants to represent stereoscopic images similar to natural vision without added inputs, the filming conditions can be decided semi-automatically. In this study, the authors paid attention to creators’ intentions and developed an experimental system with functions that decide parameters as automatically as possible, without affecting the creative intention.

The system is equipped with stereo cameras (Flovel) mounted on rotating stages (Chuo Precision Industrial) in positions corresponding to the right and left eyes. The interval between the right and left stages can be adjusted by stepping motors, and the vergence angle and camera interval can be set with high accuracy. A large-scale rotating stage is installed at a position corresponding to a human’s neck so that the entire system can turn right and left. The two cameras, the three rotating stages, and the stepping motors that set the interval between the right and left cameras are all controlled by a PC through RS-232C and USB interfaces. In addition, the focus and zoom of the cameras are also controlled, and all of the above-mentioned camera parameters are stored as metadata.

[Diagram; legible labels: HD camera, CCU, HD-SDI, RS-232C, stage driver, USB]

Fig. 1 Experimental stereoscopic camera system

When filmed content is presented at exactly the same size as the real object, the vertical and horizontal size and the depth are similar to those of the real object. Further, when all parameters, such as image size, inter-ocular distance, and viewing distance, are increased or reduced by the same magnification, the vertical and horizontal size and the depth of the represented image change in the same ratio (left of Fig. 2). In practice, however, the ratio of the inter-ocular distance to the other parameters changes, since the human inter-ocular distance does not vary. When all distances other than the inter-ocular distance are halved (center of Fig. 2), the vertical and horizontal size of the represented image is halved, while the depth is reduced by more than half. For instance, a ball-shaped object appears flattened in the depth direction. Consequently, when a stereoscopic image is presented in non-isometric conditions, the right–left offset must be adjusted according to the size (right of Fig. 2). Concretely, it can be adjusted by changing the inter-camera distance and vergence angle.

Fig. 2 Effects of screen size on vertical, horizontal and depth directions
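
The flattening illustrated in Fig. 2 can be reproduced with elementary viewing geometry. The sketch below is illustrative only: the function name, the sign convention (positive offset for the cross direction, negative for the parallel direction), and the similar-triangles model Z = V·e / (e + d) are assumptions made here, with e the inter-ocular distance, V the viewing distance, and d the on-screen right–left offset.

```python
def perceived_distance_m(offset_m, viewing_distance_m, eye_separation_m=0.065):
    """Distance from the viewer to the fused point, by similar triangles.

    offset_m > 0: cross direction (in front of the screen);
    offset_m < 0: parallel direction (behind the screen).
    Hypothetical helper for illustration only.
    """
    if offset_m <= -eye_separation_m:
        raise ValueError("parallel offset must stay below the eye separation")
    return viewing_distance_m * eye_separation_m / (eye_separation_m + offset_m)

# Original environment: 2 m viewing distance, 20 mm offset in the parallel direction.
full_depth = perceived_distance_m(-0.020, 2.0) - 2.0
# Screen size and viewing distance halved, so the on-screen offset is halved too,
# while the inter-ocular distance stays at 65 mm.
half_depth = perceived_distance_m(-0.010, 1.0) - 1.0
print(full_depth, half_depth, half_depth / full_depth)
```

In this example the lateral size is halved, but the depth behind the screen shrinks by clearly more than half, which corresponds to the flattened ball in the center of Fig. 2.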

The camera parameters of the experimental system are decided in the following order. The system is automatically adjusted to reflect the parameters chosen by the control PC.
(1) Manual adjustment of the filming distance and viewing angle by the creator (in this paper, the viewing angle was fixed at 50 mm)
(2) Detection of the represented image size from the viewing conditions (screen size and viewing distance)
(3) Adjustment of the inter-camera distance and vergence angle to retain the ratio of the vertical, horizontal, and depth directions of real objects
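
The three steps above can be sketched as follows. This is not the authors' implementation: the paper does not give its formulas, so the sketch assumes a first-order (small-parallax) model in which the depth-to-width ratio is retained when the inter-camera distance equals e·D / V, with e the inter-ocular distance, D the filming distance, and V the viewing distance, and in which the cameras converge on the subject at the filming distance. The function name and the use of a viewing angle in degrees are likewise assumptions.

```python
import math

EYE_SEPARATION_M = 0.065  # 65 mm inter-ocular distance, the value used in the prototype

def decide_camera_parameters(filming_distance_m, view_angle_deg,
                             screen_width_m, viewing_distance_m):
    """Hypothetical first-order model of steps (1) to (3)."""
    # Step (1): the creator's filming distance and viewing angle fix the
    # width of the real scene that fills the frame.
    scene_width_m = 2.0 * filming_distance_m * math.tan(math.radians(view_angle_deg) / 2.0)
    # Step (2): represented image size relative to the real scene.
    magnification = screen_width_m / scene_width_m
    # Step (3): inter-camera distance that keeps depth in proportion to width
    # (assumed small-parallax approximation), converging at the filming distance.
    inter_camera_m = EYE_SEPARATION_M * filming_distance_m / viewing_distance_m
    vergence_deg = math.degrees(2.0 * math.atan(inter_camera_m / (2.0 * filming_distance_m)))
    return magnification, inter_camera_m, vergence_deg

# Subject at 2 m, 90 degree viewing angle, 2 m wide screen viewed from 1 m.
print(decide_camera_parameters(2.0, 90.0, 2.0, 1.0))
```

Under these assumptions the resulting vergence angle equals the viewer's own convergence angle toward the screen, so a subject at the filming distance is represented on the screen plane.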

4. CAMERA CONTROL METHOD FOR VIEWING SAFETY

4.1 Safety threshold of stereoscopic content

Safety problems might still occur even if the theoretical parameters are set by the above-mentioned method before filming. If the inter-camera distance is large and the objects being filmed are located too near to or too far from the vergence distance, the right–left offset becomes too great, and the amount of parallax becomes excessive. As a result, visual fatigue might increase. Previous studies have shown that visual fatigue increases with increasing parallax in stereoscopic content [14] [15]. From this viewpoint, the authors examined the thresholds at which visual fatigue increases significantly, by referring to the previous studies and performing experiments. The thresholds were used in the system to adjust the inter-camera distance and vergence angle so that the maximum parallax amounts in the cross and parallel directions fall within the thresholds. As a result, the visual load could be controlled and reduced.

In the prototype system, the authors applied two functions for camera control. One evaluates the viewing safety of stereoscopic content, building on the authors’ past work [9]. The other corrects the camera parameters based on the evaluation results. Specifically, the parallax amount of each pixel is calculated for the viewing environment defined by the creator. Then, a warning is displayed according to the ratio of pixels whose parallax exceeds the thresholds. The camera parameters, namely the inter-camera distance and vergence angle, are modified according to the creator’s decision. The process chart of these functions is shown in Fig. 3.

[Flowchart; legible labels: filming and viewing conditions; optimization of camera configuration; left images; right images; stereo matching; safety threshold; evaluation of parallax map in terms of viewing safety; change of camera configuration (if necessary)]

Fig. 3 Sequence of camera control functions for maintaining viewing safety
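
The evaluation step in this sequence can be sketched as a per-pixel threshold check. The code below is a simplified illustration rather than the authors' software [9]: the function name, the sign convention (positive values for the cross direction), the angular approximation, and the warning ratio `max_ratio` are hypothetical. The two thresholds, 1 degree in the cross direction and 65 mm on the screen in the parallel direction, are the values the paper reports for the prototype.

```python
import math

def evaluate_safety(parallax_px, image_width_px, screen_width_m, viewing_distance_m,
                    cross_limit_deg=1.0, parallel_limit_m=0.065, max_ratio=0.05):
    """Return (ratio of pixels over threshold, warning flag).

    parallax_px: rows of signed pixel parallax; None marks a matching error.
    max_ratio is an assumed warning level, not a value from the paper.
    """
    pixel_pitch_m = screen_width_m / image_width_px
    exceeded = 0
    valid = 0
    for row in parallax_px:
        for d in row:
            if d is None:          # skip pixels where stereo matching failed
                continue
            valid += 1
            offset_m = abs(d) * pixel_pitch_m
            if d > 0:              # cross direction: compare as a visual angle
                angle = math.degrees(2.0 * math.atan(offset_m / (2.0 * viewing_distance_m)))
                if angle > cross_limit_deg:
                    exceeded += 1
            elif offset_m > parallel_limit_m:   # parallel direction: on-screen distance
                exceeded += 1
    ratio = exceeded / valid if valid else 0.0
    return ratio, ratio > max_ratio

# 1000 px wide image shown on a 2 m wide screen viewed from 3 m.
sample_map = [[0, 10, 30, -10], [-40, None, 5, -20]]
print(evaluate_safety(sample_map, 1000, 2.0, 3.0))
```

Evaluating against the creator's stated viewing environment, rather than in pixels, means the same parallax map can pass for a small display and trigger a warning for a large one.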

4.2 Evaluation function of maximum parallax amounts

In this function, the correspondence between pixels in the captured left and right images is first calculated with a block matching method using recursive correlation [16], and a map of the amount of parallax between the images is constructed. Then, a prescribed central part of the map in the horizontal direction is extracted, on the assumption that the main object in stereoscopic content is generally located in the central region. This part is used to calculate the maximum parallax amounts in the cross and parallel directions.

Since depth is almost constant within each subject in natural images, the occurrence ratio of values in the parallax map is thought to be biased: the ratio is high for values corresponding to depths where a subject exists and lower elsewhere. Therefore, extreme values with a high occurrence ratio are thought to indicate the objects with the maximum parallax amounts in the cross and parallel directions. If an object is planar and perpendicular to the depth direction, the ratio is high for a single value and falls off in its vicinity. However, because most objects have finite thickness, one subject commonly extends over two or more values. The existence of an object is therefore recognized by comparing the ratio of the values lying on the side away from zero (which corresponds to the screen plane) with a reference percentile value. In this function, the maximum parallax amount in the parallel direction was converted from the lowest prescribed percentile value, while that in the cross direction was converted from the highest prescribed percentile value.

Fig. 4 Evaluation example of maximum parallax amounts (left: original, right: parallax map superimposed on the original)
* Light gray pixels show the maximum parallax areas in the cross and parallel directions, and black pixels show matching errors
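
The percentile-based extraction described in Section 4.2 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the paper does not give the width of the central region or the percentile values, so `center_fraction`, `low_pct`, and `high_pct` are hypothetical defaults, as are the function name, the nearest-rank percentile rule, and the convention that high values are in the cross direction and low values in the parallel direction.

```python
import math

def max_parallax_amounts(parallax_map, center_fraction=0.5, low_pct=5.0, high_pct=95.0):
    """Return (parallel-direction maximum, cross-direction maximum) in map units.

    parallax_map: rows of signed parallax values; None marks a matching error.
    """
    width = len(parallax_map[0])
    # Keep only the central columns, where the main object is assumed to be.
    margin = int(width * (1.0 - center_fraction) / 2.0)
    values = sorted(v for row in parallax_map
                    for v in row[margin:width - margin] if v is not None)
    if not values:
        raise ValueError("no valid parallax values in the central region")

    def percentile(pct):
        # Nearest-rank percentile: isolated outliers do not reach the result
        # unless they make up a sufficient share of the pixels.
        rank = max(1, math.ceil(pct / 100.0 * len(values)))
        return values[rank - 1]

    return percentile(low_pct), percentile(high_pct)

demo_map = [[-50, -3, -2, 0, 1, 2, 4, 90]]
print(max_parallax_amounts(demo_map))
```

Taking percentiles instead of the raw minimum and maximum reflects the reasoning above: a real object occupies many pixels at similar parallax, whereas isolated extreme values are more likely to be matching errors.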

4.3 Correction function of camera parameters

The parallax map was made by the evaluation function, and the closest and farthest objects were identified from the map. Then, the parallax amounts closest to the safety thresholds for the specified combination of filming and viewing conditions were calculated. In the prototype system, an inter-ocular distance of 65 mm was used as the threshold in the parallel direction, while 1 degree of parallax was used in the cross direction. The authors aimed to maintain safety by keeping the parallax amount within these ranges.

With regard to the correction of camera parameters, if the maximum parallax amount in the parallel direction exceeded the range and there was a margin in the cross direction, as shown on the left of Fig. 5, the inter-camera distance was changed, as shown in the center of Fig. 5. When the total parallax amount cannot be kept within the range by changing the inter-camera distance alone, both the inter-camera distance and the vergence angle are corrected to bring the maximum parallax amounts in the cross and parallel directions closer to the thresholds. Fig. 6 shows an example of camera parameter correction.

Fig. 5 Concept of camera parameter correction

Fig. 6 Example of camera parameter correction (left: original parallax map, right: corrected parallax map)
* White and dark gray pixels show where the parallax exceeded the range, and black pixels show matching errors
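
The two-stage correction can be sketched with a deliberately simplified model. It is not the authors' control law: the sketch assumes that on-screen parallax scales linearly with the inter-camera distance, that a vergence change shifts all parallax by a constant amount, and that the system has a mechanical minimum inter-camera distance (`min_baseline_mm`). All names are hypothetical, and both limits are expressed as on-screen offsets in millimetres, so the 1 degree cross-direction threshold would first have to be converted for the viewing distance in use.

```python
def correct_camera_parameters(cross_mm, parallel_mm, cross_limit_mm, parallel_limit_mm,
                              baseline_mm, min_baseline_mm):
    """Return (new baseline, vergence shift, cross, parallel, within_limits).

    cross_mm and parallel_mm are the maximum on-screen parallax magnitudes.
    A negative shift moves the whole scene toward the cross direction.
    """
    # Stage 1: reduce the inter-camera distance just enough for both extremes.
    scale = 1.0
    if cross_mm > cross_limit_mm:
        scale = min(scale, cross_limit_mm / cross_mm)
    if parallel_mm > parallel_limit_mm:
        scale = min(scale, parallel_limit_mm / parallel_mm)
    new_baseline = min(baseline_mm, max(baseline_mm * scale, min_baseline_mm))
    applied = new_baseline / baseline_mm
    cross, parallel = cross_mm * applied, parallel_mm * applied

    # Stage 2: if the baseline alone was not enough, shift with the vergence angle.
    shift = 0.0
    if parallel > parallel_limit_mm:
        shift = parallel_limit_mm - parallel
    elif cross > cross_limit_mm:
        shift = cross - cross_limit_mm
    cross, parallel = cross - shift, parallel + shift

    eps = 1e-9  # tolerance for floating-point rounding
    within = cross <= cross_limit_mm + eps and parallel <= parallel_limit_mm + eps
    return new_baseline, shift, cross, parallel, within

# Parallel direction over the limit, margin left in the cross direction.
print(correct_camera_parameters(40.0, 100.0, 50.0, 65.0, 65.0, 10.0))
# Same situation, but the inter-camera distance cannot go below 50 mm.
print(correct_camera_parameters(10.0, 100.0, 50.0, 65.0, 65.0, 50.0))
```

In the first call a shorter inter-camera distance is sufficient; in the second, the assumed mechanical limit is reached, so the remaining excess in the parallel direction is traded against the margin in the cross direction.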

5. CONCLUSIONS AND FUTURE TASKS

In stereoscopic content creation, the camera parameters must be calculated and set before filming to match the viewing environment so that depth is perceived suitably. Such a complex process can be considered a usability problem for creators. Therefore, in this study, the authors constructed a prototype stereoscopic camera system to examine the flow of stereoscopic filming from ergonomic viewpoints. In this system, the stereo camera configuration parameters, namely the inter-camera distance and vergence angle, were calculated and controlled based on the viewing distance and screen size. In addition, the authors developed two functions of the system: the evaluation of captured images and the correction of camera parameters in terms of viewing safety. The evaluation function extracts the maximum parallax amounts in the cross and parallel directions by image processing and compares them with safety thresholds, and the correction function sets the filming conditions to keep the parallax amounts within the thresholds.

To apply the work reported here to actual stereoscopic content creation, the authors consider the following as future tasks:
(1) Evaluation and correction functions in terms of viewing comfort
(2) Time-series analysis of stereoscopic content and feedback to the creation process
(3) Increase of correctable camera parameters, such as focal length and zoom
(4) Application of camera parameters for multi-purposing of stereoscopic content [12]
(5) Trapezoid correction for keystone distortion caused by the toed-in camera configuration
(6) System integration and usability testing in content creation

REFERENCES

[1] Nippon BS Broadcasting Corporation, http://www.bs11.jp/ (in Japanese).
[2] 3D Consortium, http://www.3dc.gr.jp/.
[3] ISO standards “IWA3: Image Safety”, http://www.iso.org/.
[4] Iwasaki, T., “Eyestrain induced by stereogram on 3-D Display”, The Japanese Journal of Ergonomics, 38 (1), 44-53 (2002) (in Japanese).
[5] Hoffman, D. M., Girshick, A. R., Akeley, K., Banks, M. S., “Vergence-accommodation conflicts hinder visual performance and cause visual fatigue”, Journal of Vision, 8 (3), 1-30 (2008).
[6] Emoto, M., Niida, T., Okano, F., “Repeated vergence adaptation causes the decline of visual functions in watching stereoscopic television”, Journal of Display Technology, 1 (2), 328-340 (2005).
[7] Ukai, K., “Visual fatigue caused by viewing stereoscopic images and mechanism of accommodation”, Proceedings of the First International Symposium on Universal Communication, 176-179 (2007).
[8] Kawai, T., Shibata, T., Inoue, T., Sakaguchi, Y., Okabe, K., Kuno, Y., “Development of software for editing stereoscopic 3-D movies”, SPIE, 4660, 58-65 (2002).
[9] Kawai, T., Kishi, S., Yamazoe, T., Shibata, T., Inoue, T., Sakaguchi, Y., Okabe, K., Kuno, Y., Kawamoto, T., “Ergonomic evaluation system for stereoscopic video production”, SPIE, 6055, 60551B-1-8 (2006).
[10] StereoPhotoMaker, http://stereo.jpn.org/eng/stphmkr/index.html.
[11] Lowe, D. G., “Distinctive image features from scale-invariant keypoints”, International Journal of Computer Vision, 60 (2), 91-110 (2004).
[12] Kishi, S., Kim, S. H., Shibata, T., Kawai, T., Häkkinen, J., Takatalo, J., Nyman, G., “Scalable 3D image conversion and ergonomic evaluation”, Proceedings of the SPIE, 6803, 68030F-68030F-9 (2008).
[13] Pockett, L. D., Salmimaa, M., “Methods for improving the quality of user-created stereoscopic content”, Proceedings of the SPIE, 6803, 680306-680306-11 (2008).
[14] Inoue, T., Noro, K., Iwasaki, T., Ohzu, H., “Evaluation of stereoscopic 3D display in terms of human visual function”, The Journal of the Institute of Television Engineers of Japan, 48 (10), 1301-1305 (1994) (in Japanese).
[15] Iwasaki, T., Tawara, A., “The permissible limit of binocular disparity gazing at 3-D display based on characteristic of accommodative responses”, The Japanese Journal of Ergonomics, 41 (1), 24-29 (2005) (in Japanese).
[16] Okada, K., Kagami, S., Inaba, M., Inoue, H., “Realtime Optical Flow Generation using Two Dimensional Recursive Correlation Technique”, Proc. of the 17th Annual Conference of the Robotics Society of Japan, 27-28 (1999) (in Japanese).
