
Webized 3D content streaming system for autostereoscopic 3D displays *

Daeil Seo, Korea Institute of Science and Technology, [email protected]
Byounghyun Yoo§, Korea Institute of Science and Technology, [email protected]
Heedong Ko, Korea Institute of Science and Technology, [email protected]




ABSTRACT

Various types of video content using 360° panoramic and virtual reality video have become widespread. However, wearing eyewear or a head-mounted display is still a burden when experiencing 3D content. Glasses-free (i.e., autostereoscopic) super multi-view (SMV) 3D displays provide a more comfortable 3D experience but have limited types of content. To provide enriched SMV content, we propose a webizing method for SMV 3D streaming content with time synchronization. The proposed method provides SMV 3D streaming content that contains real-world objects, 3D models, and HTML DOM elements through HTML documents. The method defines custom CSS extensions for transition and animation effects of the SMV 3D streaming content. We present an example of SMV 3D streaming content using a prototype implementation on the web to verify the usefulness of our approach.

CCS CONCEPTS

• Human-centered computing → Interaction paradigms; Web-based interaction; • Computing methodologies → Computer graphics; Rendering

ADDITIONAL KEYWORDS AND PHRASES

Webizing, transition and animation, content streaming, mixed HTML5 and WebGL, super multi-view

ACM Reference format: Daeil Seo, Byounghyun Yoo, and Heedong Ko. 2017. Webized 3D content streaming system for autostereoscopic 3D displays. In Proceedings of the 22nd International Conference on Web3D Technology, Brisbane, QLD, Australia, June 2017 (Web3D'2017), 6 pages. DOI: 10.1145/3055624.3075940

* e-mail: [email protected]
§ Corresponding author. e-mail: [email protected]

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author. Copyright is held by the owner/author(s).
Web3D '17, June 05-07, 2017, Brisbane, QLD, Australia
ACM 978-1-4503-4955-0/17/06.
http://dx.doi.org/10.1145/3055624.3075940

1 INTRODUCTION

The advancement of broadband networks and camera technology has accelerated the development of video content on the web. A large number of users can create content and upload it to the web. In addition, there are various types of video content, such as 360° panoramic and virtual reality (VR) video streaming, that provide a 3D experience. VR head-mounted displays (HMDs) and 3D displays have entered general circulation. Autostereoscopic super multi-view (SMV) displays are a kind of 3D display that improves visual fidelity by providing motion parallax and improves user convenience by eliminating the need to wear glasses for stereopsis. In terms of visual fidelity and immersion through motion parallax, SMV displays are sophisticated, but the types of content available for them are limited. For SMV 3D displays to proliferate, a way is needed to create and render SMV 3D streaming content, considering the market demand for SMV 3D video such as future 3D TV broadcasting. In this paper, we propose a webizing method that provides SMV 3D streaming content with time synchronization on the web for SMV 3D displays. The proposed method captures real-world environments and uses existing resources, such as 3D models and HTML documents on the web, to create SMV 3D streaming content. We use CSS extensions to provide transition and animation effects for the SMV 3D streaming content triggered by time events.




2 RELATED WORK

An autostereoscopic SMV 3D display presents a large number of viewpoints, with intervals between adjacent viewpoints smaller than the pupil of the human eye, to create the illusion of three-dimensional depth and the appearance of continuous motion parallax without any eyewear. An SMV application renders a 3D scene off-screen from the first camera viewpoint and then moves the camera to each subsequent viewpoint. The application generates an image for each of the N viewpoints, rendering N off-screen images by moving the camera, and multiplexes the images into SMV content on the screen.

• SMV 3D scene description: To create a 3D scene for SMV applications, two methods can be used. First, a 3D scene is created by capturing real objects with physical cameras. The cameras acquire the shape, color, arrangement, and depth information of real objects. Müller et al. [Müller et al. 2011] provide an overview of three-dimensional video (3DV) systems using depth data that enables the generation of virtual views through depth-based image rendering (DBIR) techniques. The Point Cloud Library (PCL) [Rusu and Cousins 2011] is a library for n-D point cloud and 3D geometry processing written in C++. Alternatively, virtual objects are generated by declarative or procedural methods. The declarative 3D method describes a 3D scene: X3DOM [Behr et al. 2009] or XML3D [Sons et al. 2010] adds DOM elements to an HTML document to create 3D objects as part of the HTML document. The webized 3D content system [Seo et al. 2015] renders HTML DOM elements and 3D objects on the 3D web using a shared 3D layout. The Web Graphics Library (WebGL) [Jackson and Gilbert 2016] is a procedural JavaScript API to render 3D graphics on the 3D web. To easily declare an SMV 3D scene combining physical and virtual objects, a way is needed to declare scene elements uniformly, independent of whether an element is a physical or a virtual object.

• Rendering SMV 3D scene: To render multi-view display content, Kooima et al. [Kooima et al. 2007] used a graphics processing unit (GPU) for vertex and fragment processing with the OpenGL Shading Language (GLSL) to render autostereoscopic content. The webized SMV rendering method [Seo et al. 2016] supports varying SMV display profiles and renders 3D models with WebGL and HTML DOM elements on the web within any compatible web browser without the use of plug-ins. It uses WebGL to render 3D graphics in JavaScript and CSS 3D Transforms [Fraser et al. 2013] to arrange HTML DOM elements in two- or three-dimensional space. However, these SMV content rendering methods are not suitable for streaming content because they consider only a single scene. Streaming SMV content changes dynamically over time, as is usual in video channels, so its scene structure must describe time and sequence information, timed event triggers, and callback functions.

• Streaming SMV 3D scene: To support SMV 3D streaming content on the web, the SMV content model should support three-dimensional content animation and transition based on a timeline. The Synchronized Multimedia Integration Language (SMIL) [Bulterman et al. 2008] is an XML-based markup language that allows a user to easily define and synchronize interactive multimedia presentations with timing, animations, and transitions. W3C's CSS Transitions [Jackson et al. 2012], CSS Animations [Jackson et al. 2012], and Web Animations [Birtles et al. 2013] are declarative language models to describe the synchronization and timing of changes to the presentation of a web page. However, these methods only deal with 2D content and specific content types. To support time synchronization for SMV 3D streaming content, a way should be provided to handle 3D models and HTML DOM elements in the same manner.

3 WEBIZING SMV 3D STREAMING SYSTEM

Figure 1 depicts an overview of the proposed webizing SMV 3D streaming system.

The proposed method presents webized SMV 3D content using web technologies, as shown in Figure 1(a), transmits the SMV 3D content stream to web browsers, as shown in Figure 1(b), and renders the SMV content on the web browser without additional plugins, as shown in Figure 1(c). In this paper, we use the webized SMV 3D display profile [Seo et al. 2016] to access SMV 3D display characteristics and extend the webized SMV 3D content [Seo et al. 2016] to support dynamic SMV 3D streaming scenes with timeline-based animation and transition.

Autostereoscopic SMV 3D displays are implemented using lenticular screens, so a user can see 3D content without wearing glasses. The webized SMV 3D display profile defines various characteristics of SMV 3D displays, such as the physical display size, screen resolution, number of viewpoints, subpixel index, distance between the user and the display, and offset between the viewpoints of the display. The webized SMV renderer accesses the display specification from JavaScript and performs a spatially multiplexed image-interleaving algorithm based on the display profile to render the final content from a series of N different point-of-view images.

The declarative webized SMV 3D content consists of a combination of three elements: 3D models, HTML5 DOM elements, and point cloud data (PCD) files, as shown in Figure 1(a). The webized SMV 3D content [Seo et al. 2016] extends the declarative webized monoscopic 3D method [Seo et al. 2015], in which 3D models and HTML DOM elements are described in an HTML document. Firstly, the content uses 3D model files that contain 3D geometries and textures. The SMV content uses an anchor tag to link external 3D model files in the HTML document: the href attribute denotes the 3D model's URL, and the type attribute describes the MIME type of the 3D model. The SMV 3D content uses the “--smv-transform” CSS extension, based on CSS 3D Transforms [Fraser et al. 2013], to define the translation, rotation, and scale values of 3D assets in the shared 3D layout. Secondly, the declarative webized SMV 3D content uses HTML5 DOM elements; the method uses the “--smv-transform” CSS extension to extend the scene layout from 2D to 3D. In addition to the webized SMV 3D content, this paper provides a method to capture real-world environments, as does Müller et al.'s approach [Müller et al. 2011]. The proposed method takes more than two images with the camera to obtain RGB colors and depth information, and saves the information as a PCD file. We also use an anchor tag, <a>, to link a PCD file with another MIME type in the same way. These elements of the webized SMV 3D content comprise a single scene in the shared 3D layout.
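As a minimal illustrative sketch (not taken verbatim from the paper; the file names, the “model/obj” MIME type, and the transform values are hypothetical), such a declaration of content elements might look as follows:

    <!-- hypothetical webized SMV 3D content elements inside an HTML document -->
    <a href="models/figure.obj" type="model/obj"
       style="--smv-transform: translate3d(0px, 0px, -200px) scale3d(2, 2, 2);">bride figure</a>
    <a href="streams/stage.pcd" type="model/pcd"
       style="--smv-transform: translate3d(0px, -50px, -400px);">captured stage</a>
    <img src="images/dress-info.png"
         style="--smv-transform: translate3d(150px, 100px, -100px) rotate3d(0, 1, 0, 15deg);">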

Figure 1: Webized rendering outline of SMV 3D streaming content.




3.1 Webized SMV 3D streaming content

To declare SMV 3D streaming content with time information, a way to describe a time event in the SMV 3D content is required. We propose a method to declare a scene sequence and content transitions according to time synchronization, extending the W3C recommendations. In HTML5, the <article> tag represents complete, independent, and self-contained content. The proposed method uses the HTML5 <article> tag to represent each independent scene and renders each scene based on the timeline. In SMV 3D streaming content, the <article> tag is a container for the SMV 3D content elements declared in Figure 1(a). Because W3C's CSS Transitions [Jackson et al. 2012] and CSS Animations [Jackson et al. 2012] only support HTML DOM elements in a 2D layout, the proposed method also defines the “--smv-transition” and “--smv-animation” CSS extensions, as shown in Table 1, to support the “--smv-transform” CSS extension for the three elements of the SMV 3D streaming content in the 3D layout, as mentioned above in Figure 1(a). The proposed method adds a new attribute, “autostart”, to “--smv-transition” for starting a transition without a user event. The “--smv-animation” extension is specified by the name of the keyframes that define the stages and styles of the animation for SMV scenes and elements.

Table 1: CSS extensions for SMV 3D streaming content

  CSS extension      Shorthand property
  --smv-transition   [property] [duration] [timing-function] [delay] [autostart]
  --smv-animation    [name] [duration] [timing-function] [delay] [iteration-count] [direction] [fill-mode] [play-state]

Figure 2 depicts an example of an HTML document of webized SMV 3D streaming content using the declarative method. The example has two scenes, each declared by an <article> tag. The first scene, whose id attribute is “scene1”, has a PCD stream defined by an <a> tag with the “model/pcd” type property, and three images declared by the HTML <img> tag. The transform of each image is declared by the “--smv-transform” CSS extension. To show the scene, the “--smv-animation” CSS extension is declared and the animation event is triggered at a given time; the animation effect is defined by “@keyframes smv-visible”. The second scene, whose id attribute is “scene2”, has another PCD stream, three images, and a 3D model. The transform of each image is defined by “--smv-transform”, and the visibility attribute is changed by “@keyframes smv-visible” and “@keyframes smv-hidden” with different parameters to show and hide the images. The 3D model, whose id attribute is “arrow”, declares the “--smv-animation” and “--smv-transition” CSS extensions; the transition changes the position of the arrow model declared by “--smv-transform”.
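A simplified sketch in the spirit of the Figure 2 example is given below; the scene ids, file names, timings, and keyframe contents are hypothetical and only illustrate how the <article> containers and the “--smv-*” extensions fit together:

    <style>
      @keyframes smv-visible { from { visibility: hidden; } to { visibility: visible; } }
      @keyframes smv-hidden  { from { visibility: visible; } to { visibility: hidden; } }
      /* show the dress description two seconds into the first scene */
      #dress { --smv-animation: smv-visible 1s linear 2s 1 normal forwards running;
               --smv-transform: translate3d(150px, 100px, -100px); }
      /* move the arrow model without a user event once its scene starts */
      #arrow { --smv-transition: --smv-transform 1s ease-in 8s autostart;
               --smv-transform: translate3d(-120px, 0px, -300px); }
    </style>
    <article id="scene1">
      <a href="streams/wedding.pcd" type="model/pcd">wedding scene</a>
      <img id="dress" src="images/dress-info.png">
    </article>
    <article id="scene2">
      <a href="streams/sushi.pcd" type="model/pcd">sushi scene</a>
      <a id="arrow" href="models/arrow.obj" type="model/obj">selection arrow</a>
    </article>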

3.2 Streaming webized SMV 3D content

In the streaming process, as shown in Figure 1(b), the webized SMV 3D content is delivered to the web browser. To transmit the PCD stream on the web, we define a PCD stream over HTTP, similar to MJPEG over HTTP [Wikipedia]. The source of the PCD stream is identified by its URL, and each PCD is encoded and decoded using Base64 encoding. The PCD stream consists of the Base64-encoded PCD data and a boundary string that separates each encoded PCD. Figure 3 shows an example of PCD streaming.

Figure 3: Example of PCD stream over HTTP
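A rough node.js sketch of such a streaming endpoint is given below; the multipart content type, boundary string, frame rate, and the readNextPcdFrame() stub are illustrative assumptions rather than the paper's actual implementation:

    // Illustrative PCD stream over HTTP: Base64-encoded PCD frames separated by
    // a boundary string, in the style of MJPEG over HTTP.
    const http = require('http');

    function readNextPcdFrame() {
      // Stub: in the real system the frame would come from the depth camera via PCL.
      return Buffer.from('# .PCD v0.7 - placeholder frame');
    }

    http.createServer((req, res) => {
      res.writeHead(200, {
        'Content-Type': 'multipart/x-mixed-replace; boundary=pcdframe'
      });
      const timer = setInterval(() => {
        res.write('--pcdframe\r\n\r\n');
        res.write(readNextPcdFrame().toString('base64') + '\r\n');
      }, 100); // roughly 10 frames per second
      req.on('close', () => clearInterval(timer));
    }).listen(8080);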

Figure 2: Example of webized SMV 3D streaming content.

When the web browser receives the PCD stream, it extracts each encoded PCD content using the boundary information. The decoded PCD content is rendered on the web browser together with 3D models and HTML DOM elements, as shown in Figure 1(c). The web browser chooses SMV content according to the timeline and applies the transition and animation effects described by CSS to the content. In the rendering result, the PCD data and 3D models are arranged in a shared 3D layout. To add HTML DOM elements into the shared 3D layout, the proposed method creates a plane surface geometry with an image texture that has the same size, position, and transform values as the source HTML DOM element. When the source HTML DOM element is updated, the image is also updated synchronously.

To blend the PCD data, 3D models, and HTML DOM elements into a single 3D scene with the shared 3D layout, the proposed method adds a dedicated component. The display profile provides the characteristics of the SMV 3D display. The proposed method generates N images, where N is the number of viewpoints provided by the SMV 3D display, by changing the camera position by an offset according to the display profile. Lastly, the method multiplexes the N images into the final SMV 3D content based on the indexmap in the SMV 3D display profile and renders the SMV 3D content on the screen.
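As a minimal sketch of the spatially multiplexed interleaving step (not the paper's implementation; the indexmap layout of one viewpoint index per RGB subpixel is an assumption), the final image could be assembled from the N viewpoint images as follows:

    // views: array of N ImageData objects, one per viewpoint, all of equal size.
    // indexmap: flat array holding one viewpoint index per RGB subpixel of the output.
    function interleaveViews(views, indexmap) {
      const { width, height } = views[0];
      const out = new ImageData(width, height);
      for (let p = 0; p < width * height; p++) {
        for (let c = 0; c < 3; c++) {               // R, G, B subpixels
          const view = views[indexmap[p * 3 + c]];  // viewpoint chosen for this subpixel
          out.data[p * 4 + c] = view.data[p * 4 + c];
        }
        out.data[p * 4 + 3] = 255;                  // opaque alpha
      }
      return out;
    }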

4 PROTOTYPE IMPLEMENTATION



We implemented a prototype streaming system for webized SMV 3D streaming content on autostereoscopic SMV 3D displays. To verify the usefulness of our approach, we applied the prototype system to various uses of SMV 3D streaming content on the 3D web using SMV 3D displays. The experiment was performed using an Intel i7-7700K CPU and an NVIDIA GTX 1080 GPU on the Google Chrome web browser, version 56 (64-bit). Figure 4(a) shows the experiment environment for capturing PCD streams in a real-world environment. To generate the PCD stream, we used an Intel RealSense™ SR300 depth camera and the Point Cloud Library [Rusu and Cousins 2011]. The user can move side to side across arbitrary viewpoints and see seamless SMV 3D streaming content with pseudo-continuous motion parallax, as shown in Figure 4(b).

Figure 5 shows an overview of the prototype system architecture, which extends the webized SMV rendering system [Seo et al. 2016]. The prototype consists of five parts: the display profile, for accessing a display profile using JavaScript; the content store provider, for managing HTML documents and 3D models; the PCD streaming server, which is a new component that transmits the PCD stream over HTTP; the 3D content renderer, for generating a 3D integrated scene graph with timeline-based animation and transition through the webized content synchronizer; and the SMV content renderer, for rendering the final SMV 3D content on the screen. The PCD streaming server is implemented with node.js, and the web browser requests the PCD stream over HTTP using its URL. The SMV content renderer on the web browser arranges the three types of SMV 3D content, PCD data, 3D models, and HTML DOM elements, on the shared 3D layout and generates the final content according to the process given in Figure 1(c).

Figure 5: Overview of prototype system architecture for streaming and rendering webized SMV 3D content.

Figure 6 shows rendering results of SMV 3D streaming content at three arbitrary viewpoints among the N viewpoints of the SMV 3D display. The SMV 3D streaming content in this experiment has two scenes, and each scene has three cuts. The rendering result changes with time events from Figure 6(a) to (f). As the user moves side to side, as shown in Figure 4(b), the user perceives motion parallax and sees a different rendering result because of the lenticular lens on the display. The first scene contains a Japanese wedding example, as shown in Figure 6(a) to (c), and the scene has three images, one for each cut. The example has a real object in the center of the scene. Figure 6(a) is the first cut of the first scene. The image, an HTML DOM element, in the top right corner shows information about the wedding dress that the bride, which is an action figure in this scene, is wearing.

Figure 4: Experiment environment of the second scene for real-time PCD streaming: (a) arrangement of physical objects and RGBD camera, and (b) user’s viewpoints for motion parallax using SMV 3D display.

After two seconds, the dress image disappears and a description image of the folding fan that the bride holds in her hand appears, as shown in Figure 6(b). At four seconds, the scene changes from the second cut to the third cut, as shown in Figure 6(c). The webized content synchronizer in the 3D content renderer changes from the first scene to the second scene at five seconds, as shown in Figure 6(d). The second scene (from five seconds to twenty seconds), which shows sushi with ingredient descriptions, is rendered as shown in Figure 6(d) to (f).

As time goes by, the selected sushi and its ingredient description change. In Figure 6(d), the leftmost sushi is selected by the arrow 3D model and its description is shown. After three seconds, the arrow moves to the next sushi, as shown in Figure 6(e). The description of the first sushi disappears and a new description, of the second sushi, appears. In the same manner, the position of the arrow and the description of the selected sushi change again when the streaming time reaches eleven seconds, as shown in Figure 6(f).

Figure 6: Webized SMV content streaming of different viewpoints using a super multi-view display: (a) the first scene is shown when the content is loaded, (b) and (c) the scene description is changed, (d) the scene is changed from the first scene to the second scene, (e) and (f) the arrow moves to the next object and a new description is shown based on the timeline.


5 CONCLUSIONS


In this paper, we propose a webizing method for SMV 3D streaming content on the 3D web. The proposed method presents SMV 3D streaming content, such as real-world objects, 3D models, and HTML DOM elements, through HTML documents. In addition, the method uses CSS extensions for transition and animation effects based on the timeline. We present typical cases of SMV 3D streaming content on the 3D web using a prototype implementation to verify the usefulness of our approach. We plan future work as follows. To improve real-world 3D image and depth information quality, we will apply depth-based image rendering (DBIR) techniques. We will also add a camera control trajectory to provide views at different viewpoints, as well as bi-directional SMV 3D streaming content for communication between peers, considering future 3D TV content.

ACKNOWLEDGMENTS

This work was supported by the National Research Council of Science & Technology (NST) grant funded by the Korea government (MSIP) (No. CMP-16-01-KIST).

REFERENCES

Behr, J., Eschler, P., Jung, Y. and Zöllner, M. 2009. X3DOM: a DOM-based HTML5/X3D integration model. In Proceedings of the International Conference on 3D Web Technology, Darmstadt, Germany, Jun 16-17, 2009. ACM, 127-135.
Birtles, B., Stephens, S., Danilo, A. and Atkins, T. 2013. Web Animations. World Wide Web Consortium.
Bulterman, D., Jansen, J., Cesar, P., Mullender, S., Hyche, E., DeMeglio, M., Quint, J., Kawamura, H., Weck, D., Pañeda, X.G., Melendi, D., Cruz-Lara, S., Hanclik, M., Zucker, D.F. and Michel, T. 2008. Synchronized Multimedia Integration Language (SMIL 3.0). World Wide Web Consortium.
Fraser, S., Jackson, D., O'Connor, E. and Schulze, D. 2013. CSS Transforms Module Level 1. World Wide Web Consortium.
Jackson, D. and Gilbert, J. 2016. WebGL 2 Specification. Khronos Group.
Jackson, D., Hyatt, D., Marrin, C. and Baron, L.D. 2012. CSS Transitions. World Wide Web Consortium.
Jackson, D., Hyatt, D., Marrin, C., Galineau, S. and Baron, L.D. 2012. CSS Animations. World Wide Web Consortium.
Kooima, R.L., Peterka, T., Girado, J.I., Ge, J., Sandin, D.J. and DeFanti, T.A. 2007. A GPU Sub-pixel Algorithm for Autostereoscopic Virtual Reality. In Proceedings of the IEEE Virtual Reality Conference, Charlotte, NC, USA, Mar 10-14, 2007. IEEE, 131-137.
Müller, K., Merkle, P. and Wiegand, T. 2011. 3-D video representation using depth maps. Proceedings of the IEEE 99, 643-656.
Rusu, R.B. and Cousins, S. 2011. 3D is here: Point Cloud Library (PCL). In Proceedings of the International Conference on Robotics and Automation, Shanghai, China, May 9-13, 2011. IEEE, 1-4.
Seo, D., Yoo, B., Choi, J. and Ko, H. 2016. Webizing 3D contents for super multiview autostereoscopic displays with varying display profiles. In Proceedings of the International Conference on Web3D Technology, Anaheim, CA, USA, Jul 22-24, 2016. ACM, 155-163.
Seo, D., Yoo, B. and Ko, H. 2015. Webized 3D experience by HTML5 annotation in 3D web. In Proceedings of the International Conference on 3D Web Technology, Heraklion, Crete, Greece, Jun 18-21, 2015. ACM, 73-80.
Sons, K., Klein, F., Rubinstein, D., Byelozyorov, S. and Slusallek, P. 2010. XML3D: interactive 3D graphics for the web. In Proceedings of the International Conference on Web 3D Technology, Los Angeles, CA, USA, Jul 24-25, 2010. ACM, 175-184.
Wikipedia. Motion JPEG [online]. https://en.wikipedia.org/wiki/Motion_JPEG [Accessed Feb 11, 2017].

