Immersive Authoring of Tangible Augmented Reality Content: A User Study

Gun A. Lee Dept. of Computer Science and Engineering Pohang University of Science and Technology (POSTECH) Pohang, Korea [email protected]

Gerard J. Kim (Corresponding Author) Dept. of Computer Science and Engineering College of Information and Communication Korea University Seoul, Korea Tel: 82 2 3290 3196 / Fax: 82 2 3290 4295 [email protected]

Abstract

Immersive authoring refers to the style of programming or developing content from within the target executable environment. Immersive authoring is important for fields such as augmented reality (AR), in which interaction usability and user perception of the target content must be checked first hand, in situ. In addition, the interaction efficiency and usability of the authoring tool itself is equally important for ease of authoring. In this paper, we propose design principles and describe an implementation of an immersive authoring system for AR. More importantly, we present a formal user study demonstrating its benefits and weaknesses. In particular, our results demonstrate that, compared to the traditional 2D desktop development method, immersive authoring gained significant efficiency in specifying spatial arrangements and behaviors, a major component of AR content authoring. However, it was not as successful for abstract tasks such as logical programming. Based on this result, we suggest that a comprehensive AR authoring tool should include such immersive authoring functionality to help users, particularly non-technical media artists, create effective content based on the characteristics of the underlying media and interaction style.

Keywords: Immersive Authoring, Augmented Reality, Tangible Interface, User Study, Interaction Design.

1. Introduction

Augmented reality (AR) is a newly emerging type of digital content that combines real imagery (usually captured by video cameras) with virtual 3D graphic objects. Thus, its content is 3D by nature. Compared to 2D oriented content or applications for which a stable interaction platform exists, developing 3D content (such as AR content) requires a careful consideration of interaction usability and user perception, in addition to the basic functionality.

Immersive authoring has been proposed in the virtual reality community as one way to achieve this objective [35]. Immersive authoring refers to the style of programming or developing content from within the target executable environment. By working directly within the target executable environment, the developer gains a better “sense” (since a full blown formal usability test is not always feasible) for the content in development as seen, used, or felt by the user. Note that the executable environment of AR is quite different from that of the desktop, often requiring the user to wear a head mounted display (HMD), camera and sensors, and to use non-traditional interfaces, a time consuming process in itself. Thus, immersive authoring has the additional benefit of reducing the period between content development and testing/deployment.

Immersive authoring is similar to the concept of “What You See Is What You Get (WYSIWYG),” the most prevalent form of visual authoring tool today [7]. While the concept of WYSIWYG is quite intuitive and its benefits have been attested to in theory and practice for some time, immersive authoring is still just an interesting proposal, without formally demonstrated or verified benefits. This is partly because the efficiency of immersive authoring depends on its own interaction usability and ease of use. Despite their potential benefits, authoring tools, immersive or not, will neither be effective nor gain popularity if they are difficult to use. In addition, one must also consider that some aspects of the immersive content may not be achieved in the most efficient way through immersive interfaces (e.g. specifying logical behavior). Such issues need to be examined in conjunction with each other.

In this paper, we first propose requirements, particularly in terms of interaction design, for immersive authoring of AR content. Then, we briefly describe our implementation. Our central concept of immersive authoring for AR is an extension of “WYSIWYG” into “WYXIWYG (What You eXperience Is What You Get)” [22]. We demonstrate the projected benefits of immersive authoring by conducting an interaction usability test with AR authoring tasks, as compared to using the traditional desktop development method.

In the following, we first review some previous research related to our study. We then discuss the design principles and requirements for immersive authoring in terms of interaction usability. We also briefly describe our implementation of an immersive authoring system called iaTAR. Section 4 describes the formal usability test we performed to evaluate our immersive authoring system. Finally, we conclude the paper with an executive summary of our contribution and future research directions.

2. Related Work

The basic idea of immersive authoring has been known for some time, although mainly in the context of virtual reality (VR). Stiles et al. proposed a conceptual VR authoring system called the “Lingua Graphica,” in which various elements of the programming language were represented in a concrete manner using 3D objects [35]. Several researchers applied immersive VR (rather than conventional desktop CAD systems) to create 3D virtual worlds [5][23][25][37]. Similar attempts have also been made in the AR area, for example, to construct virtual scenes from within AR environments. Poupyrev et al. [29] suggested a rapid prototyping tool for modeling virtual aircraft cockpits. The system provided a set of virtual gauges and instruments that could be copied over physical tiles. The users were able to test various layouts using an AR interface. Kato et al. [15] suggested a generic interaction method for manipulating virtual objects within AR environments, and applied it to the task of arranging furniture in a virtual room. Piekarski and Thomas [28] suggested 3D geometry modeling techniques for outdoor AR systems. Their modeling system was for constructing virtual representations of physical landmarks while roaming and examining the outdoor scene. All of these works, however, fell short of being true authoring tools, as they did not consider object behaviors.

On the other hand, “immersive” behavioral modeling has not attracted the same degree of attention, perhaps due to the seemingly logical nature of the task; thus, it has not been considered fit for 3D immersive VR/AR platforms. However, many object behaviors can be both logical and spatial at the same time. Although geometric modeling is an important part of an authoring process, in this paper we concentrate on the tasks of scene configuration, object behavior modeling, and other types of functionality for authoring support.

A few others have considered immersive authoring of object behaviors, in the manner of Steed et al. [34] and Lee et al. [22]. All these systems explored defining behaviors of the scene and objects within the virtual environment using VR interfaces. For example, in the system by Steed et al., the users were able to view and manipulate the links (i.e. data flow) between virtual objects. This was one of the first systems implemented for immersive behavior modeling. However, the data flow representation was not general enough to accommodate the various types of behaviors possible in a typical VR system, and there was arguably no compelling reason or advantage (other than merging the executable and development platforms) to employ 3D interaction or an immersive environment to view and interact with the data flow representation. Most importantly, it was difficult to judge the overall comparative benefits of immersive authoring from these works without any formal user studies.

Augmented reality (AR) content has been developed mostly by programming with specialized toolkits (APIs) [1][26][31]. As a possible means of relieving the burden of low-level programming, a few researchers have proposed the use of abstract mark-up languages and visual tools for specifying AR content [18][38]. With the recent popularity of and interest in AR, more comprehensive AR authoring tools extending this approach have been developed [2][8][9][10][24]. These tools typically offer a desktop based GUI with various representation constructs (e.g. data flow, state diagrams, geometry), and an executable window showing the evolving AR content (see Figure 1). Note that in this situation, a camera, usually fixed, monitors the target interaction area. For example, CATOMIR is a desktop AR authoring tool developed under the AMIRE project [9]. Its graphical user interface enables users to create and specify the properties of the required components and link them to create behavior chains. Users can immediately switch to an executable mode (simply by pressing the compile button) for running and testing the result (by simply looking at the AR content window; see Figure 1).

DART (The Designers Augmented Reality Toolkit) is also a 2D desktop tool for rapid development and prototyping of AR applications [24]. DART is implemented as an upper layer of Macromedia Director, leveraging its familiar behavior modeling method using scores, sprites and scripts. While development is still based on a 2D desktop and an indirect (not immersive) method, the use of markup languages or GUI based tools does significantly reduce the development time. That is, the AR content being designed can be immediately compiled, executed and displayed in the desktop window. While the view of the content is neither first person nor immersive, the development time is still significantly reduced compared to traditional programming.

We posited (and found) that experts (who are used to AR programming) both prefer and are more productive using mark-up languages, while 2D GUI tools are more popular among novices (e.g. non-programmers, those not familiar with AR programming or with the syntax of the mark-up language) [33]. However, in both cases, the problem remains the excessive trial-and-error process resulting from the indirect specification of the spatial aspects of the content. Moreover, for mobile and location-dependent AR content (i.e. AR content tied to a wide physical area and displayed using a hand-held or wearable device), desktop authoring becomes even more indirect and difficult [10].

In fact, authoring AR content can be classified into three main types of tasks: (1) spatially oriented tasks (a 3D interface is needed; e.g. 3D placement/association of objects), (2) logically oriented tasks / discrete commands (a 2D interface is sufficient; e.g. logical behaviors), and (3) tasks coupling both (1) and (2) (e.g. spatial behaviors). Thus, the effectiveness and design of an authoring tool, 2D oriented or 3D immersive, must be based on the nature of the particular target content (e.g. what proportion of the target content constitutes spatial behavior?). Our study focuses on investigating the benefits and weaknesses of 3D and immersive authoring for AR through a formal experiment using representative authoring tasks.

3. Immersive Authoring System for Tangible Augmented Reality (iaTAR)

Immersive authoring allows a content developer to experience and verify the content first hand, while creating it through natural and direct interaction within the same environment as the one where the final result is used, in this case the AR environment. By definition, in immersive authoring, the development process occurs within the same environment as the one where the content will be experienced by the end users; therefore developers need not switch between the authoring and test environments. Likewise, by the term “AR authoring,” we mean a process of creating and organizing AR content using interactive software (primarily designed for non-programmers), in contrast to programming libraries or application programming interfaces (APIs). More specifically, the authoring system described in this paper is targeted at creating Tangible Augmented Reality (TAR) content. TAR interfaces [11] are those in which (1) each virtual object is registered with a physical object and (2) the user interacts with virtual objects by manipulating the corresponding physical object. TAR applications/content have recently become quite popular. MagicBook is one such example [3]: an AR based pop-up book with which users can watch 3D animations popping out from a physical story book (see Figures 2 and 3).

3.1 Requirements for Immersive TAR Authoring

Based on the previous definition and our specific target application/content, we established four general requirements for immersive TAR authoring to help guide our implementation. We believe that our requirements will apply to immersive authoring systems in general. To describe the first and most fundamental design principle of immersive authoring systems, we have coined a new term, “WYXIWYG,” which stands for “What You eXperience Is What You Get” [22]. Like the term “WYSIWYG (What You See Is What You Get)” [7] in modern graphical user interfaces, this design principle implies that an immersive authoring system must support fast and intuitive evaluation of the virtual world being built. In other words, developers must be able to experience the same feeling or aura (not only visual and aural, but even tactile or haptic) that the end-users might feel with the content under development.

This provides an instant (or even concurrent) evaluation of the content under construction, helping the developers to accurately understand the current status of the content. Ideally, any interactive and immersive content should undergo formal testing of its usability, level of presence, and degree of information transfer. In practice, this is difficult, for reasons of cost, time and limited resources. The next best alternative would be enabling the developer to get the feel of the content as quickly and easily as possible. This would enable the developer to experience the user’s perspective, and ensure that the developer’s intentions are truly reflected.

The next design principle is to employ direct manipulation techniques in the authoring process as much as possible. Direct manipulation is another important concept borrowed from the 2D user interface development field [33]. This refers to manipulating graphical objects directly. Since its introduction, along with the mouse, it has revolutionized and transformed the way we interact with computers, particularly for creating graphically oriented content [33]. Similarly, since the immersive authoring environment uses three-dimensional interfaces by its nature, direct 3D manipulation should provide an efficient and intuitive way of manipulating virtual objects (one of the main subtasks in authoring 3D TAR content). Providing directness and tactility will increase the intimacy between the developer and the content [33].

While direct manipulation is undoubtedly intuitive, it lacks sufficient spatial accuracy to support fine and precise modeling tasks. This is because the exact 3D positioning/orienting task (with 6 degrees of freedom) is difficult by itself, and also because current movement tracking sensors lack the required accuracy. A separate provision (such as constrained positioning and alphanumeric input) must be made to ensure that sufficiently detailed control is possible and to support a reasonable range of modeling functionality.

3.2 Task Analysis and Interaction Design for iaTAR

Suppose we want to construct interactive TAR based content for the following simple story (“The Hare and Tortoise”) from Aesop’s fables.

A Hare one day ridiculed the short feet and slow pace of the Tortoise, who replied, laughing, "Though you be swift as the wind, I will beat you in a race." The Hare, believing her assertion to be simply impossible, assented to the proposal; and they agreed that the Fox should choose the course and fix the goal. On the day appointed for the race the two started together. The Tortoise never for a moment stopped, but went on with a slow but steady pace straight to the end of the course. The Hare, lying down by the wayside, fell fast asleep. At last waking up, and moving as fast as he could, he saw the Tortoise had reached the goal, and was comfortably dozing after her fatigue. Slow but steady wins the race.

To realize this story as TAR content, several types of functionality will be required (and these can be implemented in many ways). The objects must be modeled (i.e. geometric shape and configuration) according to the details required by their functions, which in turn must also be modeled. For instance, the hare’s running requires the modeling of its legs and a periodic animation/sound associated with this. Then, various scenes must be put in place (e.g. the start of the race, the Hare’s sleeping scene, etc.). Specific details need to be specified for each object’s behavior, such as its timing, motion profiles, conditions, triggering events, etc. Note that the manifestation of the behavior may require a careful selection of multimodal output for the best effect. To make the story interactive, 2D or 3D interaction may be designed and inserted (see Figure 2). The content developer can insert special effects, sound tracks and changing lighting conditions into the behavioral time line. All of these important modeling and specification tasks may be repeated and rehearsed as the content develops and matures, during which the developer will take notes, adjust parameters, try different versions, replay and review, check for usability and immersion effects, and even act out an object’s role oneself. The developer will also constantly require various types of information to make decisions and perform these tasks.

Table 1 summarizes some of the important subtasks required for immersive authoring. The four major tasks are broadly identified as individual object specification, overall scene specification, content review (or execution) and miscellaneous operations, which are further decomposed into various subtasks. Table 2 matches each subtask to one of four forms of authoring interface, i.e. specialized tools, programming or text based input, 2D GUI, and 3D immersive authoring. A more detailed explanation follows.

3.2.1 Object Specification

One of the major tasks in object specification is geometric modeling, as part of the specification of its form. Although there are benefits to modeling the appearance of the virtual objects within the scene using 3D/AR interfaces, this is outside the scope of the paper, as there has been substantial previous work in this area [5][4][23][37]. Instead, we assume that there already exist geometric models of the virtual objects for the developer to import, use, modify, manipulate and specify gross behavior for. Detailed geometric/graphic modeling and animation of objects are often accomplished separately using specialized tools. Likewise, we can also assume that the virtual objects come with a few “basic” types of functionality already modeled, so that the developer does not have to bother specifying trivial details, although these can be specified using a separate function specification interface, if needed. Otherwise, “immersive” specification of form mainly involves small geometric adjustments (e.g. slight reconfiguration of the object organization, size adjustment, etc.) and assigning values to the object’s relevant attributes. Whether the attribute values are discrete (e.g. color), continuous (e.g. dimension), or spatial (e.g. 2D or 3D) may dictate the required style of the interface.

Particular types of virtual objects (e.g. articulated characters, moving objects, buildings, etc.) commonly used by a given application or content type can be identified through a domain analysis. Such a basic object model can free the user from needing to define frequently used attributes or functions during the authoring process. When the definition of a new attribute or function is required, the content developer can resort to an alphanumeric input interface for typing identifiers or script code.

The application of high level authoring tools is deemed more appropriate for specifying complex object behaviors in the context of the overall story and scene. Current methods rely mostly on programming. While there may be cases where programming will be necessary, much of the behavioral authoring can be made easy by the use of structured behavior “models.” For example, the Virtual Reality Markup Language [36] uses the notion of “routes” to define behaviors: object attributes can be connected by routes to define dynamic behaviors, forming a data flow graph of virtual objects. Other methods, such as state based methods and “drag and drop” scripts, are also possible [16][27]. Thus, describing a behavior can be considered to be the specification of a chosen behavior model. Based on the chosen behavior model, the content developer might use different interaction methods, 2D or 3D/immersive. For instance, behavior specification using “routes” (making logical connections) or importing reusable behaviors can be accomplished using 2D interfaces. On the other hand, when using event-driven imperative models, the events and actions (e.g. sequences of functions) can be represented with metaphorical virtual objects, because virtual object behaviors frequently exhibit spatial characteristics. In this case, three-dimensional direct manipulation and demonstration can be more useful. A behavior specification can even involve “spatially acting out” specific situations to encapsulate an event or behavior, e.g. a colliding motion or a coordination of timing, by directly manipulating the involved virtual objects. User interaction can be considered as the means by which individual objects “behave” based on user input. Again, this user interaction behavior may be 2D or 3D in nature. Finally, an authoring subtask unique to TAR applications is the “binding” task for associating a virtual object with a physical prop or real object in the AR environment.

3.2.2 Scene Specification

Virtual objects collectively constitute a scene, and the scene changes according to user input and object behavior. Scenes are constructed by importing the needed objects/props into the (real) scene, and positioning and orienting them. In a conventional development platform, this is accomplished through trial and error (recompile, display and review), by guessing the translation and rotation offsets of the objects with respect to the parent coordinate system. Immersive 3D direct manipulation of objects can aid in making this process more intuitive and efficient. Studies have revealed human difficulty in specifying 3D spatial information without a strongly established reference, when having to resort to mental simulation, or with 2D interfaces (i.e. keyboard and mouse) [30].

An AR scene can also be considered as an individual object with its own attributes (e.g. shading properties, background sounds) and behavior (e.g. scene switches, changing weather effects). Also, note that there may be computational and “formless” objects, such as the camera and collision detection modules, which have features or behavior that can affect the whole scene. They can also be specified through interfaces similar to those used for object attribute specification, and their attribute types will dictate the styles of the required interfaces.

3.2.3 Content Review / Miscellaneous

As has already been pointed out, an authoring task is basically an exploration task with much trial and error. Thus, aside from the above two major authoring tasks, authors need to execute and review the authored content. Note that the author may wish to deploy and review only parts of the content rather than the content in its entirety. During this process, the author will save different versions of the virtual world/objects for later reuse/replay. AR worlds or objects saved at different times may have to be merged to define a new version. Managing these operations can be accomplished easily through 2D interfaces.

During the review, the author should be able to navigate throughout the scene and view the content first hand from various view points, to assess the level of immersion and interaction usability. Thus, such a review has to be performed in the immersive executable mode.

3.2.4 Examples: Hare and Tortoise, ARVolcano [14], Circulation of Water [32], Human Pacman [6]

As summarized previously, AR authoring involves various types of subtasks. The composition of the different subtasks will vary according to the particular content. We use four different examples to further illustrate the need for immersive authoring, and its unique role. The first is the interactive story of the Hare and Tortoise (see Figure 2), in which the dominant tasks are the scene configuration and the creation of the coordinated behaviors among the main characters, both of which are best accomplished using 3D immersive interfaces. The second is the AR Volcano (see the bottom left part of Figure 3), a typical MagicBook application [14]. In this type of AR content, logical/2D authoring tasks consist of binding a particular marker (e.g. a page in a book) with previously modeled virtual objects/behaviors (e.g. an exploding volcano) and specifying simple 2D interaction (e.g. controlling the magnitude of the explosion and molten lava using the slide bar on the side). The third is AR based educational content for teaching the concept of the circulation of water [32]. This content involves many metaphorical 3D interactions, e.g. circulating the marker (e.g. river – mid air – cloud – mid air – underground), along which particular events (illustrations of vaporization, condensation, the formation of clouds, etc.) occur, and manipulating water molecules to make rain drops (see the top row of Figure 3). This requires exploring, in situ, the best possible way to interact in order to communicate the scientific concept to students. Immersive authoring seems most appropriate for this case. The final example is a case of mobile AR content, the Mixed Reality Pacman [6]. Placing virtual balls (for the human Pacman and monsters to eat) at particular locations in a wide area undoubtedly requires immersive authoring capabilities. The four examples demonstrate the practical needs and advantages of immersive authoring (derived from the particular needs of AR content) in addition to the standard 2D GUI.

3.3 Interface and Implementation

Our prototype implementation of the immersive authoring tool for AR content is named iaTAR (immersive authoring for Tangible Augmented Reality). The tool especially focuses on creating TAR content, where physical props work as a medium for controlling virtual objects. Specific interfaces were chosen for the variety of subtasks described in Section 3.2 (Table 3), based on our analysis (Table 2). Among the various subtasks, 3D spatial operations are performed by directly manipulating the props. For instance, the subtask “Object Placement/Rotation” is realized in such a way. Logical tasks / discrete commands (e.g. changing the color of an object, opting to run the content) are realized through a 2D GUI associated with various props. For this purpose, virtual button functionality was partially implemented using an approach called occlusion based interaction (OBI) [20]. In OBI, the visual occlusion of markers (by the interacting fingers) is used to detect and signal a virtual button press. The basic principle can be applied to design a variety of 2D tangible interfaces, such as menus, buttons, slider bars, and keypads.
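The press detection logic behind OBI can be sketched as follows. This is our illustration rather than the actual iaTAR code; it assumes the tracker reports per-frame visibility for each marker, and the identifiers and debounce threshold are hypothetical:

    #include <string>
    #include <unordered_map>

    // Sketch of OBI press detection: a button marker that disappears while
    // a reference marker on the same prop stays visible is interpreted as
    // being occluded by the user's finger, i.e. "pressed".
    struct ObiButton {
        std::string buttonMarker;  // small marker acting as the virtual button
        std::string anchorMarker;  // reference marker on the same prop
        int hiddenFrames = 0;      // consecutive frames the button was occluded
    };

    bool obiPressed(ObiButton& b,
                    const std::unordered_map<std::string, bool>& visible,
                    int threshold = 5) {   // debounce against tracking jitter
        auto seen = [&](const std::string& id) {
            auto it = visible.find(id);
            return it != visible.end() && it->second;
        };
        if (seen(b.anchorMarker) && !seen(b.buttonMarker))
            b.hiddenFrames++;              // prop in view, button hidden: a press
        else
            b.hiddenFrames = 0;            // button reappeared or prop was lost
        return b.hiddenFrames >= threshold;
    }

The same visibility test, applied to rows of small markers, extends naturally to the menus, slider bars and keypads mentioned above.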

Finally, although rare, there are instances when the use of the keyboard is necessary for alphanumeric input. Note that navigation in the AR environment (for inspection or changing the view point) is simply accomplished by the motion of the user’s own body (or head) in real space. Figures 5 to 8 illustrate various examples of authoring task interactions.

When the user finishes the authoring tasks through these interfaces, the input data is interpreted according to an underlying executable model based on a structured mark-up language called TARML, so that the authored result can be executed. Thus, the authored content can also be saved in the TARML format. (As this paper focuses on the interaction design for immersive authoring and its validation, we give only a brief explanation of the details of the execution model and TARML.) TARML is very similar to VRML in terms of how it defines objects, their properties and values, and the means of connecting them for propagating events and simulating behaviors (like routes in VRML) [36]. TARML differs from VRML in two ways: (1) by providing a way to associate physical props with objects for tangible interaction, and (2) by supplying many predefined classes, such as “logical” (e.g. for various logical behavior specifications) and “sound” (e.g. for sound effects) objects, for modeling convenience. Different types of objects (e.g. physical prop, virtual, logical, and sound) are specified using “tags.” For instance, pobject tags represent physical objects (i.e. markers) and vobject tags, virtual objects. Likewise, object properties and behavioral chains can be specified using the property and link tags, respectively. Figure 4 shows an example of AR content represented in TARML. The content shows a rotating virtual fish on a physical card (prop), making bubble sounds when the card is visible.
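Since Figure 4 is not reproduced here, the following is a rough sketch of what such a TARML description might look like, reconstructed only from the tags named above (pobject, vobject, property and link, the last playing the role of a VRML-style route); the exact element and attribute names are our assumptions, not the actual TARML syntax:

    <!-- Hypothetical sketch of the Figure 4 content: a virtual fish bound
         to a physical card, spinning while tracked and triggering a bubble
         sound. All identifiers below are illustrative assumptions only. -->
    <pobject name="card" marker="patterns/card.patt"/>
    <vobject name="fish" model="models/fish.wrl" parent="card">
      <property name="spin_speed" value="30"/>
    </vobject>
    <sound name="bubble" file="sounds/bubble.wav"/>
    <!-- behavioral chain (cf. VRML routes): card visibility drives playback -->
    <link from="card.visible" to="bubble.play"/>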

For richer, more flexible and more convenient content specification, more object and behavioral models should be added to TARML in the future. Nevertheless, content specification with the current form of TARML is conceptually quite easy, especially for those with programming or engineering backgrounds.

iaTAR is implemented using a vision based 3D tracking library called ARToolKit [1] (used for calculating the 3D position and orientation of the visual markers/props) and OpenGL for the graphics. A plain USB web camera from Logitech is used to acquire video images of the real environment and for the tracking. The capture resolution is set to 320x240 and the frame rate is 30 frames per second. The camera is mounted on a head mounted display (HMD) to provide a real world view to the user, forming a video see-through AR configuration. A keyboard is used for alphanumeric input (see Figures 9 and 10). iaTAR runs on a PC with the MS Windows XP operating system, a Pentium 4 processor and 1GB of main memory. A GeForce4 3D graphics card from NVIDIA is used to accelerate the OpenGL graphics processing.
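For orientation, one frame of a typical ARToolKit 2.x marker tracking pipeline in such a video see-through configuration looks roughly as follows. This is a minimal sketch based on the public ARToolKit API, not the actual iaTAR source; error handling and the actual object rendering are omitted:

    #include <AR/ar.h>
    #include <AR/video.h>
    #include <AR/gsub.h>

    /* Sketch of one frame: grab video, draw it as the backdrop,
       detect markers, and recover the pose of the tracked prop. */
    void processFrame(int patt_id, double patt_width, double patt_center[2])
    {
        ARUint8      *dataPtr;
        ARMarkerInfo *marker_info;
        int           marker_num, i;
        double        patt_trans[3][4], gl_para[16];

        if ((dataPtr = (ARUint8 *)arVideoGetImage()) == NULL) return;
        argDrawMode2D();
        argDispImage(dataPtr, 0, 0);           /* camera image as backdrop */
        if (arDetectMarker(dataPtr, 100, &marker_info, &marker_num) < 0)
            return;                            /* 100 is a typical threshold */
        for (i = 0; i < marker_num; i++) {
            if (marker_info[i].id != patt_id) continue;
            /* pose of the marker (prop) relative to the camera */
            arGetTransMat(&marker_info[i], patt_center, patt_width, patt_trans);
            argConvGlpara(patt_trans, gl_para); /* as a 4x4 OpenGL matrix */
            /* ...load gl_para and draw the bound virtual object here... */
        }
        arVideoCapNext();                      /* start capturing the next frame */
    }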

4. User Experiment

The main purpose of this study is both to propose interaction techniques and interfaces for immersive authoring, and to demonstrate its advantages (in comparison to conventional desktop methods), for instance in terms of authoring efficiency (e.g. ease of learning and authoring time) and resulting content quality (e.g. user felt presence). However, such a validation is difficult because, in performing a comparative experiment and analysis, a representative task or a subject neutral to a particular authoring method is hard to find. Ideally, a particular approach (like immersive authoring, which has multifaceted factors) should be informally validated by “word of mouth” after it has been tested on various types of content over a long period of time. The resulting content quality is also hard to measure, because of the subjectivity of human perception. Despite these difficulties, we conducted a usability experiment comparing the proposed immersive authoring to desktop authoring, in the hopes of deriving a general principle.

In this paper, our focus is authoring efficiency, such as ease of use and authoring time (rather than content quality, which is rather subjective), in comparison to the standard desktop method. Specifically, we compared iaTAR to an XML based scripting language (which we named TARML) for authoring content with given specifications in a controlled setting. Our main hypothesis was that immersive authoring would be significantly more efficient, with less overall authoring time (particularly for tasks with spatial impact), and be more natural and easy to use compared to its desktop counterpart. Aside from the merely representational differences, existing 2D GUI tools were not directly compared to iaTAR because, as we noted previously, with a sufficient amount of training, direct editing can often be more efficient than using 2D GUI tools [33], and informal direct comparisons have partially demonstrated similar results [21]. To put it more precisely, the following are the three hypotheses we investigated in this experiment.

Hypothesis I. The immersive authoring method (i.e. iaTAR) is significantly more efficient in authoring AR content than conventional development methods based on programming languages (i.e. TARML).

Hypothesis II. Immersive authoring is significantly more efficient on spatial authoring tasks than on non-spatial tasks.

Hypothesis III. The immersive authoring method (i.e. iaTAR) is significantly easier to use and learn in comparison to conventional development methods based on programming languages (i.e. TARML).

4.1 Experimental design and task

The experiment was designed as a one factor within-subject experiment. The independent variable was the type of authoring system used (iaTAR or TARML). iaTAR represented the use of immersive authoring, and TARML represented the use of a desktop text editing based method (i.e. typing in the required scripts/constructs according to the TARML syntactic format). The major dependent variables were the task completion time and the usability survey answers. The task assigned to the subjects was to construct content that satisfied a given set of requirements using the particular authoring system. The subjects were given as much time to finish the task as needed, but were instructed to complete the task as quickly and as accurately as possible.

4.2 Experimental procedure

The experimental procedure for a participant consisted of two sessions: an hour for the training session and one and a half hours for the experimental session.

During the training session, participants learned the basic concepts of TAR content and the respective authoring methods using iaTAR and TARML. Note that the syntax of the TARML scripts (at least for the experimental task) was sufficiently simple and easy (especially for our subject group, whose participants had engineering backgrounds; see also Section 4.3). In addition to the detailed briefing, the subjects practiced with each authoring tool by constructing a sample TAR content.

In the actual experimental session (which followed the training session after a short break), the subjects had another practice trial of authoring TAR content, in order to help them recall the authoring methods after the break. After the practice trial, the subjects were asked to author six different types of TAR content according to a given specification (two examples are given in Table 4) using both authoring methods. The order of the authoring methods and the six types of content was counter-balanced across participants. The overall experimental process is shown in Table 5.

During the experimental session, to prevent any possibility of the subjects being unduly guided by the experimenters in any way, the subjects were not allowed to ask questions about how to use the authoring interfaces. Instead, the participants were allowed to freely use the user guide document for reference. Only when the subject got “lost” (e.g. not knowing what to do even after looking up the user guide) and it was determined that they had spent too much time on a particular subtask, e.g. more than one minute, was the experimenter allowed to help the subject. In most cases, users knew exactly what to do, but had forgotten the exact syntax (for TARML editing) or how to perform certain operations (for immersive authoring). One minute was deemed approximately the right amount of time, based on our prior observation, to resolve such a problem in authoring. Any extra time spent (due to help) was subtracted from the task completion time. In this way, a provision was made so that the overall performance data was not severely biased by a few such outliers (there were two users, each getting lost two times, both during TARML editing; see Section 4.4).

The requirement specification sheet was provided and explained by the experimenter at the beginning of each trial. The subject was asked to build the content described in the requirement specification sheet as quickly and accurately as possible. The subjects were allowed to refer to the requirement specification sheet and to ask the experimenter questions about it whenever needed. In addition, the experimenter also periodically read out the requirements to remind the subject what to do next. Such a procedure was needed because, when using the immersive authoring method, the subjects, wearing the head mounted display (HMD), sometimes could not read the specification text shown on the display, due to its low resolution (this problem could have easily been overcome by increasing the text size or using a higher resolution HMD).

There were six experimental trials, each with a different requirement specification sheet. In each trial, participants used both the immersive and the script editing authoring methods, building the same content twice, once with each authoring interface. Three trials included only spatial tasks; the other three trials included only non-spatial tasks. In the trials with spatial tasks, the participants were to build an AR scene (see Table 4), positioning and orienting four virtual objects, and scaling one of them. To help the subjects make use of their spatial perception, spatial properties (positions, orientations and scale factors) were described in a relative manner (not by specific values or numbers). In addition, a sample picture of the final scene was shown to help the subject understand and remember the scene to be constructed. In the trials with non-spatial tasks, the subject began with a particular pre-modeled virtual scene. Using the virtual objects in that scene, the subjects were asked to make data-flow links and to change properties to specific values. Each specification included six data-flow links and one property value to be set (see Table 4).

To measure the user performance, the task completion time was recorded for each trial and each authoring method. The number of times the participant referred to the user guide and the number of times the participant got lost with the interface were also counted. The period of time the subject was lost (looking at the user guide for more than a minute and/or not knowing what to do) was subtracted from the task completion time, as already mentioned.

Subjective measurements were also collected with questionnaires at the end of the training and experimental sessions. At the end of the training session, the participants were asked to rate how easy the given authoring method was to learn. At the end of the experimental session, they were asked to rate how easy the given authoring method was to use, and how confident they were with the content they had built using each authoring method. Ratings were given on a 7-point Likert scale (0: very difficult/unconfident, 3: neutral, 6: very easy/confident). Other subjective opinions, such as user preference and the strengths and weaknesses of the authoring methods, were also collected.

4.3 Experimental setup

The experimental environments for the script editing and immersive authoring are shown in Figure 10. A desktop computing environment (using a 2D display, keyboard and mouse) was provided for the script editing method. Because the target content to be built was AR based content, a camera was set up on a fixture stand for testing the content script. The user was allowed to change the camera location freely if needed. For the immersive authoring configuration we used the iaTAR system. The system consisted of an HMD with a camera attached, to provide video see-through functionality. Tangible props were used as authoring interfaces.

Twelve graduate/undergraduate students with engineering backgrounds participated in the experiment. The age of the participants ranged from 19 to 26, and they all had sufficient typing skill for editing the scripts (an average of 242 characters per minute). Half of the participants were taking the VR class at the computer science department; thus, they had brief (approximately 3 months of) experience in VR system development. The other half did not have any experience in VR system development, nor in 3D graphics programming (but they possessed general programming skills). Table 6 summarizes the subject statistics.

4.4 Experimental results

In order to investigate Hypotheses I and II, we conducted a series of one-way within-subjects ANOVAs comparing the users’ task performance between iaTAR and TARML. All of the tests were carried out with an alpha level of 0.05.

First, to validate the overall efficiency of immersive authoring (Hypothesis I), the total task completion times for authoring all six types of content using each method were compared. The average total authoring time spent using iaTAR was 27 minutes 43 seconds, and for TARML, 38 minutes and 37 seconds. The ANOVA revealed a statistically significant difference, with F(1,11) = 37.20 and p < 0.0001 (see Table 7). That represents a time saving of about 28% when iaTAR is used in comparison to TARML. According to this result, we can conclude that Hypothesis I is valid.

In order to assess whether iaTAR is indeed more efficient on spatial tasks (Hypothesis II), we compared the task performance according to the authoring task group: spatial and non-spatial. First, we compared the task completion time between iaTAR and TARML on the spatial tasks only. As expected, iaTAR clearly took a shorter amount of time for the spatial authoring tasks (see Table 7). The total time spent completing the spatial authoring tasks using TARML (25 minutes 12 seconds on average) was approximately twice that when using iaTAR (12 minutes 13 seconds on average), which was a significant difference under the ANOVA (F(1,11) = 99.94, p < 0.0001).

However, for the non-spatial authoring tasks, iaTAR turned out to be less efficient than TARML (see Table 7). With iaTAR, it took subjects about 13% longer (mean = 15 m 12 s) to complete the non-spatial tasks than with TARML (mean = 13 m 24 s), and the difference was statistically significant (F(1,11) = 6.20, p = 0.0301). While iaTAR took longer on the non-spatial authoring tasks, the difference from TARML was not as large as in the case of the spatial authoring tasks, where iaTAR was twice as fast as TARML, as mentioned earlier. We discuss this result further in Section 5.

We also directly compared the task performance of iaTAR to that of TARML for each task group. Figure 11 shows the total task completion times according to the task groups in a graph. The efficiency factor (E) of iaTAR over TARML for a given task was defined in the following way:

E(task) = T(iaTAR, task) / T(TARML, task),

where T(x, y) is the time spent completing task y with authoring tool x.
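As a rough sanity check using the group totals reported above (approximate only, because the means reported below average per-subject ratios rather than total times), the spatial tasks give

    E(spatial) ≈ (12 min 13 s) / (25 min 12 s) = 733 s / 1512 s ≈ 0.48,

i.e. iaTAR took about half the time of TARML on the spatial tasks.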

We applied an ANOVA with the efficiency factor of each task group as the dependent variable. As a result, the efficiency factor of iaTAR over TARML turned out to be significantly smaller, i.e. significantly more in iaTAR’s favor (F(1, 11) = 115.2647, p < 0.0001), for the spatial tasks (mean = 0.5060, SD = 0.1130) than for the non-spatial tasks (mean = 1.1764, SD = 0.2461). Hence, we conclude that Hypothesis II is valid as well.

To assess the usability (i.e. ease of use and learning, namely Hypothesis III), we compared the subjective ratings collected with the questionnaire asking how easy iaTAR or TARML was to learn and to use. Based on the results of the subjective ratings (see Table 8), no statistically significant differences were found between iaTAR and TARML (α = 0.05). For instance, strictly speaking, iaTAR was only marginally easier to learn than TARML, with a p-value of 0.067. However, according to the results of the debriefing, this appears to be mainly because of the relatively low quality of the devices and sensing accuracy (see Tables 9 and 10) rather than because of the method itself. On the other hand, while the syntax of the language might have taken the users some time to learn, the interface itself, standard 2D editing, was already quite a familiar method to most users.

To further assess Hypothesis III, we compared the number of times subjects referred to the user guide document and the number of times subjects got lost during the authoring task. Subjects referred to the user guide document 3.9 times on average (SD = 3.55) when using TARML, whereas they never needed to do so for iaTAR. Subjects with VR backgrounds used the guide only 1.8 times (SD = 1.33) on average, whereas those with no VR background used it 6 times (SD = 3.95). This demonstrates the obvious fact that there is a mental skill factor involved in using the conventional programming approach. It is desirable to eliminate such an obstacle as much as possible for non-technical content developers such as artists.

Another indication of the authoring efficiency or ease of use can be observed from the number of times the subjects got lost when using the respective authoring methods. Two users (both not from the VR class) each got lost two times when using TARML. They reported that they had gotten lost due to confusion with the TARML grammar and problems with perceiving rotations in three-dimensional space. Table 11 and Figure 12 show the average task completion time per trial according to the authoring method for each subject group. No statistically significant difference (α = 0.05) was found between the subject groups with regard to the task performance for either spatial or non-spatial authoring tasks. Thus, this result is another indication that iaTAR is easy enough to learn and use for those without background knowledge of VR technologies and 3D authoring tasks. In addition, with the non-spatial tasks, a statistically significant interaction between the subject group and the authoring method was found (F(1, 10) = 16.37, p = 0.0023). From this, we can posit that background knowledge of VR gave an advantage when using the conventional authoring method (i.e. TARML). We believe that this resulted from the subjects’ prior expertise and experience in content development, especially in dealing with 3D transformations and describing them with a computer language (such as scripts). And although it did not reach statistical significance (F(1,10) = 4.45, p = 0.0611), a similar trend was found with the spatial tasks as well.

5. Discussion

According to the results of the user experiment, we can conclude that there is significant evidence for Hypotheses I and II. That is, using the immersive authoring tool was efficient in comparison to using TARML, with approximately a 30% reduction in the time to complete the given authoring tasks. This was due to the fact that immersive authoring yielded twice the performance of the conventional authoring method for spatial tasks, while exhibiting comparable performance for non-spatial tasks. Also note that, in actual use of the conventional desktop authoring method, the user would have to change to an “execution” mode and “wear” the system (e.g. wear the HMD and the associated sensors) to test the resulting content, which is a time consuming and bothersome process. Such a factor was not even included in assessing the non-immersive authoring time. Thus, it can be predicted that the overall authoring and testing time will be much shorter with immersive authoring, even for the non-spatial tasks.

Finally, iaTAR demonstrated no marked difference in ease of learning and use in comparison with TARML editing, which employs one of the most familiar interfaces, 2D editing (a similar argument can be extended to 2D GUIs as well). Although the results of the subjective ratings demonstrated no statistical difference between the two, the number of times participants referred to the user guide document, and the number of times participants got lost, provide indications of higher usability for iaTAR even for non-logical tasks. Noting that most of the inconvenience in using iaTAR was caused by imperfect devices (e.g. wired sensors, low resolution HMD), we anticipate that the usability of iaTAR will be significantly improved by higher quality equipment.

Subject debriefing also reflected the preceding analysis results. Participants were asked to freely write down their opinions on the strengths and weaknesses of each method. The reported opinions are summarized in Tables 9 and 10. Although these informal results do not have any statistical significance, they still raise important issues, in the context of the quantitative results, to be considered when designing and evaluating immersive authoring systems. Most subjects reported that the ease of configuring objects in 3D space was the foremost merit of iaTAR, and that the ability to specify exact values was the foremost merit of TARML. The users of iaTAR complained about the standard AR related usability problems, such as the narrow field of view, degraded depth perception, the difficulty of continuous tracking and the inability to specify exact values. Consequently, the subjects expressed their preferences in a specific manner: they preferred iaTAR for spatial authoring tasks, and TARML for the non-spatial and fine-tuning tasks. At the end of the questionnaire, the participants were asked for their preferences regarding the authoring methods. All the participants preferred to have both methods available in the authoring system.

While the authoring of most TAR content will involve both spatial and non-spatial (e.g. logical/discrete) tasks, their relative proportion will be different every time, depending on the particular target content. In addition, it is often possible to metaphorically convert logical tasks into spatial tasks, which is one of the strengths and purposes of the visual programming approach. The programming by demonstration technique [19] is also a very good example. One can envision an authoring tool incorporating both immersive and desktop features, which can be used in a flexible manner depending on the nature of the target application/content. That is, if the target content is composed of many 3D spatial behaviors, the system can be used in an immersive manner (2D/logical tasks are difficult to perform in immersion). Conversely, in a 2D desktop setting, 3D tasks are performed either from a limited 3rd person viewpoint, or by intermittently donning the HMD and immersing oneself (see Figure 13). An alternative is to initially perform the design/implementation of the content, with the spatial aspects only vaguely specified using text input and a 2D GUI, and then to complete the final design and validation in the executable mode, within a larger content development process. Figure 14 illustrates the case in which different types of authoring tools are used during the development process, e.g. 2D GUI / text editing for the initial rough design and the detailed logical object behaviors, and immersive tools for the final adjustments, spatial behaviors and validation.

6. Conclusion and Future Work

In this paper, we proposed the concept of immersive authoring for TAR content. The main purpose of immersive authoring is to unify the development and executable environments and to improve the authoring efficiency. This was predicted to be the case because immersive content, like TAR content, and unlike other software, requires in-situ evaluation with respect to human usability and perception. We implemented a prototype immersive authoring system called iaTAR. We also presented a usability experiment to further support the advantages that can be gained with immersive authoring.

TAR content authoring involves a variety of types of tasks. Immersive authoring is advantageous in general, because many of the object behaviors involve 3D spatial tasks, such as scene configuration, motion specification and motion coordination. Note that with the emerging mobile mixed reality content, the importance of immersive tools will also increase. However, this will only be the case provided that 2D interaction can be accomplished efficiently in the immersive environment, which has not yet been achieved with current devices. Thus, at present, we propose that an ideal TAR authoring system should include both types of functionality, a 2D editing mode and a 3D immersive mode, each of which is selected depending on the nature of the target content and the phase of the content development process. Such a tool will be ideal for effective communication and collaboration between non-technical artists and programmers during content development.

There are still many road blocks to making immersive authoring more practical. Aside from problems with the devices (such as cost, accuracy and usability, which, however, are constantly improving), a comprehensive authoring system must be able to incorporate and reuse various types of virtual objects and behavior models. Interaction and system issues for collaborative immersive authoring are an interesting future research direction. Despite these road blocks, we believe that immersive authoring is one of the keys to making AR technology develop into a main stream of future media and human computer interfaces.

Acknowledgements

This research was in part supported by the “Teukjung Gicho” program of the Korea Science Foundation (Grant No. R01-2006-000-11142-0).

References

[1] ARToolKit, 2007. http://www.hitl.washington.edu/artoolkit.
[2] Berry, R., Hikawa, N., Makino, M., Suzuki, M. and Furuya, T., 2004. Authoring Augmented Reality: A Code-Free Approach. Proc. of ACM SIGGRAPH, August, pp. 8-12.
[3] Billinghurst, M., Kato, H. and Poupyrev, I., 2001. The MagicBook - Moving Seamlessly between Reality and Virtuality. IEEE Computer Graphics and Applications, 21(3), pp. 6-8.
[4] Bowman, D. A. and Hodges, L. F., 1995. User Interface Constraints for Immersive Virtual Environment Applications. Technical Report GIT-GVU-95-26, Graphics, Visualization, and Usability Center, Georgia Institute of Technology.
[5] Butterworth, J., Davidson, A., Hench, S. and Olano, T. M., 1992. 3DM: A Three Dimensional Modeler Using a Head-Mounted Display. Proc. of Symposium on Interactive 3D Graphics, pp. 135-138.
[6] Cheok, A., Goh, K., Liu, W., Farbiz, F., Fong, S., Teo, S., Li, Y. and Yang, X., 2004. Human Pacman: A Mobile, Wide-Area Entertainment System Based on Physical, Social, and Ubiquitous Computing. Personal and Ubiquitous Computing, 8(2).
[7] Foley, J., van Dam, A., Feiner, S. and Hughes, J., 1990. Computer Graphics: Principles and Practice, 2nd ed., Addison-Wesley, Reading, U.S.A.
[8] Greenhalgh, C., Izadi, S., Mathrick, J., Humble, J. and Taylor, I., 2004. A Toolkit to Support Rapid Construction of Ubicomp Environments. Proc. of UbiSys.
[9] Grimm, P., Haller, M., Paelke, V., Reinhold, S., Reimann, C. and Zauner, J., 2002. AMIRE - Authoring Mixed Reality. The First IEEE International Augmented Reality Toolkit Workshop.
[10] Guven, S. and Feiner, S., 2003. Authoring 3D Hypermedia for Wearable Augmented and Virtual Reality. Proc. of the Seventh IEEE International Symposium on Wearable Computers.
[11] Haller, M., Billinghurst, M. and Thomas, B., 2006. Interaction Design for Tangible Augmented Reality Applications (Chapter XIII). Emerging Technologies of Augmented Reality: Interface and Design, pp. 261-282.
[12] Hampshire, A., Seichter, H., Grasset, R. and Billinghurst, M., 2006. Augmented Reality Authoring: Generic Context from Programmer to Designer. Proc. of OZCHI, pp. 409-412.
[13] Haringer, M. and Regenbrecht, H., 2002. A Pragmatic Approach to Augmented Reality Authoring. Proc. of ISMAR, pp. 237-245.
[14] HITLabNZ Projects, 2007. www.hitlabnz.org/index.php?page=projects.
[15] Kato, H., Billinghurst, M., Poupyrev, I., Imamoto, K. and Tachibana, K., 2000. Virtual Object Manipulation on a Table-Top AR Environment. Proc. of the International Symposium on Augmented Reality, pp. 111-119.
[16] Kim, G. J., Kang, K., Kim, H. and Lee, J., 1998. Software Engineering of Virtual Worlds. Proc. of Virtual Reality Software & Technology, pp. 131-139.
[17] Kim, S. and Kim, G. J., 2004. Using Keyboards with Head Mounted Displays. Proc. of ACM SIGGRAPH International Conference on Virtual-Reality Continuum and its Applications in Industry.
[18] Ledermann, F. and Schmalstieg, D., 2005. APRIL: A High-Level Framework for Creating Augmented Reality Presentations. Proc. of IEEE Virtual Reality.
[19] Lee, G. A., Kim, G. J. and Park, C.-M., 2002. Modeling Virtual Object Behavior within Virtual Environment. Proc. of ACM Symposium on Virtual Reality Software and Technology (VRST), Hong Kong, China, Nov. 11-13, pp. 41-48.
[20] Lee, G. A., Billinghurst, M. and Kim, G. J., 2004. Occlusion Based Interaction Methods for Tangible Augmented Reality Environments. Proc. of ACM SIGGRAPH International Conference on Virtual-Reality Continuum and its Applications in Industry.
[21] Lee, G. A., Nelles, C., Billinghurst, M. and Kim, G. J., 2005. Immersive Authoring of Tangible Augmented Reality Applications. Proc. of IEEE and ACM International Symposium on Mixed and Augmented Reality, pp. 172-181.
[22] Lee, G. A., Kim, G. J. and Billinghurst, M., 2005. Immersive Authoring: What You eXperience Is What You Get. Communications of the ACM, 48(7), pp. 76-81.
[23] Liang, J. and Green, M., 1994. JDCAD: A Highly Interactive 3D Modeling System. Computers & Graphics, 18(4), pp. 499-506.
[24] MacIntyre, B., Gandy, M., Dow, S. and Bolter, J., 2004. DART: A Toolkit for Rapid Design Exploration of Augmented Reality Experiences. Proc. of the 2004 ACM Symposium on User Interface Software and Technology, pp. 197-206.
[25] Mine, M. R., 1995. ISAAC: A Virtual Environment Tool for the Interactive Construction of Virtual Worlds. Technical Report CS TR95-020, Department of Computer Science, UNC, Chapel Hill.
[26] Open Computer Vision Library (OpenCV), 2006. http://sourceforge.net/projects/opencvlibrary/.
[27] Pausch, R., Burnette, T., Capehart, A. C., Conway, M., Cosgrove, D., DeLine, R., Durbin, J., Gossweiler, R., Koga, S. and White, J., 1995. Alice: Rapid Prototyping for Virtual Reality. IEEE Computer Graphics and Applications, 15(3), pp. 8-11.
[28] Piekarski, W. and Thomas, B. H., 2004. Augmented Reality Working Planes: A Foundation for Action and Construction at a Distance. Proc. of Intl. Symposium on Mixed and Augmented Reality.
[29] Poupyrev, I., Tan, D. S., Billinghurst, M., Kato, H., Regenbrecht, H. and Tetsutani, N., 2001. Tiles: A Mixed Reality Authoring Interface. Proc. of INTERACT, pp. 334-341.
[30] Rizzo, A., Buckwalter, J., Bowerly, T., Yeh, S., Hwang, J., Thiebaux, M. and Kim, G. J., 2004. Virtual Reality Applications for Assessment and Rehabilitation of Cognitive and Motor Processes. 15th Congress of the International Society of Electrophysiology & Kinesiology.
[31] Schmalstieg, D., Fuhrmann, A., Hesina, G., Szalavari, Z., Encarnacao, L. M., Gervautz, M. and Purgathofer, W., 2002. The Studierstube Augmented Reality Project. Presence: Teleoperators and Virtual Environments, 11(1), pp. 33-54.
[32] Seo, J., Kim, N. and Kim, G. J., 2006. Designing Interactions for Augmented Reality Based Educational Contents. Proc. of Edutainment, LNCS 3942, pp. 1187-1196.
[33] Shneiderman, B. and Plaisant, C., 2005. Designing the User Interface, 4th ed., Addison-Wesley.
[34] Steed, A. and Slater, M., 1996. A Dataflow Representation for Defining Behaviours within Virtual Environments. Proc. of Virtual Reality Annual International Symposium, pp. 163-167.
[35] Stiles, R. and Pontecorvo, M., 1992. Lingua Graphica: A Visual Language for Virtual Environments. IEEE Workshop on Visual Languages, pp. 225-227.
[36] VRML Specification, 1997. http://www.web3d.org/vrml/vrml.htm.
[37] Wesche, G. and Seidel, H., 2001. FreeDrawer - A Free-Form Sketching System on the Responsive Workbench. Proc. of Virtual Reality Software & Technology, pp. 167-174.
[38] Vitzthum, A., 2006. SSIML/AR: A Visual Language for the Abstract Specification of AR User Interfaces. Proc. of 3DUI, pp. 135-142.

List of Tables

Table 1: Possible subtasks for immersive authoring for/with TAR.
Table 2: Matching the authoring subtasks to four different styles of authoring interfaces: using specialized tools, programming or text input, 2D GUI, and 3D immersive interfaces.
Table 3: Interfaces for major tasks in iaTAR.
Table 4: Two examples of requirement specifications, one spatial and the other non-spatial (logical), given to the subject in the experiment.
Table 5: Experimental procedure.
Table 6: Average values for various features of the subject pool.
Table 7: Task completion time.
Table 8: Subjective rating results.
Table 9: De-briefing results: Strengths and weaknesses of each method.
Table 10: De-briefing results: Inconvenience of each method.
Table 11: The average task completion time per trial according to the authoring tools for each subject group.

Table 1: Possible subtasks for immersive authoring for/with TAR.

Objects Specification
  - Form specification: Geometric modeling (not covered in this paper); (Sub) Object placement / rotation (Reconfiguration); Shape modification (e.g. Scaling); Discrete attribute setting
  - Function specification: Scripting / Programming; (Self) Motion specification (e.g. Animation)
Behavior Specification / User Interaction
  - Scripting / Programming
  - Model based specification (e.g. Event-Action rules)
  - (Gross) Motion specification
  - Behavioral coordination (e.g. Synchronization of concurrent behaviors)
  - Object to object linking (Routes, Data flow)
  - 2D Interaction specification
  - 3D Interaction specification
Object-Prop Binding
  - Associating a virtual object with a specific prop or real object
Scene Specification
  - Object placement / deletion
  - Object hierarchy specification
  - Scene-wide settings (Sound effects, Lighting effects, Camera placements)
Content Review
  - Deployment and testing individual objects
  - Running and testing entire content
Miscellaneous
  - Version Management (Save, Replay)
  - Information Browsing / Note Taking
  - Usability / Presence assessment

Table 2: Matching the authoring subtasks to four different styles of authoring interfaces: using specialized tools, programming or text input, 2D GUI, and 3D immersive interfaces. ◎: most appropriate, ○: can be easily done, △: difficult, X: almost impossible.

| Category | Tasks and Subtasks | Specialized tool (e.g. geometric modeler) | Programming / text / script input | 2D GUI / menu / drag-drop | 3D / immersive |
| Form specification | Geometric modeling | | X | | |
| | (Sub) Object reconfiguration | X | | | |
| | Shape modification | X | X | | |
| | Discrete attribute setting | X | | | |
| Function specification | Scripting / Programming | | | | |
| | (Self) Motion specification (Animation) | | X | | |
| Behavior specification / User interaction | Scripting / Programming | X | | | |
| | Model based specification | X | | | |
| | (Gross) Motion specification | X | | | |
| | Behavioral coordination | X | | | |
| | Object to object linking | X | | | |
| | 2D Interaction specification | X | | | |
| | 3D Interaction specification | X | | | |
| Object-Prop binding | Binding a virtual object to a specific prop or real object | X | | | |
| Scene specification | Object placement / deletion | X | X | | |
| | Object hierarchy specification | X | | | |
| | Scene-wide settings | X | | | |
| Content review | Deployment and testing individual objects | X | | | |
| | Running / testing entire content | X | | | |
| Miscellaneous | Version management | X | X | | |
| | Info. browsing / Note taking | X | X | | |
| | Usability / Presence assessment | X | X | | |

Table 3: Interfaces for major tasks in iaTAR.

| Category | Tasks and Subtasks | Interaction / Interface |
| Objects / Scene Specification | Object placement / rotation | Direct 3D manipulation (Props) |
| | Shape modification (e.g. Scaling) | Prop based 2D GUI / Direct 3D manipulation (Props) |
| | Discrete attribute setting | Prop based 2D GUI |
| | Scripting / Programming | Real keyboard |
| | Motion specification | Direct 3D manipulation (Props) / PBD |
| | Object/Prop Binding/Linking | Prop based 2D GUI / Direct 3D manipulation (Props) |
| | Timing Coordination | Prop based 2D GUI / Direct 3D manipulation (Props) |
| Miscellaneous | Deployment / Testing | Prop based 2D GUI |
| | Version management / System control | Prop based 2D GUI |
| | Navigation / Review | Natural body motion |
| | Information browsing / Note taking | Real keyboard / Real or virtual terminal / Voice recording |

Table 4: Two examples of requirement specifications, one spatial and the other non-spatial (logical), given to the subject in the experiment.

Spatial tasks:
0. Use page 1 to construct this scene.
1. Place the terrain ("underhill.obj").
   - The terrain must be parallel to page 1.
   - The hill top must be on the right-hand side.
   - Scale it up to fit the size of page 1.
2. Place the tree ("tree.obj").
   - Place it on the hill top.
   - Make sure the tree is neither floating over nor buried under the ground.
   - The flat side of the tree must face the road on the hill.
3. Place the rabbit ("rabbit.obj") and the turtle ("turtle.obj").
   - Both of them are at the left front end of the road, under the hill.
   - The rabbit is on the left of the turtle.
   - Both of them are facing the front while slightly turned to face each other.

Non-spatial (Logical) tasks:
0. The following scene is given.
   - There are two marker cards.
   - Two models of a rabbit in different poses are placed on the first card, and likewise two models of a turtle on the other.
   - A logical object with the function of checking distance is provided.
1. Connect the 'position' properties of both cards to the 'input position 1' and 'input position 2' properties of the logical object.
2. Connect the 'far' property of the logical object to the 'visible' property of the rabbit standing upright.
3. Do the same to the turtle standing upright.
4. Connect the 'near' property of the logical object to the 'visible' property of the rabbit with its hands up.
5. Do the same to the turtle with its hands up.
6. Change the value of the 'threshold' property of the logical object to 100.
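
For readers who prefer code to prose, the sketch below restates the non-spatial task above as a small property dataflow (in Python; the update loop, the property naming scheme, and the unit handling are our own simplifying assumptions, not the actual iaTAR runtime):

```python
# A minimal sketch of the non-spatial task's dataflow: two card positions
# feed a distance-checking logical object whose 'near'/'far' outputs drive
# the visibility of alternate rabbit/turtle models. The update function is
# a hypothetical simplification, not the actual iaTAR runtime.

import math

properties = {
    "card1.position": (0.0, 0.0, 0.0),   # meters (assumed)
    "card2.position": (0.5, 0.0, 0.0),
    "checker.threshold": 100.0,          # step 6 of the task (mm, assumed)
    "rabbit_upright.visible": False,
    "rabbit_hands_up.visible": False,
    "turtle_upright.visible": False,
    "turtle_hands_up.visible": False,
}

# Routes from steps 1-5: (source property -> destination property).
routes = [
    ("checker.far", "rabbit_upright.visible"),
    ("checker.far", "turtle_upright.visible"),
    ("checker.near", "rabbit_hands_up.visible"),
    ("checker.near", "turtle_hands_up.visible"),
]


def update(props):
    """One dataflow step: evaluate the checker, then propagate routes."""
    d = math.dist(props["card1.position"], props["card2.position"])
    near = d * 1000.0 < props["checker.threshold"]
    props["checker.near"], props["checker.far"] = near, not near
    for src, dst in routes:
        props[dst] = props[src]


update(properties)
print({k: v for k, v in properties.items() if k.endswith("visible")})
```

Running this with the cards half a meter apart marks both "upright" models visible and both "hands up" models hidden; moving the cards within the threshold would flip all four, which is exactly the behavior the subjects were asked to wire up.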

Table 5: Experimental procedure.

| Sessions | Detail tasks | Time |
| Training (1 hour) | Subject feature survey questionnaire | 5 min. |
| | Instructions: Overview of Tangible AR content | 10 min. |
| | Instructions: Tutorial on TARML | 5 min. |
| | Practice: TARML | 15 min. |
| | Instructions: Tutorial on iaTAR | 5 min. |
| | Practice: iaTAR | 15 min. |
| | Post-training survey questionnaire | 5 min. |
| Break | | 5 min. |
| Experiment (1.5 hours) | A practice trial for both iaTAR and TARML with a particular content specification | Approx. 90 min. (in total) |
| | 6 trials for both iaTAR and TARML with particular content specifications | |
| | Post-experiment survey questionnaire | |

Table 6: Average values for various features of the subject pool.

| Feature | All | With VR background | With no VR background | ANOVA |
| Number | 12 | 6 | 6 | - |
| Gender | All male | All male | All male | - |
| Age (years) | 22.2 (SD=2.04) | 22.5 (SD=1.76) | 21.83 (SD=2.40) | F(1,10) = 0.30, p = 0.5954 |
| Experience with C/C++/JAVA (years) | 2.8 (SD=1.80) | 4.3 (SD=1.03) | 1.25 (SD=0.61) | F(1,10) = 39.57, p < 0.0001 |
| Experience with HTML (years) | 2.2 (SD=2.04) | 3.0 (SD=2.45) | 1.3 (SD=1.21) | F(1,10) = 2.23, p = 0.1660 |
| Typing skill (chars/min) | 242.4 (SD=78.89) | 268.7 (SD=84.19) | 216.2 (SD=70.35) | F(1,10) = 1.37, p = 0.2683 |

Table 7: Task completion time.

| | iaTAR | TARML | ANOVA |
| Total task | 27m 43s (SD = 5m 10s) | 38m 37s (SD = 7m 44s) | F(1,11) = 37.20, p < 0.0001 |
| For spatial tasks only | 12m 31s (SD = 2m 50s) | 25m 12s (SD = 4m 47s) | F(1,11) = 99.94, p < 0.0001 |
| For non-spatial tasks only | 15m 12s (SD = 2m 37s) | 13m 24s (SD = 3m 28s) | F(1,11) = 6.20, p = 0.0301 |

Table 8: Subjective rating results. Ratings on 7-point Likert scales (0: difficult or not confident, 3: neutral, 6: easy or confident).

| | iaTAR | TARML | ANOVA |
| Ease of learning | 4.50 (SD = 1.168) | 3.92 (SD = 1.240) | F(1,11) = 4.11, p = 0.0674 |
| Ease of use | 4.08 (SD = 0.996) | 4.25 (SD = 1.138) | F(1,11) = 0.17, p = 0.6887 |
| Confidence with authoring results | 3.92 (SD = 1.165) | 4.50 (SD = 0.905) | F(1,11) = 3.01, p = 0.1106 |

Table 9: De-briefing results: Strengths and weaknesses of each method.

| Questions | Answers | No. of participants |
| Strengths of iaTAR | Easy when placing virtual objects | 7 |
| | Easy to learn | 5 |
| | Can check the results instantly (concurrently) | 3 |
| Weaknesses of iaTAR | Coarse object positioning (fine positioning not available) | 7 |
| | Tracking lost | 3 |
| | Narrow field of view | 2 |
| | Inconvenience in changing property values | 2 |
| | Undo function unavailable | 1 |
| Strengths of TARML | Easy to assign exact values to the property | 6 |
| | Fast when given exact specification values | 4 |
| | Copy and paste function available | 1 |
| | Easy syntax | 1 |
| Weaknesses of TARML | Need to learn the grammar (not as easy to learn) | 6 |
| | Repeated trial-and-error when placing objects | 5 |
| | Need to reload the script after editing | 2 |
| | Hard to perceive rotations in three-dimensional space | 1 |

Table 10: De-briefing results: Inconvenience of each method.

| Questions | Answers | No. of participants |
| Inconvenience with iaTAR | Narrow field of view | 10 |
| | Unclear focus with HMD | 4 |
| | Depth perception problem | 1 |
| Inconvenience with TARML | Repeated trial-and-error | 6 |
| | Three-dimensional space perception | 1 |

Table 11: The average task completion time per trial according to the authoring tools for each subject group.

Non-spatial tasks:
| Subject group | VR class | Non-VR class |
| iaTAR | 5m 8s (SD=1m 25s) | 5m 0s (SD=1m 21s) |
| TARML | 3m 55s (SD=1m 22s) | 5m 1s (SD=1m 34s) |

Spatial tasks:
| Subject group | VR class | Non-VR class |
| iaTAR | 4m 12s (SD=1m 1s) | 4m 9s (SD=1m 27s) |
| TARML | 7m 39s (SD=2m 3s) | 9m 9s (SD=1m 41s) |

List of Figures

Figure 1: A typical desktop non-immersive authoring tool for augmented reality [8].
Figure 2: A Tangible AR interactive story book built with iaTAR.
Figure 3: Four examples of AR content with different task requirements: (a) circulation of water and the making of rain [32], which requires extensive immersive 3D interaction design (top row); (b) a MagicBook type [14] of application that only needs association of markers with pre-built graphical objects/behaviors (bottom left); and (c) mobile AR content, the Human Pac-man [6], which is very difficult to implement using mere programming or desktop tools (bottom right).
Figure 4: An example of AR content represented in TARML. The content shows a rotating virtual fish on a physical card (prop), making bubble sounds when the card is visible.
Figure 5: Browsing through available objects and selecting one, using props.
Figure 6: Using the inspector pad to browse through the object attributes and their values.
Figure 7: Recording the motion profile of two objects using two hands.
Figure 8: Representing routes between object attributes; the visibility of the virtual fish is connected to the visibility of a single marker, showing that the value is being updated according to the value of the connected attribute.
Figure 9: Using the real keyboard and text overlay in iaTAR.
Figure 10: Experimental environments for script editing (left) and immersive authoring (right).
Figure 11: Total task completion time (minutes:seconds).
Figure 12: Average task completion time between participant groups (minutes:seconds).
Figure 13: An ideal authoring tool for immersive content. Depending on the object types and behaviors prevalent in the content, the user might choose to use the 2D or 3D immersive mode. Spatially dynamic content will be more efficiently authored using the 3D immersive mode and vice versa.
Figure 14: Using different tools during the development process, e.g. using 2D GUI / editing for initial rough design and detailed logical object behaviors, and using immersive tools for final adjustment, spatial behaviors, and validation.

Figure 1: A typical desktop non-immersive authoring tool for augmented reality [8].

Figure 2: A Tangible AR interactive story book built with iaTAR.

Figure 3: Four examples of AR content with different task requirements: (a) circulation of water and the making of rain [32], which requires extensive immersive 3D interaction design (top row); (b) a MagicBook type [14] of application that only needs association of markers with pre-built graphical objects/behaviors (bottom left); and (c) mobile AR content, the Human Pac-man [6], which is very difficult to implement using mere programming or desktop tools (bottom right). (If the article is accepted, we will either get permission to print these referred figures or replace them with our own.)

Figure 4: An example of AR content represented in TARML. The content shows a rotating virtual fish on a physical card (prop), making bubble sounds when the card is visible.
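
Since the TARML listing itself appears only in the figure, the sketch below approximates the structure of this content as plain Python data rather than actual TARML syntax; the keys and file names are illustrative assumptions, not the markup defined in the paper:

```python
# A rough, non-TARML approximation of the Figure 4 content: a virtual fish
# bound to a physical card, rotating continuously, with a bubble sound
# played while the card is visible. All keys and file names are illustrative.

content = {
    "props": [
        {"name": "card1", "marker": "card1.patt"},
    ],
    "objects": [
        {
            "name": "fish",
            "model": "fish.obj",
            "bound_to": "card1",                     # object-prop binding
            "behaviors": [
                {"type": "spin", "axis": (0, 1, 0), "deg_per_sec": 90},
            ],
        },
        {
            "name": "bubble_sound",
            "sound": "bubble.wav",
        },
    ],
    # Route: the card's visibility drives the sound's 'playing' property.
    "routes": [
        ("card1.visible", "bubble_sound.playing"),
    ],
}

# A declarative loader would walk this structure to instantiate the scene.
for src, dst in content["routes"]:
    print(f"route: {src} -> {dst}")
```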

Figure 5: Browsing through available objects and selecting one, using props.

Figure 6: Using the inspector pad to browse through the object attributes and their values.

Figure 7: Recording the motion profile of two objects using two hands.
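
The motion-recording interaction of Figure 7 is essentially programming by demonstration (PBD, cf. Table 3); a minimal sketch of the record-and-replay idea follows (in Python; the function names and sampling details are illustrative assumptions, not iaTAR's implementation):

```python
# A minimal sketch of motion recording by demonstration: while the author
# moves a prop, timestamped poses are captured and can later be replayed
# as the object's animation. All names are illustrative.

import time


def record_motion(get_prop_pose, duration_s=3.0, rate_hz=30.0):
    """Sample the tracked prop pose at a fixed rate for duration_s seconds."""
    profile, start = [], time.time()
    while (now := time.time() - start) < duration_s:
        profile.append((now, get_prop_pose()))
        time.sleep(1.0 / rate_hz)
    return profile


def replay(profile, set_object_pose):
    """Play the recorded profile back onto a virtual object."""
    start = time.time()
    for t, pose in profile:
        while time.time() - start < t:
            time.sleep(0.001)
        set_object_pose(pose)


# Example with a stand-in tracker that just reports a fixed pose.
motion = record_motion(lambda: (0.0, 0.1, 0.3), duration_s=0.1)
replay(motion, lambda pose: print("pose:", pose))
```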

Figure 8: Representing routes between object attributes; the visibility of the virtual fish is connected to the visibility of a single marker, demonstrating that the value is being updated according to the value of the connected attribute.

Figure 9: Using the real keyboard and text overlay in iaTAR.

Figure 10: Experimental environments for script editing (left) and immersive authoring (right).

Figure 11: Total task completion time (minutes:seconds).

Figure 12: Average task completion time between participant groups (minutes:seconds).

Figure 13: An ideal authoring tool for immersive content. Depending on the object types and behaviors prevalent in the content, the user might choose to use the 2D or 3D immersive mode. Spatially dynamic content will be more efficiently authored using the 3D immersive mode and vice versa.

Figure 14: Using different tools during the development process, e.g. using 2D GUI / editing for initial rough design and detailed logical object behaviors, and using immersive tools for final adjustment, spatial behaviors, and validation.