ZASH: A Browsing System for Multi-Dimensional Data

Emiko Orimo ∗
Department of Electronic Engineering
University of Electro-Communications
Chofu, Tokyo 182-8585, Japan
[email protected]

Hideki Koike
Graduate School of Information Systems
University of Electro-Communications
Chofu, Tokyo 182-8585, Japan
[email protected]

∗ Current affiliation: DNP DIGITALCOM, NT building, 387-4, Haramachi, Shinjuku, Tokyo 162-0053, Japan, Email: [email protected]

Abstract This paper describes a browsing system for a movie database. The system, named ZASH, was designed and developed to explore the following features: (1) the use of multiple 2D planes to separate different types of information; (2) the use of 3D space to improve the visibility of links between nodes; (3) the use of multi-dimensional scaling to lay out movies and commentators so that similar data are placed physically near each other; (4) the use of graphical fisheye views to improve the visibility of the user’s focus and its neighbors. The system also provides traditional searching methods (e.g. by keyword, director, or production year). By integrating browsing techniques and searching techniques, the system gives users more chances to find movies.

1 Introduction

With recent advances in computer systems, the amount of information that people can obtain keeps increasing. At the same time, it is becoming harder to find the information that people really need within such a huge amount of information. Information retrieval technologies are used to solve this problem, and they are roughly divided into two categories: searching and browsing.

Searching often requires users to input keywords that typically represent the target information. Keyword searching is the most popular method in recent information retrieval systems. It is, however, difficult to find appropriate information when users are not sure what they really want to find. Moreover, the search result is often presented as text, which makes it difficult to understand the relations among the data.

Browsing, on the other hand, is a method of finding information by observing a set of information. Visualization is often used in browsing systems. By visualizing each data item as well as the relations between data items, the system helps users to understand the information in the database. As described in the next section, however, the browsing systems developed to date have some problems.

This paper describes a browsing system for a movie database. The system, named ZASH, was designed and developed to explore new browsing techniques as well as to integrate advantages of other browsing systems.

The paper is organized as follows. Section 2 surveys related work and discusses its problems. The design features of ZASH are then described in Section 3. Section 4 gives an overview of ZASH, and Section 5 demonstrates how ZASH is used to retrieve data. Section 6 discusses advantages and disadvantages of ZASH. Finally, Section 7 concludes the paper.

2 Related Work

DocSpace[8] is a visual information retrieval system for documents. In DocSpace, documents and keywords are displayed as nodes, and relations between documents and keywords are displayed as links. DocSpace introduces a dynamic arrangement of nodes: when a keyword or a document is dragged, other nodes follow the movement according to their relevance. However, since keywords and documents are displayed on the same 2D plane, the visualization becomes very complicated.

Cinema Scape[9] is an information retrieval system for movie data. The system displays movie titles as nodes in one 2D plane and commentators in another 2D plane. In each plane, the movie titles and the commentators are laid out by multi-dimensional scaling so that similar data are placed physically near each other. It is, however, hard to understand the relations between the movie titles and the commentators since users can see only one plane at a time. The system also does not provide other ways to access the data, such as by production year or director.

SemNet[3] is a pioneering 3D visualization work that explored multi-dimensional scaling to lay out many nodes automatically. However, it was reported that, since a huge number of nodes were laid out in a 3D space, the visualization was very complicated and it was therefore hard to find the target.

Starlight[6] is a visualization system for multimedia intelligence data. The system displays several 2D planes in a 3D space, and each 2D plane shows a different aspect of the data. By connecting the data on the different planes, the system helps users to understand the relations between these data.

In the Information Visualizer project[2], many visualization systems were developed. Each system focused on a specific kind of information (e.g. hierarchical data, temporal data, etc.) and succeeded in showing its effectiveness. However, a system for multi-dimensional data such as the data we use has not been reported.

3 General Approach

Based on the observations in the previous section, we designed and developed a browsing system for movie data. The system, named ZASH, provides many ways to retrieve movie data. In particular, ZASH focuses on the following visualization techniques.

3.1 Multiple planes

As seen in DocSpace[8] and SemNet[3], if different kinds of information are displayed on the same 2D plane (or in the same 3D space), it is difficult to identify each node. In particular, movie data such as ours have many properties (e.g. title, keyword, director, actor, production year, etc.). The visualization would clearly become very complicated if all of this information were displayed in the same plane. To overcome this problem, ZASH uses multiple planes. Data in the same category are displayed on the same plane, and relations between the data are displayed as links. In Figure 1(a), movie titles, keywords, and commentators are laid out on the same plane. In Figure 1(b), on the other hand, the nodes are classified by category and laid out on separate planes. This avoids mixing different types of nodes and helps to reduce the complexity of the display.

Figure 1. The use of multiple 2D planes.

3.2 3D visualization

Even if multiple planes are used, the visibility of links is still low. ZASH uses a 3D space to reduce this problem. In Figure 2(a), a top node and a bottom node are connected by a link. However, users might think that the center node is connected with the top node. If the two planes are presented in 3D space as shown in Figure 2(b), on the other hand, it is clear that the center node is not connected with the top node.

Figure 2. The use of 3D space improves the visibility of links.

3.3 Multi-dimensional scaling

Multi-dimensional scaling (MDS) is a method for clarifying complex phenomena by analyzing mutual relations between categories. It is widely used in various fields. ZASH uses “quantification theory type III,” a kind of MDS, to lay out nodes. Quantification theory type III makes it possible to calculate geometrical positions so that similar data are placed physically near each other. This method is used to see the mutual relations of categories and to discover new factors.

For example, Table 1 shows a relation between six movie titles and three keywords. In Table 1, “1” means the title has the keyword and “0” means it does not. This table is the input to the calculation of quantification theory type III. The calculation converts categories and samples into category scores and sample scores, and the position of each node is decided using these scores. Figure 3 shows an example layout of the titles using this calculation.

Table 1. A relation between titles and keywords.

Title            Drama  Action  Horror
Reality Bites      1      0       0
Mars Attacks!      0      1       1
The Godfather      1      0       0
Star Wars          1      1       0
Spartan X          0      1       0
Aliens             0      1       1
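The score calculation on Table 1 can be illustrated with a small sketch. It is not the code used in ZASH. It assumes that quantification theory type III can be treated as equivalent to correspondence analysis, and it uses reciprocal averaging, a simple iterative scheme for the first axis only (a 2D layout would need a second axis as well). The function name, the starting scores, and the iteration count are choices made for this illustration.

```python
# Rows are titles and columns are keywords, copied from Table 1.
KEYWORDS = ["Drama", "Action", "Horror"]
TABLE = {
    "Reality Bites": [1, 0, 0],
    "Mars Attacks!": [0, 1, 1],
    "The Godfather": [1, 0, 0],
    "Star Wars":     [1, 1, 0],
    "Spartan X":     [0, 1, 0],
    "Aliens":        [0, 1, 1],
}


def first_axis_scores(table, n_keywords, iterations=100):
    """Return (sample_scores, category_scores) for the first axis.

    Reciprocal averaging: a title's score is the mean score of its
    keywords, and a keyword's score is the mean score of its titles.
    """
    titles = list(table)
    col_sums = [sum(table[t][j] for t in titles) for j in range(n_keywords)]
    total = sum(col_sums)
    # Arbitrary non-constant starting scores (an assumption of this sketch).
    cat = [float(j) for j in range(n_keywords)]
    sample = {}
    for _ in range(iterations):
        # Center and rescale so the scores do not collapse to a constant.
        mean = sum(c * s for c, s in zip(col_sums, cat)) / total
        cat = [s - mean for s in cat]
        norm = (sum(c * s * s for c, s in zip(col_sums, cat)) / total) ** 0.5
        cat = [s / norm for s in cat]
        # Sample score: average category score of the keywords the title has.
        sample = {
            t: sum(x * s for x, s in zip(table[t], cat)) / sum(table[t])
            for t in titles
        }
        # Category score: average sample score of the titles that have it.
        cat = [
            sum(table[t][j] * sample[t] for t in titles) / col_sums[j]
            for j in range(n_keywords)
        ]
    return sample, cat


sample_scores, category_scores = first_axis_scores(TABLE, len(KEYWORDS))
for title, score in sorted(sample_scores.items(), key=lambda kv: kv[1]):
    print(f"{title:15s} {score:+.3f}")
```

Titles with identical keyword rows, such as “Mars Attacks!” and “Aliens,” receive the same score, and the drama-only titles end up at the opposite end of the axis from the action and horror titles. The sign of the axis is arbitrary.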

Figure 3. Titles are laid out by using MDS.

3.4 Fisheye views

To reduce the complexity of the visualization, a fisheye view is introduced in ZASH. To calculate the degree of interest (DOI) of each node, we used FractalView[5], which is a variation of Furnas’s fisheye views[4]. The DOI is used to calculate the size of each node. Consider, for example, that a user’s focus is on a title node as shown in Figure 4. This title node is connected to three keywords and two commentators. Therefore, the DOIs of these nodes are calculated as 5. The details of the algorithm are described in [5].

Figure 4. Calculation of DOI using FractalViews.

4 ZASH

Figure 5 shows an overview of ZASH. Each data item is classified by its category (title, keyword, commentator, and so on) and is laid out on one of five planes in 3D space. Each plane, called a grid, has its own function and supports users’ retrieval.

Currently, ZASH uses 100 titles, 20 keywords, 20 commentators, 85 directors, 158 actors, and the production year of each title.

Figure 5. Overview of ZASH.

Center grid The center grid, which is indicated as A in Figure 5, contains title nodes and is the main grid of the system. The position of each node is calculated by MDS. As a result, similar movies are placed physically near each other. When a user selects a title node, the keywords corresponding to the title are indicated by displaying links between the title and the keywords on the left grid. In the same way, links between the title and commentators are displayed.

Left grid Keyword nodes are laid out on the left grid (B in Figure 5) in alphabetical order. Using this grid, a user can find movies which have a certain keyword. When the user selects a keyword, the movies which have this keyword are indicated by showing links between the keyword and the titles.

Lower grid On the lower grid (C in Figure 5), the title nodes are laid out in alphabetical order along the vertical axis and in chronological order along the horizontal axis.

Upper grid On the upper grid (D in Figure 5), commentator nodes are laid out by using MDS. This means that commentators who have similar likes and dislikes are placed physically near each other.

Right grid The right grid (E in Figure 5) has two categories of nodes: director nodes and actor nodes. The director nodes and the actor nodes are laid out in alphabetical order in the upper half and the lower half of the grid, respectively. If the name of a director or an actor of the movie is already known, a user can start retrieval by selecting the corresponding node.

5 Example Retrieval

5.1 Retrieval by similarity

Geometrical closeness As described previously, title nodes on the center grid and commentator nodes on the upper grid are laid out by using MDS. Therefore, similar nodes are placed physically near each other. When a user has a favorite movie, he/she can retrieve similar movies by looking around its neighbors. In Figure 6(a), the user’s focus is on “Poltergeist.” The user can find “The Omen” (Figure 6(b)) and “Friday the 13th” (Figure 6(c)) near “Poltergeist.” This shows that horror movies can be found in this area.

Figure 6. The similar movies are placed physically near each other by using MDS.

Common keywords When a user can name a keyword, he/she can use the keyword for retrieval. For example, the movie “Hamburger Hill” has four keywords: “War,” “Action,” “Drama,” and “Classic.” In Figure 7(a), the user’s focus is on “War” on the left grid, and two movie titles, “Hamburger Hill” and “Born on the Fourth of July,” are displayed in a larger size on the center grid. Then the user changes his/her focus to “Born on the Fourth of July” and that node is displayed in a larger size (Figure 7(b)).

Figure 7. A common keyword is used to retrieve the similar movies.

5.2 Retrieval by commentators’ recommendation

Reliable commentator If a user has a commentator whom he/she considers reliable, he/she can select the commentator node on the upper grid. Then, the movies recommended by the commentator become larger on the center grid (Figure 8(a)).

Figure 8. Finding reliable commentators’ recommendations.

Similar commentator Commentator nodes are also laid out by using MDS. A user can find other commentators whose likes and dislikes are similar to those of the user’s reliable commentator by looking at its neighbors. For example, if the user’s reliable commentator recommends too many movies (Figure 9(a)), the user can change his/her focus to another commentator close to the reliable one.

5.3 Fisheye view

ZASH uses the graphical fisheye views algorithm[7] to enhance the visibility of title nodes. In Figure 10(a), there are too many nodes around the focus to identify each title. In this case, a user can switch the display to fisheye view mode. The nodes change their size based on their DOI values and move away from the focus with smooth animation (Figure 10(b)).
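The way nodes move away from the focus can be sketched with the Cartesian position transformation commonly associated with graphical fisheye views[7], G(x) = (d + 1)x / (dx + 1), where x is a normalized distance from the focus and d is a distortion factor. This is only an illustration under assumptions: the unit-square grid, the default d = 3, and the function names are choices made here, and the size scaling by DOI and the animation used in ZASH are not modeled.

```python
def distort(x, d):
    """Magnification function G(x) = (d + 1) x / (d x + 1) for x in [0, 1]."""
    return (d + 1) * x / (d * x + 1)


def fisheye_coordinate(p, focus, d, lo=0.0, hi=1.0):
    """Move one coordinate p away from the focus while staying in [lo, hi]."""
    if p == focus:
        return p
    boundary = hi if p > focus else lo
    d_max = boundary - focus   # signed distance from the focus to the boundary
    d_norm = p - focus         # signed distance from the focus to the point
    return focus + distort(d_norm / d_max, d) * d_max


def fisheye_point(point, focus, d=3.0):
    """Apply the transformation independently to each axis of a 2D point."""
    return tuple(fisheye_coordinate(p, f, d) for p, f in zip(point, focus))


focus = (0.5, 0.5)
for node in [(0.5, 0.5), (0.6, 0.5), (0.9, 0.5), (1.0, 0.0)]:
    print(node, "->", fisheye_point(node, focus))
```

Nodes close to the focus are pushed outward the most, which frees space around the focus, while nodes on the grid boundary stay where they are.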

Figure 9. Seeing the recommendation by another commentator who is near the user’s reliable one.

Figure 10. Using graphical fisheye view to improve the visibility of the focus and its neighbors.

5.4 Additional retrieval

Although the main focus of ZASH is to support browsing, it is useful to provide traditional searching capabilities as well.

Title and year If a user already knows the first letter of a title and the approximate production year, he/she may use the lower grid for retrieval. For example, if the first letter of the title is “B” and the movie was produced in the last few years, the user can select one of the nodes in the area shown in Figure 11(a). Then, the node representing “BEAN” becomes larger on the center grid and is connected with the selected node on the lower grid.

Figure 11. Searching by title and production year.

Keyword If a user already knows a keyword of a movie, the user can start retrieval from the left grid. In Figure 12(a), the keyword “Music” is selected and the titles which have this keyword become larger on the center grid.

Figure 12. Searching by keywords.

Keyword and year If a user already knows a keyword and the approximate production year, the user can search for the movie using the lower grid and the left grid. Consider, for example, that the movie was produced in the 1980’s and the keyword is “Horror.” The user first selects a node representing the 1980’s on the lower grid (Figure 13(a)). Then, he/she selects the keyword node “Horror” on the left grid (Figure 13(b)). As a result, the movies which were produced in the 1980’s and have the keyword “Horror” are displayed on the center grid (Figure 13(c)).

Director or actor If a director or an actor is already known, a user may start retrieval from the right grid. Consider, for example, that the director is “Woody Allen.” The user selects the node for “Woody Allen” on the right grid (Figure 14(a)). Then, all movies directed by Woody Allen become larger on the center grid and are connected with the node for “Woody Allen.”
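The keyword-and-year retrieval described above is, in effect, the intersection of two selections. The sketch below illustrates this with a few titles mentioned in this paper. The `Movie` record, the function names, the production years, and the “Drama” keyword are assumptions made for this illustration and are not taken from the ZASH database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movie:
    title: str
    year: int
    keywords: frozenset


# Illustrative records only; years and keyword assignments are assumptions.
MOVIES = [
    Movie("Poltergeist", 1982, frozenset({"Horror"})),
    Movie("Friday the 13th", 1980, frozenset({"Horror"})),
    Movie("The Omen", 1976, frozenset({"Horror"})),
    Movie("Apollo 13", 1995, frozenset({"Drama"})),
]


def titles_by_decade(movies, decade):
    """Titles selected through the lower grid (production year)."""
    return {m.title for m in movies if decade <= m.year < decade + 10}


def titles_by_keyword(movies, keyword):
    """Titles selected through the left grid (keyword)."""
    return {m.title for m in movies if keyword in m.keywords}


def keyword_and_decade(movies, keyword, decade):
    """Titles that satisfy both selections, in alphabetical order."""
    return sorted(titles_by_decade(movies, decade) & titles_by_keyword(movies, keyword))


print(keyword_and_decade(MOVIES, "Horror", 1980))
# prints ['Friday the 13th', 'Poltergeist']
```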

5.5 Accessing the movie database

ZASH also provides a way to access a movie database on the Internet. In Figure 15, a user selected the title “Apollo 13” and chose “Open in Netscape” from the pop-up menu. The details of “Apollo 13” were retrieved from the movie database site and displayed in Netscape.

Figure 13. Searching by keyword and production year.

Figure 14. Searching by a director’s name.

6 Discussion

Multiple planes Because of the multiple planes, users can focus on each set of data. The users can also understand the relations between data on different planes. ZASH succeeded in separating each kind of information without hiding their relations. However, for some kinds of data (e.g. keywords) it is sometimes more convenient to use traditional GUIs instead of 2D planes.

Multi-dimensional scaling ZASH uses MDS to lay out titles and commentators. As shown in Figure 6, this algorithm seems to work. However, the quality of the layout depends entirely on the selection of appropriate keywords.

Fisheye views The graphical fisheye view improved the visibility of the focus and its neighbors. However, the algorithm should be improved to use the screen more effectively.

7 Conclusions and Future Work

This paper described a browsing system, named ZASH, for a movie database. The system uses multiple 2D planes in 3D space so that the same type of information is displayed in each plane. The use of 3D space improved the visibility of links. Movie titles and commentators are laid out by using MDS so that similar data are placed physically near each other. Moreover, a graphical fisheye view algorithm is implemented to improve the visibility of the focus and its neighbors. Although some browsing techniques used in ZASH are not new, ZASH integrates these browsing techniques as well as traditional searching techniques in order to provide users with various ways to find movies.

Currently, ZASH is displayed on a normal computer screen. However, to make browsing more effective, we are planning to use a larger screen (e.g. a 100-inch projector). Formal user studies should also be conducted.

Figure 15. Accessing the movie database on the Internet.

References

[1] L. Bartram, et al., “The Continuous Zoom: A Constrained Fisheye Technique for Viewing and Navigating Large Information Spaces,” Proc. of ACM Symposium on User Interface Software and Technology (UIST’95), pp.14-17, November 1995.

[2] S. K. Card, G. G. Robertson, and J. D. Mackinlay, “The Information Visualizer, an information workspace,” Proc. of the ACM Conference on Human Factors in Computing Systems (CHI’91), pp.181-188, 1991.

[3] K. M. Fairchild, S. E. Poltrock, and G. W. Furnas, “SemNet: Three-Dimensional Graphic Representation of Large Knowledge Bases,” In R. Guindon (Ed.), Cognitive Science And Its Applications For Human-Computer Interaction, Lawrence Erlbaum Associates, pp.201-233, 1988.

[4] G. W. Furnas, “Generalized Fisheye Views,” Proc. of the ACM Conference on Human Factors in Computing Systems (CHI’86), pp.16-23, 1986.

[5] H. Koike, “Fractal Views: A Fractal-Based Method for Controlling Information Display,” ACM Trans. on Information Systems, Vol.13, No.3, pp.305-323, 1995.

[6] J. S. Risch, et al., “A Virtual Environment for Multimedia Intelligence Data Analysis,” IEEE Computer Graphics and Applications, pp.33-41, November 1996.

[7] M. Sarkar, et al., “Graphical Fisheye Views of Graphs,” Proc. of the ACM Conference on Human Factors in Computing Systems (CHI’92), pp.83-91, 1992.

[8] J. Tatemura, “Visualizing Document Space by Force-directed Dynamic Layout,” Proc. of 1997 IEEE Symposium on Visual Languages (VL’97), pp.119-120, 1997.

[9] J. Tatemura, “Cinema Scape,” http://wwwtate.iis.u-tokyo.ac.jp/~tatemura/Cinemascape/.
