Elizabeth Milonas (Long Island University, New York)
The use of facets in Web search engines
Abstract
The World Wide Web consists of a plethora of information that a Web searcher can retrieve via Web search engines such as Google. These Web search engines display an overwhelming amount of information in a seemingly unorganized linear format. Recently, some Web search engines have incorporated facets or terms alongside the linear display, giving the searcher the ability to narrow search results. The goal of this study is to examine the use of facets in these Web search engines.
1: Introduction
Faceted classification has its roots in the Colon Classification system originated by S.R. Ranganathan. According to Ranganathan (1962, 81), facets are the fiber that make up a topic or subject. Facets are used to display the various dimensions (Shera & Egan 1951, 99-100) of a topic or subject and are the manifestations (Shera & Egan 1951, 101) of the five fundamental categories of time, space, energy or action, matter and personality. In the Colon Classification system, facets can be used to represent all aspects of a topic or subject because, like the “banyan tree” (Shera & Egan 1951, 100), they can represent topics in “all directions simultaneously”. Facets are intended for groups of users whose interests extend beyond conventional fields (Vickery 1966, 14). In the LIS discipline, facets are identified as a result of facet analysis, a process of analyzing terms in a field of knowledge (Vickery 1960, 12-16). Facets are used in the indexing process, in which they provide a precise, flexible and coherent method (Vickery 1966, 16) for indexing. Facet analysis brings together all aspects of the field of knowledge by itemizing the concepts in that field in more detail (Vickery 1966, 16) and providing a more flexible combination of terms. Facets and faceted classification have been used extensively and with great benefit in the LIS field. Faceted classification was initially used for print media (Broughton 2006) and later became instrumental in online LIS database environments, where it is seen as a superior tool (Broughton 2006) for ensuring effective information retrieval. Recently, the use of facets and faceted classification has become prevalent in the Web domain, where it has presented many advantages. Facets can structure Web information and can help the user navigate (Vickery 2008) through this information until the search query is satisfied.
Facets can be displayed in any sequence (Vickery 2008) and terms can be combined in any order, providing suggestions for navigational choices and supporting flexible movement (Hearst 2008) through the information space. The use of facets can increase search success (Vickery 2008) by expressing various views of the search term in the hope that the searcher will reach the desired results. Facets have been successfully implemented in the browsing and search features of e-commerce sites, digital museum portals and online library catalogs (La Barre 2008). In addition, the implementation of facets and faceted classification has been successful in the organization and display of information (Broughton 2006; La Barre 2006) on these websites. Implementation of facets and faceted classification in Web search engines, however, is an area still relatively unexplored. The objective of this study was to examine the use of facets in Web search engines. Four Web search engines were used during this study: two search engines that utilized
facets or facet-like terms, Exalead and Excite respectively, and two search engines that did not utilize any facets, Google and AltaVista. The Exalead search engine presents facets alongside the linear search results. These facets conform to Ranganathan’s five fundamental categories of personality, matter, energy, space and time (PMEST). For example, when using the Web search engine Exalead to search for the term “Lymphoma”, the user is presented with six facets that appear alongside the linear search results: related terms, multimedia, filetype, site type, countries and language. The “countries” facet displayed correlates to Ranganathan’s (1962, 83) space category, which manifests itself as a geographical area. In this case, the “countries” facet lists six geographical sub-facets: United States, United Kingdom, France, Canada, Germany and others. These are used to narrow the retrieved information regarding the search term “lymphoma”. The “filetype” facet displayed correlates to Ranganathan’s (1962, 83) matter category, which according to Hjørland (2008) is “the physical material of which a subject may be composed”. The “filetype” facet lists Word and PDF as two types of physical manifestations of the retrieved information regarding the subject “lymphoma”. The “site type” and “multimedia” facets displayed correlate to Ranganathan’s (1962, 83) energy category, which according to Hjørland (2008) is “an action that occurs with respect to the subject”. In this case, a blog or forum as well as audio can be considered actions or activities (Broughton 2006) that have occurred regarding the subject “Lymphoma”. The “related terms” facet correlates to Ranganathan’s (1962, 83) personality category. The sub-facets listed under this facet present various characteristics of the subject (Hjørland 2008) “Lymphoma”.
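The narrowing behaviour of such a facet display can be illustrated with a minimal Python sketch. It is not a description of how Exalead is implemented. The facet names follow those reported above, but the result records, the function names `narrow` and `facet_counts`, and the rule that choices within one facet are alternatives while different facets must all match are assumptions made for illustration.

```python
# Hypothetical result records: facet names mirror those reported for Exalead,
# but the records themselves are invented for illustration only.
RESULTS = [
    {"title": "Lymphoma overview", "countries": "United States",
     "filetype": "PDF", "site type": "forum"},
    {"title": "Lymphoma patient guide", "countries": "United Kingdom",
     "filetype": "Word", "site type": "blog"},
    {"title": "Lymphoma research notes", "countries": "Canada",
     "filetype": "PDF", "site type": "blog"},
    {"title": "Lymphoma support group", "countries": "United States",
     "filetype": "Word", "site type": "forum"},
]


def narrow(results, selections):
    """Keep the results that match every selected facet.

    `selections` maps a facet name to a set of chosen sub-facets.
    Assumption: choices within one facet are alternatives (OR),
    while different facets must all match (AND).
    """
    return [
        r for r in results
        if all(r.get(facet) in chosen for facet, chosen in selections.items())
    ]


def facet_counts(results, facet):
    """Count how many results fall under each sub-facet of one facet."""
    counts = {}
    for r in results:
        value = r.get(facet)
        if value is not None:
            counts[value] = counts.get(value, 0) + 1
    return counts


# Combination within one facet: two countries selected together.
by_country = narrow(RESULTS, {"countries": {"United States", "Canada"}})

# Combination across facets: one country and one file type.
us_pdf = narrow(RESULTS, {"countries": {"United States"}, "filetype": {"PDF"}})

print([r["title"] for r in by_country])
print([r["title"] for r in us_pdf])
print(facet_counts(by_country, "filetype"))
```

Recomputing `facet_counts` on the narrowed list is one way a display could show the searcher which sub-facets remain available after each selection.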
The language facet displayed does not fall under any of Ranganathan’s fundamental categories and Ranganathan’s fundamental category of time is not represented in the facet display. All the sub-facets listed within the facets displayed are used individually or in combination, within the facet or across facets, to narrow the search results. These intra-facet and inter-facet relationships (Broughton 2006) are a major feature of the faceted scheme. The Excite search engine presents a list of terms alongside the linear search results. These terms are not organized by facets; however, a user can select one of the terms to narrow the search results. Once a term is selected, a new term list is displayed and the process can be repeated. Although this method does not conform to Ranganathan’s PMEST formula, the terms do present a means of exploration and discovery by providing suggestions for additional navigational choices (Hearst 2008). Like facets, these terms present various views of the search term (Vickery 2008) with the intent of guiding the user toward the desired search results. This is one of the fundamental qualities (Ellis & Vasconcelos 1999; Hearst 2008; Vickery 1966; 2008) of facets. Two non-faceted search engines, Google and AltaVista, were also used in this study. These search engines display a list of terms alongside the search results; however, these terms are links to related websites. The user cannot select a term or combine terms to narrow the search results.
2: Related studies
A large body of scholarly writing has been generated on the use and benefits of facets in online environments. Researchers believe that the use of facets in these environments offers improved access to online information (La Barre 2006). Facets also offer a powerful approach to organizing (La Barre 2006), browsing (Slavić 2008) and
searching (Ellis & Vasconcelos 1999) online information. Facets are a proven method for exploration and discovery (Hearst 2008) and their use can increase search success (Vickery 2008). Research has been generated (Franklin 2003; Mills 2004; Broughton 2006; La Barre 2006; Uddin & Janecek 2007a; 2007b; Capra et al. 2007; Crystal 2007; Wilson & Schraefel 2008) in the area of facet-enhanced design and organization of corporate and LIS website information. Research has also been generated (Broughton 2001; Yee et al. 2003; Kules et al. 2006; Hong 2006; Gnoli & Hong 2006; Broughton & Slavić 2007; Capra et al. 2007) on the effects of implementing facets in the information retrieval processes of both corporate and LIS websites. The design and organization of information on a website is a difficult task (Uddin & Janecek 2007a); however, it is a task that warrants careful attention. Website information that is well organized and easily accessible helps maximize (La Barre 2006) the website information search process for the Web searcher. In recent years, designers and information architects (IA) have been incorporating faceted classification methods (La Barre 2006) into their website designs in an effort to improve website organization and accessibility. The incorporation of faceted classification methods has been implemented through the physical display of “building blocks” (Hjørland 2008) or “facet elements” (Bates 1988). These facets are displayed in a drop-down, windows-based method (Broughton 2006) that has been proven effective for displaying facets, given that Web searchers are already familiar with drop-down techniques (Broughton 2006) from using GUIs (graphical user interfaces). From these drop-down displays, Web users choose a combination of facets (La Barre 2006) that helps them navigate through website information. Research has shown that the utilization of facets can be effective (La Barre 2006) in improving the navigation process.
Research has shown that the implementation of faceted classification methods is beneficial to Web searchers, especially to those who are unfamiliar with the search topic (Kules et al. 2006; Yee et al. 2003; Capra et al. 2007). Faceted classification methods facilitate the search and retrieval process by presenting the Web searcher with categorized overviews of the search results (Kules et al. 2006). These overviews enable the Web searcher who is unfamiliar with the topic to browse easily through the categories. Research has proven that facet-enhanced searches are a powerful means of retrieving complex and multidimensional contexts, since they allow Web information to be accessed from different facets or points of access (Broughton 2001; Hong 2006; Broughton & Slavić 2007). Researchers in this area are in agreement that this method is effective in accessing Web information that has multi-dimensional properties (Broughton & Slavić 2007; Hong 2006; Gnoli & Hong 2006).
3: Methods
An experiment was performed to examine the use of facets in Web search engines and to determine whether facets used in Web search engines (1) make the search process easier, (2) extend the search time, or (3) make the search process more confusing.
3.1: Participants
Twenty-nine students from two Long Island University academic programs participated in the study: the master’s in Library and information science (LIS) and the Ph.D. in
Information studies (IS). Participants included sixteen LIS students and thirteen IS students.
3.2: Procedure
The students were divided into two groups. One group was asked to search for two terms using two facet-enhanced Web search engines (Exalead and Excite) and the other group was asked to search for the same two terms using two non-facet-enhanced Web search engines (Google and AltaVista). One term, “social networks”, was more familiar to the students (72.4% of all students who participated were very or somewhat familiar with this term) and the other term, “lymphoma”, was less familiar to the students (51.7% of all students who participated were very or somewhat familiar with this term). The students were given a questionnaire that required them to rate their Web search process. The questionnaire utilized a four-point Likert scale. Students were asked questions concerning the ease of the search process, search time and confusion during the search process. The data were collected and a 2x2 and a 2x2x2 factorial analysis were performed. T-tests were conducted on the data to determine whether the observed differences were statistically significant. The factors taken into account for the 2x2 analysis were Search term (two levels: Social Network and Lymphoma) and Search Engine type (two levels: Facet and Non-Facet). The factors taken into account for the 2x2x2 analysis were Search term (two levels: Social Network and Lymphoma), Search Engine type (two levels: Facet and Non-Facet) and Type of student (two levels: Master’s (LIS) and Doctoral (IS)).
3.3: Results
The results of the 2x2 factor T-test indicate that students using facet-enhanced Web search engines to search for the familiar term “social networks” or the unfamiliar term “lymphoma” found facets made the search process easier.
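The factorial layout described in the procedure, and the kind of comparison a t-test makes between the facet and non-facet groups, can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis actually run in the study. The participant counts, percentages and factor levels come from the text above; the Likert ratings are hypothetical placeholders, and Welch's unequal-variance form of the t statistic is an assumption, since the paper does not say which variant was used.

```python
from itertools import product
from math import sqrt
from statistics import mean, variance

# Reported in the paper: 29 participants, 16 LIS and 13 IS students.
TOTAL, LIS, IS = 29, 16, 13

# Head counts implied by the reported familiarity percentages (rounded).
familiar_social = round(0.724 * TOTAL)
familiar_lymphoma = round(0.517 * TOTAL)

# Factor levels named in the procedure.
TERMS = ("social networks", "lymphoma")
ENGINES = ("facet", "non-facet")
STUDENTS = ("LIS", "IS")
cells_2x2 = list(product(TERMS, ENGINES))
cells_2x2x2 = list(product(TERMS, ENGINES, STUDENTS))


def welch_t(a, b):
    """Welch's t statistic for two independent samples (no p-value)."""
    va, vb = variance(a), variance(b)
    return (mean(a) - mean(b)) / sqrt(va / len(a) + vb / len(b))


# HYPOTHETICAL four-point Likert ratings for "ease of search process".
# These numbers are placeholders, not the study's data.
ratings = {
    ("social networks", "facet"): [3, 4, 3, 4, 3],
    ("social networks", "non-facet"): [2, 3, 2, 3, 2],
    ("lymphoma", "facet"): [4, 3, 4, 4, 3],
    ("lymphoma", "non-facet"): [2, 2, 3, 3, 2],
}

# One facet versus non-facet comparison per search term.
t_by_term = {
    term: welch_t(ratings[(term, "facet")], ratings[(term, "non-facet")])
    for term in TERMS
}

print(familiar_social, familiar_lymphoma)
print(len(cells_2x2), len(cells_2x2x2))
for term, t in t_by_term.items():
    print(f"{term}: t = {t:.2f}")
```

A positive statistic here only means the placeholder facet ratings are higher on average; judging significance would additionally require the degrees of freedom and a p-value, which this sketch deliberately leaves out.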
Students found that their search time was extended when searching for the familiar term “social networks”; however, when searching for the unfamiliar term “lymphoma”, search time was not extended. When searching for the familiar term “social networks”, students found that facets did not make the search process more confusing. The results of the 2x2x2 factor T-test indicated that IS students found the use of facets did not make the search process easier when searching for the familiar term “social networks”, while LIS students found the use of facets did make the search process easier. Both IS and LIS students found the use of facets extended the search time when searching for the familiar term “social networks”; however, when searching for the unfamiliar term “lymphoma”, both groups found the use of facets did not extend the search time. IS students found facets made the search process more confusing when searching for the familiar term “social networks”, while LIS students found facets did not make the search more confusing when searching for the same term.
4: Analysis
4.1: Ease of search process
Analysis of study data showed that the use of facets made the search process easier when searching for either familiar or unfamiliar topics. This finding is consistent with
the results of two studies (Yee et al. 2003; Kules & Shneiderman 2008) in which participants were asked to utilize a faceted and a non-facet search interface. The results of the Yee et al. (2003) study showed that 90% of the participants preferred the faceted search interface over the non-facet search interface and that 72% of the participants found using the faceted search interface easier than the non-facet search interface. Similarly, the results of the Kules & Shneiderman (2008) study showed that the faceted interface was as easy to use as the non-facet interface. When examining the study data for the two groups of students (IS and LIS), a surprising discrepancy became evident. IS students found facets did not make the search process easier when searching for the familiar term “social networks”, while LIS students found that facets did make the search process easier when searching for the same term. One plausible reason for this discrepancy may be the level of expertise of LIS and IS students in terms of facet utilization. It is believed that a non-expert user will have difficulty using a faceted system (Uddin & Janecek 2007b) because of lack of experience. LIS students have a more sophisticated level of knowledge in terms of facet utilization because of their constant exposure to facets in online library databases. IS students, on the other hand, may be considered non-experts in comparison to LIS students because their exposure to facets may not be as extensive as that of LIS students. In addition, most of the IS students in this study do not have MLIS degrees (Smiraglia 2009).
4.2: Search time
An examination of all data as well as IS and LIS data showed that when searching for a familiar term, “social networks”, the use of facets extended the search time; however, when searching for an unfamiliar term, “lymphoma”, the use of facets did not extend the search time.
It seems plausible that if participants are exploring new and unfamiliar topics, the use of facets will provide suggestions and steer the participants towards topics that satisfy the search criteria. However, since the participants are unfamiliar with the topic, they may be satisfied with the top-level search results and may not choose to delve deeper into the topic. On the other hand, if the topic is familiar, the suggestions for navigational choices (Hearst 2008) will afford the participant the mechanism for deeper exploration and discovery (Hearst 2008) and, as a result, extend the search time.
4.3: Confusion during the search process
Study data regarding confusion during the search process showed that, on the whole, the participants found the use of facets did not cause confusion during the search process when searching for a familiar topic, “social networks”. This finding is consistent with literature that states that facets allow users to move through the information space (Hearst 2008) without confusion and that facets present an easy tool for navigating (Denton 2009) through many dimensions. A discrepancy was found, however, when analyzing data of the two groups of students (IS and LIS). IS students found facets made the search process more confusing when searching for the familiar term “social networks”, while LIS students found that facets did not make the search process more confusing when searching for the same term. Perhaps the level of expertise may also play a vital role in terms of confusion, since there seems to be a correlation between the responses to questions 3
and 5 by IS students. Another plausible factor that might have contributed to the feeling of confusion is the imposed time constraint. Uddin & Janecek (2007a) state that non-expert users require more time to understand a faceted interface as opposed to a non-faceted interface. The enforcement of a time constraint may have imposed an added pressure that contributed to the feeling of discomfort and ultimately to the feeling of confusion.
5: Conclusion
The results obtained from all the participants as a whole showed the following three significant findings in the area of facet-enhanced Web search engines: (1) facets make the search process easier whether searching for familiar or unfamiliar topics, (2) when using facets, it takes longer to search for familiar topics than unfamiliar topics, and (3) when searching for familiar topics, facets do not cause confusion for the searcher. Findings 1 and 3 are well supported by the literature (Denton 2009; Hearst 2008; Kules & Shneiderman 2008; Uddin & Janecek 2007a; 2007b). The second finding is not supported in the literature. When the study data from the two groups of students (IS and LIS) were compared, discrepancies became evident. IS students found that facets did not make the search process easier and were confused when searching for the familiar term “social networks”, while LIS students found that facets did make the search process easier and were not confused when searching for the same term. It is difficult to explain these discrepancies; however, it seems likely that the level of expertise may be a contributory factor in the facet search process. Although this particular finding is not directly supported in any of the literature, it would be interesting to explore this possibility further in future studies.
Acknowledgment
The author would like to sincerely thank Dr. Richard Smiraglia for his support, guidance and expertise as well as for his valuable comments and suggestions.
She also would like to thank the Palmer School of Library and Information Studies, Long Island University, for allocating travel funds to support participation in this conference.
References
Bates M., 1988, How to use controlled vocabularies more effectively in online searching, Online, 12, p. 45-56.
Broughton V., 2001, Faceted classification as a basis for knowledge organization in a digital environment: the Bliss Bibliographic Classification as a model for vocabulary management and the creation of multi-dimensional knowledge structures, New review of hypermedia and multimedia, 7, p. 67-102.
Broughton V., 2006, The need for faceted classification as the basis of all methods of information retrieval, Aslib proceedings, 58, p. 49-72.
Broughton V., Slavić A., 2007, Building a faceted classification for the humanities: principles and procedures, Journal of documentation, 63, p. 727-754.
Capra R., Marchionini G., Oh J.S., Stutzman F., Zhang Y., 2007, Effects of structure and interaction style on distinct search tasks. JCDL: Joint conference on digital libraries, Vancouver, 2007, ACM-IEEE, p. 442-451.
Crystal A., 2007, Facets are fundamental: rethinking information architecture framework, Technical communication, 54, p. 16-26.
Denton W., 2009, How to make a faceted classification and put it on the Web, , accessed 19 June 2009.
Ellis D., Vasconcelos A., 1999, Ranganathan and the Net: using facet analysis to search and organize the World Wide Web, Aslib proceedings, 51, p. 3-10.
Franklin R.A., 2003, Re-inventing subject access for the Semantic Web, Online information review, 27, p. 94-101.
Gnoli C., Hong M., 2006, Freely faceted classification for Web-based information retrieval, New review of hypermedia and multimedia, 12, p. 63-81.
Hearst M.A., 2008, UIs for faceted navigation: recent advances and remaining open problems, in HCIR: Workshop on Computer interaction and information retrieval, Redmond (WA), October 2008, , accessed 15 July 2009.
Hjørland B., 2008, Facet, facet analysis and the facet-analytic paradigm in knowledge organization (KO), in Lifeboat for knowledge organization, .
Hong M., 2006, Potential usage of faceted classification in Internet “information retrieval”, Interdisciplinary information sciences, 12, p. 43-51.
Kules B., Kustanowitz J., Shneiderman B., 2006, Categorizing Web search results into meaningful and stable categories using fast-feature techniques, in Proc. 6th ACM/IEEE-CS joint conference on Digital libraries, Chapel Hill, ACM-IEEE, p. 210-219.
Kules B., Shneiderman B., 2008, Users can change their Web search tactics: design guidelines for categorized overviews, Information processing and management, 44, p. 463-484.
La Barre K., 2006, The use of faceted analytic-synthetic theory as revealed in the practice of website construction and design, PhD thesis Indiana University. School of Library and Information Science, , accessed 16 October 2010.
La Barre K., 2008, Discovery and access systems for website and cultural heritage sites: reconsidering the practical application of facets, in J.T. Tennis & C. Arsenault eds., Culture and identity in knowledge organization: proc. 10th ISKO Conference, Montreal, 5-8 August 2008, Ergon, Würzburg, p. 105-110.
Mills J., 2004, Faceted classification and logical division in information retrieval, Library trends, 52, p. 541-570.
Ranganathan S.R., 1962, Elements of library classification, Asia Publishing House, Bombay etc.
Shera J.H., Egan M.E. eds, 1951, Bibliographic organization. The University of Chicago Press.
Smiraglia R.P., 2009 August 8, Milonas ISKO 2010 draft paper rps note, e-mail.
Uddin M.N., Janecek P., 2007a, The implementation of faceted classification in Web site searching and browsing, Online information review, 31, p. 218-233.
Uddin M.N., Janecek P., 2007b, Performance and usability testing of multi dimensional taxonomy in Web site search and navigation, Performance measurement and metrics, 8, n. 1, p. 18-33.
Vickery B.C., 1960, Faceted classification: a guide to construction and use of special schemes, Aslib, London.
Vickery B.C., 1966, Faceted classification schemes, Rutgers University Press, New Jersey.
Vickery B.C., 2008, Faceted classification for the Web, Axiomathes, 18, n. 2, p. 145-160.
Wilson M.L., Schraefel M.C., 2008, A longitudinal study of exploratory and keyword search, in Proc. ACM/IEEE-CS joint conference on Digital libraries, 16-20 June 2008, Pittsburgh, ACM-IEEE, p. 52-55.
Yee K.-P., Swearingen K., Li K., Hearst M., 2003, Faceted metadata for image search and browsing, in CHI’03: Proc. 2003 conference on Human factors in computing systems, Fort Lauderdale, ACM, New York, p. 401-408.