The SADIe Transcoding Platform
Darren Lunn, Sean Bechhofer and Simon Harper
Information Management Group, School of Computer Science, University of Manchester
Kilburn Building, Oxford Road, Manchester, M13 9PL, UK
[email protected] [sean.bechhofer | simon.harper]@manchester.ac.uk
ABSTRACT
The World Wide Web (Web) is a visually complex, dynamic, multimedia system that can be inaccessible to people with visual impairments. SADIe addresses this problem by using Semantic Web technologies to explicate implicit visual structures through a combination of an upper and lower ontology. By identifying elements within the Web page, in addition to the role that those elements play, accurate transcoding can be applied to a diverse range of Websites.

Categories and Subject Descriptors
H.5.4 [Information Interfaces and Presentation (e.g., HCI)]: Hypertext/Hypermedia – User Issues

General Terms
Human Factors

Keywords
World Wide Web, Accessibility, Transcoding, VI Users
1. INTRODUCTION
People with visual impairments are hindered in their access to information on the Web because it is not designed with their needs in mind. Most designers are mainly concerned with how content is presented on screen, rather than its structure and meaning. Web content is therefore poorly suited to people who use screen readers, as these tools use the underlying structure of the data, rather than the presentation, to create an audio rendering of the content.

Transcoding is a way of transforming Web content so that it can be accessed on a diverse range of devices, including those used by visually impaired users. With heuristic transcoding, tools analyse a page and adapt it based on a set of predefined rules. The AcceSS System asserted that Websites with similar content have the same structure [1].
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. W4A2008 - Challenge, April 21-22, 2008, Beijing, China. Co-Located with the 17th International World Wide Web Conference. Copyright 2008 ACM ...$5.00.
For example, news Websites, such as CNN¹ and BBC², have a similar layout. Therefore, a series of templates was created that described the layout of a variety of Website genres. Matching a Web page to a template informed the system of the roles of the elements, which allowed suitable transcoding to be applied. By using heuristics, AcceSS had the advantage that it could be applied to a wide variety of Websites without the need for additional information about the content. However, to accommodate the heterogeneity of Web pages, the templates created were general and inaccuracies occurred during transcoding.

Semantic transcoding is the adaptation of a Web page by using the semantics of the structure or content. The Transcoding Proxy for Nonvisual Web Access System used annotations that identified visual fragments and assigned them a level of importance [2]. With the annotations in place, transcoding could then be applied to the page. For example, users could simplify the page by reorganising the content into a more suitable format. Semantic transcoding tends to produce higher quality transcoded documents because of the additional level of understanding. However, the cost of this is that annotations have to be created by hand, which can be extremely tedious and time consuming [3].
2. SADIE
The SADIe platform³ combines the benefits of heuristic and semantic transcoding to offer an accurate yet highly scalable transcoding solution. The principal idea behind our approach is that the rendering of a Web page element is closely associated with its role. For example, sighted users know that a list of links is a menu due to the way it is rendered on screen. This rendering information is defined within the Cascading Style Sheet (CSS) and associated with the Extensible Hypertext Markup Language (XHTML) via tag attributes. Therefore, rather than annotate every page, the CSS element role is annotated within an ontology. This reduces the overhead required for annotation, yet because the Website content is still annotated via the CSS, the transcoding remains accurate. Moreover, multiple Websites can be transcoded because the ontologies are Website specific, providing a flexible and extensible system.

¹ http://www.cnn.com/
² http://www.bbc.co.uk/
³ An experimental prototype is available at: http://hcw.cs.manchester.ac.uk/experiments/sadie/

The transcoding is driven by an ontology that provides a defined set of terms for classifying the CSS elements within a Website. The ontology consists of two parts. The first is an upper ontology containing high level abstract concepts representing the potential roles of Web page elements. The second is a Website specific extension to the upper ontology. This contains the elements found within the CSS plus the roles that they play. The benefit of this approach is that the upper ontology acts as an interface between SADIe and the page we wish to transcode. The roles of the CSS elements are defined by the abstract classes through an extension of the upper ontology. SADIe requests a list of elements that need to be transcoded and the ontology returns all the CSS elements that satisfy the query. SADIe can query any number of Website ontologies because each site specific ontology uses the same upper ontology, providing a consistent interface. The results of the query are then used to apply transcoding to the page.

In addition, we also gain high scalability. Defining the roles of CSS elements allows every page within the Website to be transcoded due to the tendency of CSS to contain site-wide style definitions that all pages use. For further discussion of the SADIe method and architecture, the reader is directed to [4]. For a practical guide to building SADIe ontologies, please see [5].

Figure 1: Comparison of a Standard ASSETS 2008 Conference Website Page and The SADIe Transcoded Version. (a) The ASSETS 2008 Conference Website. (b) The ASSETS 2008 Conference Website After Being Transcoded By SADIe. Web Page Taken From http://www.sigaccess.org/assets08/ on 17th January 2008.

The aim of SADIe is to improve access to Web content for visually impaired users. This is achieved by transcoding the page into a format more suited to the sequential audio stream generated by the screen reader. SADIe accomplishes this through three operations: Defluff involves removing elements that provide little or no information to the page; Reorder involves reordering the page so that important areas of content appear near the top of the page; Menu displays the menu of the Website at the bottom of the page.
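The two-part ontology described in this section can be pictured with a small sketch. It is only an illustration under stated assumptions: plain Python dictionaries stand in for the ontology, and apart from "Removable" every role name, CSS class name and function name below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

# Stand-in for the upper ontology: each role maps to its parent role (or None).
# Only "Removable" is named in the text; the other roles are hypothetical.
UPPER_ONTOLOGY: Dict[str, Optional[str]] = {
    "Removable": None,
    "Advertisement": "Removable",  # hypothetical sub-role of Removable
    "MainContent": None,
    "Menu": None,
}


def is_a(role: Optional[str], ancestor: str) -> bool:
    """True if `role` equals `ancestor` or is one of its sub-roles."""
    while role is not None:
        if role == ancestor:
            return True
        role = UPPER_ONTOLOGY[role]
    return False


@dataclass
class SiteOntology:
    """Stand-in for a Website specific extension: CSS class name -> role."""
    site: str
    roles: Dict[str, str] = field(default_factory=dict)

    def classify(self, css_name: str, role: str) -> None:
        # Every site ontology must use terms from the shared upper ontology.
        if role not in UPPER_ONTOLOGY:
            raise ValueError(f"unknown role: {role}")
        self.roles[css_name] = role

    def query(self, role: str) -> Set[str]:
        """Return all CSS names classified under `role` or a sub-role."""
        return {css for css, r in self.roles.items() if is_a(r, role)}


# Hypothetical classification of one site's CSS classes.
assets = SiteOntology("assets08")
assets.classify("banner", "Advertisement")
assets.classify("sidebar", "Removable")
assets.classify("content", "MainContent")
print(sorted(assets.query("Removable")))  # ['banner', 'sidebar']
```

Because every site ontology is validated against the same set of upper-level roles, the same query works unchanged for any Website, which is the interface property the section describes.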
SADIe matches the elements of the Web page to the operation that it will perform. For example, if “Defluff” is to be performed then SADIe will query the ontology for a list of all the CSS classes that have been classified as “Removable”. SADIe then traverses the Web page’s Document Object Model (DOM) and removes any element that occurs within the list returned by the ontology query.

Figure 1 demonstrates how SADIe can be used to transcode a Web page. Figure 1a shows the front page from ASSETS 2008, an ACM international conference. The section marked “Important Information”, which can be considered to be the main content, is surrounded by banners and menus. Such elements hinder a visually impaired user as they attempt to access the main content. Note the link in the top right corner offering a screen reader version. This link passes the Web page through the SADIe Transcoding Proxy, the results of which can be seen in Figure 1b. The clutter has been removed from the page and the “Important Information” section has been promoted to the top of the page for immediate access by a screen reader.
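The three operations can be sketched as simple DOM manipulations. This is a minimal illustration, not the SADIe implementation: it uses Python's standard xml.etree.ElementTree on a tiny well-formed fragment, the class names "banner", "nav" and "content" are hypothetical, the sets of class names are written by hand where SADIe would obtain them from ontology queries, and the move helper only considers direct children of the body.

```python
import xml.etree.ElementTree as ET


def css_classes(element):
    """Return the set of CSS class names on an element."""
    return set(element.get("class", "").split())


def defluff(body, removable):
    """Defluff: remove every descendant whose class is in `removable`."""
    for parent in list(body.iter()):
        for child in list(parent):
            if css_classes(child) & removable:
                parent.remove(child)


def move(body, names, to_top):
    """Move direct children of `body` with a class in `names`.

    to_top=True sketches Reorder; to_top=False sketches Menu.
    Simplification: nested elements are not relocated.
    """
    picked = [child for child in list(body) if css_classes(child) & names]
    for child in picked:
        body.remove(child)
    if to_top:
        for index, child in enumerate(picked):
            body.insert(index, child)
    else:
        body.extend(picked)


# Hypothetical page: a banner, a menu, then the main content.
page = ET.fromstring(
    "<body>"
    "<div class='banner'>Sponsors</div>"
    "<ul class='nav'><li>Home</li></ul>"
    "<div class='content'>Important Information</div>"
    "</body>"
)

defluff(page, {"banner"})              # Defluff
move(page, {"content"}, to_top=True)   # Reorder
move(page, {"nav"}, to_top=False)      # Menu
print([child.get("class") for child in page])  # ['content', 'nav']
```

After the three calls the main content comes first and the menu last, which mirrors the change from Figure 1a to Figure 1b.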
3. CONCLUSION AND FUTURE WORK
SADIe is an experimental transcoding platform that can provide accurate and flexible Web content adaptations. These adaptations are based upon a classification of the Web page element roles captured within an ontology. Our next goal is to ensure that the adaptations benefit visually impaired users accessing the Web. To do this, we are embarking upon a series of user studies to confirm the usefulness of the current transcoding and to establish extensions to the algorithms provided.
4. REFERENCES
[1] Parmanto, B., Ferrydiansyah, R., Saptono, A., Song, L., Sugiantara, I. W., and Hackett, S. In Proceedings of the International Cross-Disciplinary Workshop on Web Accessibility, 18–25, (2005).
[2] Takagi, H. and Asakawa, C. In Proceedings of the fourth international ACM conference on Assistive technologies, 164–171, (2000).
[3] Takagi, H., Asakawa, C., Fukuda, K., and Maeda, J. In Proceedings of the fifth international ACM conference on Assistive technologies, 81–88, (2002).
[4] Bechhofer, S., Harper, S., and Lunn, D. In Proceedings of The 5th International Semantic Web Conference, (2006).
[5] Lunn, D. Building Ontologies For The SADIe Transcoder. The University of Manchester, (2008). http://hcw-eprints.cs.manchester.ac.uk/23/.