SADIe: Transcoding based on CSS - CiteSeerX

6 downloads 33180 Views 341KB Size Report
requirements can become problems as the Web becomes ever more ... validation, and best practice, however, these rely on designers build- ing new Web ...
SADIe: Transcoding based on CSS Simon Harper, Sean Bechhofer, and Darren Lunn Information Management Group School of Computer Science, University of Manchester Manchester UK

[simon.harper | sean.bechhofer]@manchester.ac.uk

ABSTRACT

However, when working in the real world, there are issues we must face. Empirical evidence suggests that authors and designers will not create separate semantic mark up to sit with standard Extensible Hypertext Markup Language (XHTML) because they see it as an unnecessary overhead. In addition, designers will not compromise their desire to produce “beautiful and effective” web sites. In our conversations with designers the resounding message we receive is:

Visually impaired users are hindered in their efforts to access the World Wide Web (Web) because their information and presentation requirements are different from those of a sighted user. These requirements can become problems as the Web becomes ever more visually centric with regard to presentation and information order / layout, this can (and does) hinder users who need presentationagnostic access to information. Finding semantic information already encoded directly into pages can help to alleviate these problems and support users who wish to understand the meaning as opposed to the presentation and order of the information. Our solution, Structural-Semantics for Accessibility and Device Independence (SADIe) involves building ontologies of Cascading SytleSheets (CSS) and using those ontologies to transform Web pages.

“If there is any kind of overhead above the normal concept creation then we are less likely to implement it. If our design is compromised in any way we will not implement. We create beautiful and effective sites, we’re not information architects.” Many solutions [1] exist which attempt to make good Web pages out of bad ones. These approaches include guidelines, automated validation, and best practice, however, these rely on designers building new Web pages and being concerned enough about usability to follow them. Problems still exist with badly built and old pages and so solutions which aim to change bad content to good content have been developed. These solutions are known collectively as ‘transcoding’. Our proposed approach, known as SADIe [2] is targeted at overcoming the problem of designer ownership and address the problems of current transcoding systems. It is evident that the semantics of the web document are implicitly encoded as terms within the Style-sheet; and are therefore likewise found within XHTML meta tags and associated with the data found in pages through the use of ‘class’ and ‘id’ attributes common to most XHTML elements. An upper level ontology provides basic notions that encapsulate the role of document elements along with properties that can be used to described them. This upper level ontology is defined in isolation from a particular site, providing an abstraction over the document structure. For a particular CSS stylesheet, we provide an extension of the ontology giving the particular characteristics of the elements appearing in that stylesheet. We can consider this extension to be an annotation of the stylesheet elements. In this way, CSS presentation will be unaffected but semantics will be an explicit part of the data. We can then provide tools that consume this information, manipulating the documents and providing appropriate presentations to the user.

Categories and Subject Descriptors H.1.2 [Models and Principles]: User/Machine Systems—Human factors / Human information processing; I.7.2 [Computing Methodologies]: Document Preparation—Hypertext / hypermedia

General Terms Human Factors, Design

Keywords Document Engineering, Tools, Web, Transcoding, Visual Impairment

1.

BLINDNESS IS NOT INACCESSABILITY

Access to, and movement around, complex hypermedia environments, of which the web is the most obvious example, has long been considered an important and major issue in the Web design and usability field. The commonly used slang phrase ‘surfing the web’ implies rapid and free access, pointing to its importance among designers and users alike. It has also been long established that this potentially complex and difficult access is further complicated, and becomes neither rapid nor free, if the user is visually impaired. Blindness then, often equals inaccessibility, but it does not have to. Annotation of web pages provides a mechanism to enhance visually impaired peoples’ access to information on web-pages through an encoding of the meaning of that information. Annotations can then be consumed by tools that restructure or reorganise pages in order to pull out salient information.

2. TRANSCODING THE CSS Recent moves towards a separation of presentation, metadata and information such as CSS has helped to alleviate some of the problems of accesses to complicated visual information by visually impaired users, but there are still many issues to be addressed. For example, consider the ‘Stagecoach Bus’ website (see Figs. 1/2).

Copyright is held by the author/owner(s). ASSETS’06, October 22–25, 2006, Portland, Oregon, USA. ACM 1-59593-290-9/06/0010.

259

server as a proxy to make sure it was user agent neutral. It is worth mentioning that this configuration allows a more flexible approach in the ways SADIe is used. This is because third-party user agents can directly call parts of our transcoding engine and perform the transcoding best suited to the agent and platform. The operation of SADIe is straightforward. Based on an analysis of the visual rendering, and code structure of the XHTML document and CSS, an extension to our upper level ontology is created. This ontology is uploaded into SADIe via it’s Web interface, and once received is send to a reasoner where a set of semantic inferences are made. These inferences relate to error checking and correction, classification, and subsumption of the Web site extension into the upper level ontology. We then produce a short hand for the entire ontology to remove the need for this, rather slow, inference task to be performed on each transcoding request. Now when a request is made through the proxy to a site with a SADIe ontology the three transcoding actions (mentioned above) can be selected, and the page transformed. Now because SADIe uses the CSS, one ontology is all that is required to accurately transcode any one of the million or so blogs on Google blogger.com. Likewise, SADIe can transform every page on the Stagecoach Web site, BBC news, and any other site which uses CSS and has had an ontology created for it.

Figure 1: Stagecoach Bus Web-Site Before Transformation (http://www.stagecoachbus.com)

2.1 The ‘Achilles Heel’ While this method and transformation may seem flexible, logical, scalable, and neat, SADIe does have one ‘Achilles Heel’; the creation of the Web page ontology. This task is, in honesty, unfriendly and only suitable for a knowledge engineer familiar with the semantic Web. However, this does now provide the focus for our continuing work as we build an analysis tool to semi–automate the process. In this way, a novice user / designer can easily answer a set of yes/no questions, generated from their specific XHTML/CSS combination, to automatically build the ontology without expert intervention.

3. ADDITIONAL BENEFITS While we have focused on the benefits SADIe offers to visually impaired users there are additional benefits for many different types of users when structural semantics become explicit. Consider the mobile device user, who needs full access to a page but not in the same visual form as the designers vision. In this case sighted users are ‘handicapped’ by technology as opposed to physiology and therefore can benefit from SADIe technology. By knowing the semantics of the structure of documents mobile devices can more effectively transcode them. For instance in our Stagec oach Bus study the menu structures could be included within the interface of the browser itself, thereby removing them from the screen real-estate. While we have initially implemented 3 transforms more are being added, along the lines mentioned above, to support device independence, multi-modal access, small screened devices and the mobile Web.

Figure 2: Stagecoach Bus Web-Site After Transformation

This site is a model of the standard kind of website that a user may encounter on a regular basis. It uses standards and implements a clear separation of content and presentation, resulting in some visually attractive web pages. however, the site remains relatively inaccessible to visually impaired people. SADIe is designed to address this inaccessibility by accurately manipulating the Document Object Model (DOM) of any XHTML document based on a knowledge of the CSS captured within the ontology. In our current prototype, requests and operations are pre-configured and ‘anchored’ to the buttons on the toolbar. Basic functionalities include: De-Fluff: Remove any unnecessary content from the page; Toggle Menu: Toggle menus on and off to remove the need for skip links; and ReOrder: Bring important page items to the top. SADIe was originally built as a Mozilla Firefox extension to transcode Web content on the client device, in this way the often cited legal issues of removing and reorganising content could be overcome. However, this method of delivery did not provide us with a flexible enough approach to be able to conduct extensive user trials. In our locality, at least, visually disabled users work with Microsoft Internet Explorer on Windows systems coupled with the JAWS screen-reader. In this case we decided to move our experimental solution to the

4. REFERENCES [1] C. Asakawa and H. Takagi. Annotation-based transcoding for nonvisual web access. In Proceedings of the Fourth International ACM Conference on Assistive Technologies, pages 172–179. ACM Press, 2000. [2] S. Harper and S. Bechhofer. Semantic Triage for Increased Accessibility. In e. a. John J. Ritsko, editor, IBM Systems Journal, volume 44, pages 637–648. IBM, New York, USA, August 2005.

260

Suggest Documents