Web Accessibility Evaluation via XSLT Vicente Luque Centeno, Carlos Delgado Kloos, Jos´e Ma Bl´azquez del Toro1 and Martin Gaedke2 1
Carlos III University of Madrid {vlc,cdk,jmb}@it.uc3m.es 2 University of Karlsruhe
[email protected]
Abstract. Web accessibility rules, i.e., the conditions to be met by Web sites in order to be considered accessible for all, can be (partially) checked automatically in many different ways. Many Web accessibility evaluators have been developed during the last years. For applying the W3C guidelines, their programmers have to apply subjective criteria, thus leading to different interpretations of these guidelines. As a result, it is easy to obtain different evaluation results when different evaluation tools are applied to a common sample page. However, accessibility rules can be better expressed formally and declaratively in rules that assert conditions over the markup. We have found that XSLT can be used to represent templates addressing many accessibility rules involving the markup of Web pages. Even more, we have found that some specific conditions relaying in the prose of the XHTML specification not previously formalized in the XHTML grammar (the official DTD or XML Schemas) could also be formalized in XSLT rules as well. Thus, we have developed WAEX as a Web Accessibility Evaluator in a single XSLT file. Such XSLT file contains 70+ singular accessibility and XHTML-specific rules not previously addressed by the official DTDs or Schemas from W3C.
1. Introduction Most Web pages nowadays are plenty of accessibility barriers, i.e., characteristics that avoid some people having a correct access to Web contents or Web functionality. Most of these Web barriers cause serious problems for people with some sort of disabilities. Examples of disabilities and usual problems found by those users are: 1. Sensorial disabilities: Quite often, images or other multimedia elements in Web pages don’t provide descriptive texts for blind people. Written transcriptions of audio tracks are rarely found by people who can’t hear. Other people may not properly read nor resize small fonts. Others don’t have the ability to recognize some color combinations. 2. Physical disabilities: People who can’t properly manage a keyboard or a mouse use to have problems for interacting with links and forms in Web pages. Mouse-only or keyboard-only navigation, or the form’s small active zones lacking a conveniently associated descriptive text are usually a challenge for people with reduced mobility.
3. Neurological disabilities: Navigation through a Web site can be difficult if it has terms being difficult to understand, it lacks a clear map of the Web site, form fields lack a proper description that explains how each form field expects to be filled in, people find difficulties on trying to repeat or explain a Web navigation path, or, simply if pages inside the same Web site present inconsistencies. 4. Technological disabilities: Users with old operating systems, alternative browsers, a limited keyboard or mouse, a low bandwidth Internet connection, or hardware limitations like a small display or memory, or lacking a specific plugin, usually find problems when navigating. Also users with their brand new hand-held wireless devices find problems as well. Many Web sites have been conceived only for a concrete screen size and resolution, thus raising layout problems on devices with other capabilities. WAI (Web Accessibility Initiative)’s WCAG (Web Content Accessibility Guidelines) 1.0 [1] from W3C has become an important reference for Web accessibility in the Web community. It has been accepted as a set of guidelines that improve accessibility and eliminate barriers on Web sites. The lack of accessibility affects a large amount of people (between 10% and 20%, according to [10]) that frequently find important barriers when trying to navigate through today’s Web sites. Barriers reduce accessibility not only for people with some sort of personal disability. Accessibility is also a major step towards device independent Web design, allowing Web interoperability to be independent from devices, browsers or operating systems. Web accessibility allows having cheaper Web site maintenance and a wider target public. Some governments are also requiring that some Web sites become accessible. The set of the 65 WCAG’s checkpoints that accessible documents have to pass is a very heterogeneous set of constraints, which are difficult to evaluate and repair. Both WCAG 1.0 [1] and the HTML Techniques for the new WCAG 2.0 draft [2] specifications are written at a high abstraction level, which is frequently quite open to subjective interpretation (as too long texts or too many elements), including implicit multi-evaluation conditions or, simply, containing constraints whose detection cannot be easily automated. There are several tools nowadays which can help us to detect accessibility barriers on Web sites. Watchfire [11], Tawdis [12] or HERA [13] are only a few of them. It is well known that all these tools require a person to supervise and complete the results of the evaluation tools because a lot of rules are relayed on manual checks by the user. However, even regarding automatable evaluations only, we still have the problem of different particular interpretations on each tool. This is due to the fact that they were built on top of different subjective interpretations of the W3C’s open-to-interpretation guidelines, without using a formalized version as depicted at [16]. As a result, we can easily find several tools reporting different evaluation results for a same sample page. Even further, it is difficult indeed to find two evaluation tools that evaluate the same conditions. One of the main reasons for this heterogeneity is due to the fact that these tools have been implemented as black boxes whose internal evaluation mecha-
nisms are hidden to the public. Since they have been mostly implemented using procedural programming languages, the conditions being checked are usually unknown for their users. Our approach to solve that problem consists on providing non-hidden declarative rules, readable and reusable by anyone, even for those with very little programming background. The procedural approach for checking document consistency is powerful, but difficult to bring consensus. On the other hand, the declarative and transparent rule-based approach provides consensus rules that many people might easily understand and agree with. This is one of the advantages of using declarative rules in a DTD or XML Schema. Instead of using self-developed software for XML validation, it’s better to use rules that declare elements as optional or required and how they must be nested, then leaving validation as a rule-based checking process. Declarative rules might be used to validate many Web accessibility conditions easily. Our approach lead us to build WAEX [17]: a Web Accessibility Evaluator in a single XSLT file. Because of its XSLT nature, it can only be applied to well formed HTML pages. Fortunately, HTML reparation software like Tidy [15] and the HTML parser of libxml [9] allows us to apply WAEX to almost any Web page. The rest of this paper depicts the most important WAEX’s rules, and is organized as follows: Section 2 deals with accessibility conditions already expressed formally in the XHTML grammar (the DTD or XML Schema) and compares them with the corresponding XSLT templates in WAEX. Section 3 deals with those accessibility conditions expressible in a XHTML grammar, but not expressed indeed in a DTD or Schema. Section 4 covers accessibility conditions not expressible in a XHTML grammar. Section 5 coverts mobileOK Basic tests [5] from W3C. Section 6 contains some conclusions and future work.
2. Accessibility conditions already expressed in the XHTML grammar Some of the most important accessibility issues already involve validation against a public grammar. In fact, the XHTML grammar already represents some well known accessibility checkpoints. Specific rules in the XHTML’s DTD and Schema already require that specific mandatory markup properly appears in XHTML documents. For instance, table 1 contains rules that state that every image must always have an alt attribute (a textual description of the image). Both the DTD and the XML Schema declare the alt attribute as required for every image. However, non XHTML documents still can use XSLT to spot as a barrier those images having no such alt attribute. 3 4 3
4
A similar set of expressions can be used to state that the alt attribute is also required for every area element. Ornamental images containing no information, should have an empty string as the textual alternative, but this attribute must be present anyhow according to WCAG. The fact that the textual alternative is an adequate alternative for the image is another checkpoint which is outside of our scope.
Table 1. Images without alternative text. The img’s alt attribute is mandatory. Sample barrier
Sample corrected
DTD
XML Schema
XSLT
Images with no alt attribute
Not only mandatory attributes, but also mandatory elements can be required by a grammar. For instance, table 2 contains rules to indicate that every document must have a unique title inside the head element.
Table 2. Document without a unique title. Document’s title is mandatory. Sample barrier
...
Sample corrected
...
... DTD
XML Schema
XSLT
Document without a unique title
Validation is also useful for detecting forbidden or deprecated markup within a document. It is important to avoid deprecated markup in order to properly
separate structure from presentation, thus allowing users to customize presentational features (colors, font sizes, ...) according to their personal needs using only CSS rules without modifying the HTML markup. Deprecated elements like font, b (bold), i (italic), tt (tele-type) or center, as well as forbidden elements not in the HTML specification like blink or marquee, can be spotted (as presentational markup that should be left out in favour of CSS) by either a grammar which does not recognize them as valid elements or with an XSLT rule like the one at table 3. Fortunately, these rules have been considered in all examined tools in a similar way.
Table 3. Deprecated elements no longer in use Sample barrier
Sample corrected
DTD
No such nor ... (in new XHTML versions)
XML Schema
No such nor ... (in new XHTML versions)
XSLT
Deprecated elements no longer in use
3. Accessibility conditions expressible (but not expressed) in a XHTML grammar XHTML is not a single language. Since its birth, it has had several versions, starting from Transitional and Strict [3], launching XHTML Basic [4] and ending by XHTML 1.1. Those different languages have different restrictions, some of them had been previously declared in the 1999’s WCAG [1]. For example, the alt attribute is mandatory since XHTML Transitional 1.0. Deprecated elements like font or center were removed in XHTML Strict 1.0 (but still remains in Transitional). XHTML 1.1 introduced some new rules and removed some deprecated features from XHTML Strict 1.0. XHTML Basic (specifically oriented for small devices) does not allow frames, plugins, both server or client side image maps or nested tables for layout. The refinement process from XHTML 1.0 through XHTML 1.1 has already included some accessibility rules into their successive DTDs and/or XML Schemas. However, some refinements still remain uncompleted. For example, WCAG 1.0 states that all frames should have a mandatory title attribute describing the frame’s purpose. Even though this rule is very similar to the one exposed at
table 1, the XHTML grammar still defines this attribute as optional, instead of required. For this reason, rules from table 4 are required.56
Table 4. Frames without description. The frame’s title attribute is mandatory. Sample barrier
Sample corrected
DTD
However, it is defined as #IMPLIED (not required)
XML Schema
However, it is defined as optional (not required)
XSLT
Frames without description.
Table 5 contains a summary of XHTML prohibitions stated in the prose of XHTML specification [3] but not formally expressed in any XHTML grammar. Such restrictions could be easily expressed formally in the XHTML DTD. However, up to the present moment, they have not been formalized by any rule yet. Table 6 contains sample barriers breaking rules from table 5. Even though those snippets validate the official DTD and Schemas, according to the XHTML specification prose, they are not valid XHTML and some of them involve serious accessibility problems.
Table 5. Restrictions in XHTML specification (prose) not expressed in the DTD Container element Forbidden contents
Condition
pre
img, object, big, small, sub, sup
Always forbidden
a, label
a, area, label, input, select, textarea, button
Always forbidden
form
form
Always forbidden
table
table
In XHTML Basic [4]
ins
Any block-type element
If the ins element is inside an inline-type element
5 6
Unless we locally redefine this rule in our own DTD or XML Schema. A similar approach (DTD or Schema redefinition) could be taken for the usage of the noframes element, because the public DTD does not properly require such element even though it is required to be present as an alternative for frames for browsers that can’t support frames.
Table 6. Sample barriers from XHTML specification (prose) not expressed in the DTD Forbidden elements in
pre Nested
focusable
ele-
...
ments
Nested forms
...
Forbidden insertions
New paragraph.
The best of having checkpoints guaranteed by a grammar is that they are very cheap and easy to detect: just choosing a modern grammar to validate documents against to, and using an existing XML-ized validator, may easily spot where barriers belonging to this category appear within a document.
4. Accessibility conditions not expressible in a XHTML grammar There are some accessibility-related conditions that are not expressible indeed in a DTD or XML Schema. And this happens even though they are conditions concerning the syntax of the documents. Many tools have their own internal implementation of these conditions hard-coded within lines of a program coded in an procedural programming language like Java, C or PHP. This easily leads to complex algorithms which (usually) implement something different from the desirable checkpoint (thus providing different evaluation results). Since they are built as black boxes, people find difficulties on understanding these differences. Approaches like [14] have tried to represent such rules in a set of declarative expressions declared in self-developed XML files that represent rules. However those rules can only be evaluated by a self-developed ad-hoc program. Using already existing software was the aim of the WCAG formalization approach [16]. Translating the formalized approach to XSLT implies not only more precise, but also cheaper evaluation because no specific software needs to be developed indeed. Just an XSLT engine needs to be applied. As a result, we have an XSLT-based declarative ruleset which is smaller, more reusable and easier to understand and homogeneize than the equivalent procedural routines. As a starting example, let’s begin with attributes that enhance accessibility, but whose presence is required only under certain circumstances. For instance, table 7 states that every image input, (i.e., input form fields whose type attribute is image) requires the alt attribute (as any other image). However, it would not be nice to declare the alt attribute as mandatory for every input element, because input form fields having a type attribute different of image don’t really expect such alt attribute. Conditions that indicate whether some markup is mandatory or not, are not expressible in DTDs or XML Schemas.
Table 7. Image inputs without alternative text. The alt attribute is mandatory. Sample barrier
Sample corrected
DTD
NOT FEASIBLE!!
XML Schema
NOT FEASIBLE!!
XSLT
Image input with no alt
In fact, conditions that involve some comparison or calculation over element’s or attribute’s content, are not expressible on DTDs or XML Schemas. Other accessibility-related attributes are also required under certain conditions as well, as depicted at table 8. They are summarized, but they can be used just like the expressions in table 7. For instance, the abbr attribute must be used on any th element having a too long text in order to facilitate efficient text-to-speech transformation. Not all tools check these conditions properly.
Table 8. Attributes conditionally required. Container element Required attribute Condition to be mandatory input
alt
If input’s type is ”image”
html
lang
If document’s xml:lang is missing
th
abbr
If table header’s text is too long
table
summary
If table contains tabular data, i.e, it is not a layout-only table
Conditionally required elements are represented in table 9. For instance, any multimedia resource (object element) should have some alternative contents. In order to have manageable data items, fieldset elements should group form fields sharing common characteristics. When the number of options in a combo box (select element) is high, it is a good idea to group them with the optgroup element. Non-layout tables should have a caption element describing the table ... The expressions from table 9 are summarized.
Some other important accessibility rules don’t really involve existence or absence, but misusage of HTML markup. These rules state that some HTML attributes should not be improperly used. For example, it is a very bad practice to trigger JavaScript routines from the href attribute of active elements like links or client-side map areas. It is much better to use event-focused attributes
Table 9. Elements conditionally required Container el- Required element
Condition to be mandatory
ement object
some alternative inside
If object does not contain alternative text
form
fieldset
If too many form fields
select
optgroup
If too many options
table
caption
If table contains tabular data, i.e, it is not a layout-only table
like onclick for that purpose, and leave the href attribute usable for users that have browsers with no JavaScript support. The XSLT template from table 10 detects this bad practice. (Several Web accessibility evaluation tools do not).
Table 10. Not only JavaScript-based navigation. JavaScript code should appear in event attributes. Sample barrier
Sample corrected
DTD
NOT FEASIBLE!!
XML Schema
NOT FEASIBLE!!
XSLT
Improper place for JavaScript.
Other bad practices involving typically misused attributes are collected in table 11. As we can see, it is bad practice to use client side auto-refresh or autoredirect, unadvised emerging windows, improper keyboard short-cuts or unclear tabulation orders.
Table 11. Attributes typically misused Container el- Misused
Misusage condition
ement
attribute
a, area
href
JavaScript code at href attribute, rather than at onclick at-
meta
http-equiv
Badly used for auto-refresh or auto-redirect
input
alt
If input’s type is not ”image”
* (any tag)
target
Badly used for emerging windows
* (any tag)
tabindex
Not a set of unique consecutive numbers
* (any tag)
accesskey
Not a unique character
tribute
However, the expressive power of XSLT becomes clearer when we leave out these previous simple conditions involving the relationship between elements and their contents and we start looking at relationships among elements placed anywhere in the document. For example, let’s write a rule for “each client-side map area should have a redundant link (somewhere in the document) for those users that cannot use client-side maps”. This implies that, for every area element, at least one link in the same document should share the same destination (i.e. point to the same href). In other words, some link in the document should replicate the href attribute for each area.The XSLT expression from table 12 states that any area whose href attribute is not being replicated by a link somewhere else in the document will be spotted as an accessibility barrier (not suited for people who can’t use maps). Similar templates have been implemented to detect form fields without a label element referring them.
Table 12. A redundant link is required for each client-side map area. Sample barrier
Sample corrected
...
DTD
NOT FEASIBLE!!
XML Schema
NOT FEASIBLE!!
XSLT
Missing redundant area’s link
5. Suitability for handheld devices (W3C’s MobileOK) Specific conditions for handheld devices have recently been defined by the W3C at [5]. Many previous conditions apply for handheld devices, but there are also some specific rules only for this domain, as depicted in table 13. For instance, not only maps, frames, and nested tables are forbidden. The number of external resources is limited for each Web page, and the size of images is mandatory to be specified inside the HTML markup.
6. Conclusions and future work The recently developed XHTML’s DTDs and XML Schemas from W3C could collect some more accessibility-related conditions than currently do. They would
Table 13. W3C’s mobileOK specific conditions Avoid maps
Maps are forbidden
Avoid missing dimen- sions for images
Image dimensions are mandatory
Avoid frames
Frames are forbidden
Avoid nested tables
Nested tables are forbidden
Avoid too many exter- 20”> nal resources
Too many external files
be a great place to formalize syntax-related Web accessibility issues, because many Web authors already know how to validate their documents against those formal grammars. However, DTDs and XML Schemas are not powerful enough to formalize Web accessibility syntax-related rules. Anyway, we found that a declarative language like XSLT could be used to fill in the gap where DTDs and XML Schemas didn’t fit. We started reverse-engineering most Web accessibility evaluation tools [6]. We rewrote in the XSLT rules of WAEX all the checkpoints which were according to the W3C specification, rejecting all checkpoints which were not. Those very few ones out of the scope of XSLT, like CSS parsing, image analysis, color combinations, or JavaScript parsing, have been partially implemented by calling already available server-side tools hosted at W3C [7,8]. We soon became aware of the expressing power of this XSLT approach. We became aware that we could easily express syntax-related conditions from already built Web Accessibility evaluators in a fine and clean way, no matter if they were simple or complex. We also realized that we could express conditions not previously being checked. Thus, we included conditions not previously addressed by previous tools: 1. XHTML conditions from the XHTML specification not previously formalized in the corresponding DTD or Schema [3,4] 2. Conditions from the recently published W3C’s MobileOK Basic test suite [5] (suitable for mobile devices).
Acknowledgements We gratefully acknowledge support from the ITACA TSI-2007-65393-C02-01, ITACA 2 TSI-2007-65393-C02-01 and InCare TSI-2006-13390-C02-01 projects of the “Ministerio de Educaci´on y Ciencia” (Spanish ministry).
References 1. W3C Web Content Accessibility Guidelines 1.0 (Recommendation, May 1999) www.w3.org/TR/WCAG10 2. W3C Techniques and Failures for Web Content Accessibility Guidelines 2.0 -W3C Working Draft 17 May 2007 www.w3.org/TR/2007/WD-WCAG20-TECHS-20070517/ 3. W3C XHTML 1.0 T M 1.0 - The Extensible HyperText Markup Language (Second Edition), A Reformulation of HTML 4 in XML 1.0, W3C Recommendation 26 January 2000, revised 1 August 2002 www.w3.org/TR/xhtml1 4. W3C XHTML T M Basic 1.1 - W3C Candidate Recommendation 13 July 2007 www.w3.org/TR/xhtml-basic 5. W3C W3C mobileOK Basic Tests 1.0 W3C W3C Working Draft 25 May 2007 www.w3.org/TR/mobileOK-basic10-tests 6. W3C Web Accessibility Evaluation Tools: Overview www.w3.org/WAI/ER/tools/Overview.html 7. W3C CSS Validation Service jigsaw.w3.org/css-validator 8. W3C HTTP HEAD service cgi.w3.org/cgi-bin/headers 9. Daniel Veillard The XML C parser and toolkit of Gnome www.xmlsoft.org 10. WebAim Introduction to Web Accessibility www.webaim.org/intro 11. Watchfire WebXACT Accessibility tool webxact.watchfire.com 12. CEAPAT, Fundaci´ on CTIC, Spanish Ministry of Employment and Social Affairs (IMSERSO) Online Web accessibility test www.tawdis.net 13. Fundaci´ on SIDAR Accessibility testing with Style www.sidar.org/hera 14. Jean Vanderdonckt, Abdo Beirekdar, Monique Noirhomme-Fraiture Automated Evaluation of Web Usability and Accessibility by Guideline Review 4th Web Engineering International Conference (ICWE 2004), Munich, Lecture Notes in Computer Science, LNCS 3140, Ed. Springer, ISSN 0302-9743, DOI: 10.1007/b99180, pags. 17-30, Munich, Germany, July 28-30 2004 15. Sourceforge HTML parser and pretty printer in Java jtidy.sourceforge.net 16. Vicente Luque Centeno, Carlos Delgado Kloos, Martin Gaedke, Martin Nussbaumer WCAG Formalization with W3C Standards The 14th International World Wide Web Conference (WWW2005), ISBN 1-59593051-5, pags. 1146-1147 Chiba, Japan, May 11-14, 2005 17. Vicente Luque Centeno WAEX: Web Accessibility Evaluator in a single XSLT file www.it.uc3m.es/vlc/waex.html