Efficient Web Accessibility Testing

Summary

Figure: Illustration of technical tools, both analog and electrical (© Photos.com)

This report details the findings from an evaluation of a number of web accessibility tools, also referred to as accessibility checkers. The evaluation was carried out both objectively and subjectively: an extensive set of technical tests was run against each tool, and the results are combined with the subjective findings of expert testers and with the views and experiences of other individuals.

The report shows once more that human inspection is crucial to achieve a high degree of web accessibility, and that a dedicated effort must be made to develop a more modern generation of checkers suitable for the latest standards and recommendations, and tailored for the needs of today's testers, developers, and site owners.

Target group

The target group of this document comprises all those involved in the generation of web content, including writers and other content producers, developers and testers, procurers and buyers, businesses and commercial entities, state institutions and organizations, as well as private individuals.

Editorial

This work is the result of Effitac, a joint project of the consultancy company Unicus AS and the non-profit research institution Norwegian Computing Center (Norsk Regnesentral). Effitac was carried out from April to December 2014, partly with funding from Deltasenteret.

The authors of this work are:

If you find an error or would like to make a suggestion or comment, please send us an email.

Content overview

Introduction

Within a few years, the Web has become a ubiquitous place where we not only spend our spare time consuming content, gathering information, and socializing, but also work, do business, and use public services. It is hence vital that everybody is able to participate in online life.

This was established early on as a universal vision, namely the demand for a Web for all. Besides independence of platform and native language, this also means independence of a user's physical or mental ability. The underlying principle is also referred to as inclusive design and builds to a great extent on the ideas of accessibility and universal design.

Legislation

The ideas of e-inclusion are embedded in legislation both at the international level and, in a number of countries, at the national level.

At the international level, there is the United Nations Convention on the Rights of Persons with Disabilities, which is being ratified by more and more countries all over the world.

In Norway, e-inclusion is addressed in the Anti-Discrimination and Accessibility Act. The law is enforced through the Regulation regarding universal design of information and communication technology (ICT) solutions, which applies to newly developed solutions from July 2014 and to existing solutions from January 2021.

Accessibility criteria

The regulation amending the Anti-Discrimination and Accessibility Act requires, with a few exceptions, conformance with the WCAG guidelines at Level AA. The exceptions concern audio descriptions and alternatives, and live captions. Once these exceptions are excluded, 35 so-called success criteria remain which must be met in order to comply with the aforementioned Regulation.

WCAG

The Web Content Accessibility Guidelines (WCAG) are a W3C recommendation for making web content accessible to people with disabilities. They consist of a checklist with 61 items, all of which are testable. Some of these tests can be carried out automatically, some semi-automatically, and others have to be conducted manually.

Testing web accessibility

Regardless of whether you are aiming at compliance with WCAG or with the Regulation, it is advisable to use web accessibility evaluation tools to ease the burden of testing. A number of accessibility checkers exist at the moment, enabling testers, developers, and content providers to test their content, sites, and webpages. While having many choices is good for competition, the downside is that it becomes difficult to pick the right tool. The tools all have inherent strengths and weaknesses, they vary in quality, and each tool has its own set of limitations.

This document aims to help you choose the proper tool.

Accessibility checkers

We have considered the following list of accessibility checkers:

Note that many more checkers exist but could not be tested given the limited resources of this project.

Testing procedure

We evaluated the accessibility checkers using a two-fold approach.

The first part of the evaluation consisted of an objective assessment, where each tool was confronted with an extensive number of technical tests.

The second part of the evaluation was a subjective assessment, where three testers reported on their experience with a small selection of accessibility checkers and real-life sites.

Please see the respective sections for detailed reports.

Results: Subjective & objective assessments

The results of the objective and the subjective assessment can be found in separate documents.

Work flow considerations

As shown in the objective and subjective evaluations, some tools are more complete than others, but there is currently no tool which suffices in all situations.

Tools are essential when it comes to computationally intensive tasks, such as the calculation of color contrasts for all elements on a page. They are also useful to find duplicate links, labels or id attributes, and they ease the investigation of events and dynamic, i.e. changing, documents.
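To illustrate the kind of calculation a checker performs for every text element, the following is a minimal sketch of the contrast ratio computation as defined in WCAG 2.0 (relative luminance of both colors, then the ratio of the lighter to the darker). The function names are our own and not taken from any of the evaluated tools; the thresholds of 4.5:1 for normal text and 3:1 for large text are the Level AA values. A real checker would additionally have to resolve the effective foreground and background colors of each element, which is not shown here.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as (r, g, b) in 0..255 (WCAG 2.0 definition)."""
    def linearize(channel):
        c = channel / 255.0
        # Piecewise sRGB-to-linear conversion as given in WCAG 2.0
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, ranging from 1 (identical) to 21 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_level_aa(foreground, background, large_text=False):
    """True if the color pair meets the WCAG 2.0 Level AA minimum contrast."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


if __name__ == "__main__":
    white = (255, 255, 255)
    for gray in [(118, 118, 118), (119, 119, 119)]:
        ratio = contrast_ratio(gray, white)
        print(gray, round(ratio, 2), passes_level_aa(gray, white))
```

The two gray values in the usage example lie on either side of the 4.5:1 threshold against a white background, which shows why this check is better left to a tool than to the human eye.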

However, human inspection is necessary when it comes to semantics, for instance to assess if the page title properly describes the page, if a particular element has either not been marked up at all or with a too generic element, or if elements which belong logically together can be grouped inside a proper element.

We therefore recommend the following straightforward testing procedure. First, a test suite should be prepared listing all the tests which must be conducted. Then, each test should be classified with regard to whether it is covered by a particular checker or whether it must be carried out manually. For instance, SortSite can be used to detect empty anchors, but Firefox's Accessibility Extension cannot. Next, run all appropriate automatic tools and gather their results. The number and type of flagged flaws also indicate how much (if any) effort has been put into accessibility, giving further clues for the subsequent steps. Automatic tests can be carried out frequently.

Then start the manual inspection with the remaining tests. Manual tests should involve multiple browsers and multiple assistive technologies such as screen readers. Because it is time-consuming, such expert testing should not be carried out too often, in order to avoid high expenses.

Finally, testing with users of varying abilities and disabilities should be strongly considered in order to verify and complete the findings from the automatic and manual testing. This is the most expensive step and should thus be applied with care.
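The classification step of this procedure can be sketched in a few lines. The example below is only an illustration under stated assumptions: the class AccessibilityTest, the function plan_test_run, and the test names are hypothetical, and the only coverage fact taken from this report is that SortSite detects empty anchors while the Firefox extension does not.

```python
from dataclasses import dataclass, field


@dataclass
class AccessibilityTest:
    """One entry of the test suite (hypothetical structure)."""
    name: str
    covered_by: set = field(default_factory=set)  # names of checkers able to run this test


def plan_test_run(tests, available_checkers):
    """Split a test suite into automatic tests per checker and tests left for manual inspection."""
    automatic = {checker: [] for checker in available_checkers}
    manual = []
    for test in tests:
        usable = sorted(test.covered_by & set(available_checkers))
        if usable:
            # Assign the test to the first usable checker (alphabetical order, an arbitrary choice)
            automatic[usable[0]].append(test.name)
        else:
            manual.append(test.name)
    return automatic, manual


# Illustrative suite; only the "empty anchors" coverage is taken from the report.
suite = [
    AccessibilityTest("empty anchors", {"SortSite"}),
    AccessibilityTest("page title describes the page"),  # semantic check, manual only
    AccessibilityTest("logically related elements are grouped"),  # semantic check, manual only
]

automatic, manual = plan_test_run(suite, ["SortSite", "Firefox Accessibility Extension"])

if __name__ == "__main__":
    print("Automatic:", automatic)
    print("Manual:", manual)
```

Keeping the classification as data makes it easy to rerun the automatic part frequently while the manual list is scheduled less often.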

Concluding remarks

The objective evaluation has shown that more modern web accessibility tools are needed, with support for the latest web technologies, and fit for the purpose of frequent, fast, and exhaustive monitoring. A significant effort must also be put into making the checkers more accurate in order to really relieve testers and developers of the burden of low-efficiency testing. This is not rocket science and can, for instance, start with improved documentation and logging.

As discussed in the detailed sections on the objective and subjective evaluations, previous research has shown that current automatic accessibility checking tools cover at most about 50% of the WCAG 2.0 success criteria, which points to insufficient quality of those checkers.

Human inspection is, however, not problem-free either: even experienced evaluators can produce up to 35% false positives (supposed issues which are not actually accessibility errors) and an equally high number of false negatives (misses), and they rarely agree on more than half of the WCAG 2.0 success criteria.

Last but not least, the limitations of the WCAG 2.0 guidelines regarding validity and testability, and thus also the limitations of the Norwegian Regulation regarding Universal Design of ICT, are well known. In order to offer accessible and usable ICT solutions, mere compliance with WCAG is not enough; it must always be complemented by expert testing and, in particular, user testing.