PDF on GAAD: because it’s the content you keep

Excerpt: Tagged PDF is becoming increasingly common. On #GAAD 2022, PDF/UA’s ISO Project Leader takes stock of the progress and highlights what remains.


About the author: Duff Johnson is a veteran of the electronic document technology marketplace. He has founded or led several software and services businesses in the electronic document industry since 1996.

May 19, 2022
by Duff Johnson


May 19, 2022 is the 11th Global Accessibility Awareness Day (GAAD). The purpose of GAAD is to celebrate and inspire awareness of the significance of access to digital content for hundreds of millions of people with disabilities worldwide.

How do we use GAAD to its best effect? On the one hand, it’s an opportunity to acknowledge the very real progress in accessibility technology over the decades, and especially since digital content became commonplace. First and foremost, however, GAAD exists to remind us to reflect on how our practices can serve to either include or exclude a wide range of people.

As chair of the ISO TC 171 SC 2 PDF/UA committee (and its precursor at AIIM) since 2004, Duff Johnson, the CEO of the PDF Association, has led efforts in developing and standardizing digital accessibility for almost two decades. PDF/UA-1, published as ISO 14289-1 in 2014, is considered the “gold standard” for accessible PDF, and the US federal government recognizes it as a “preferred” format, over and above “plain PDF”. The reason is straightforward: PDF/UA conformance represents an authentic – and challenging – test of whether the content of a given PDF file is usefully available to assistive technology.

Unlike HTML, PDF’s origin lies in delivering a precise visual representation of the author’s intent. The familiar objects of logical structure (paragraphs, headings, tables, lists, etc.) are typically taken for granted (if often abused) in HTML, but are entirely optional in PDF.

At the core of PDF/UA are the PDF format’s features known as logical structure and Tagged PDF. Logical structure (ISO 32000, 14.7) provides for the organization of paginated content into a hierarchical tree structure, while Tagged PDF (ISO 32000, 14.8) defines PDF’s structure types (akin to HTML’s <p>, <h1> and other tags). The PDF format does not require that either logical structure or Tagged PDF be present in a PDF file; “plain PDF” is (and must remain) possible. However, if accessibility is a requirement, as it is for many PDF use cases, then these features are essential.
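To make the idea of a hierarchical tree of typed elements concrete, here is a minimal Python sketch. It is not a PDF parser and does not reflect any particular library's API: the `StructElem` class and the `walk`, `headings` and `render` helpers are hypothetical names invented for illustration. Only the structure type names (Document, H1, P, L, LI) are taken from Tagged PDF's standard structure types, and the tree is simplified (real list items, for example, typically nest further elements such as Lbl and LBody).

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class StructElem:
    """Hypothetical, simplified stand-in for one node of a logical structure tree."""
    struct_type: str                  # structure type, e.g. "Document", "H1", "P"
    text: str = ""                    # content attached to this element, if any
    children: List["StructElem"] = field(default_factory=list)


# Heading structure types, analogous to HTML's <h1>..<h6>.
HEADING_TYPES = {"H", "H1", "H2", "H3", "H4", "H5", "H6"}


def walk(elem: StructElem, depth: int = 0) -> Iterator[Tuple[int, StructElem]]:
    """Yield (depth, element) pairs in depth-first order."""
    yield depth, elem
    for child in elem.children:
        yield from walk(child, depth + 1)


def headings(root: StructElem) -> List[Tuple[str, str]]:
    """Collect (type, text) for every heading, as assistive technology might
    do to offer heading-by-heading navigation."""
    return [(e.struct_type, e.text)
            for _, e in walk(root)
            if e.struct_type in HEADING_TYPES]


def render(root: StructElem) -> str:
    """Render the tree as an indented outline, one element per line."""
    lines = []
    for depth, elem in walk(root):
        label = f"<{elem.struct_type}>"
        if elem.text:
            label += f" {elem.text}"
        lines.append("  " * depth + label)
    return "\n".join(lines)


# A tiny example tree (illustrative content only).
doc = StructElem("Document", children=[
    StructElem("H1", "PDF on GAAD"),
    StructElem("P", "Tagged PDF is becoming increasingly common."),
    StructElem("L", children=[
        StructElem("LI", "PDF/UA Technical Working Group"),
        StructElem("LI", "PDF Accessibility LWG"),
    ]),
])
```

Calling `render(doc)` produces an indented outline of the tree, and `headings(doc)` returns only the heading elements. A “plain PDF” has no such tree at all, which is why software has nothing equivalent to walk when a file is untagged.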

Today, the need for not only “tagged PDF” but “properly-tagged PDF” is increasingly recognized. End users typically accomplish this by, for example, applying appropriate styles in a word processor. Beyond software that knows how to make Tagged PDF, accessibility requires authors to use the available tools to indicate their content’s semantics (headings, paragraphs, lists, tables, etc.) when creating documents.

Some major vendors, including Microsoft and Apple, now export tagged PDF by default; in Apple’s case, without even an “off” switch! On this GAAD, therefore, we can celebrate the fact that Tagged PDF is increasingly commonplace.

PDF/UA support remains the gold standard… and the acid test. Even 8 years after ISO 14289-1:2014 was published, few PDF viewers or PDF-aware assistive technology fully support PDF/UA-1. It’s an immensely complex domain, even for PDF, a complex file format, because Tagged PDF is not even native to PDF’s imaging model. Even so, members of the PDF Association are rising to that challenge.

At present, PDF Association members from around the world are engaged in a variety of industry Working Groups developing content intended for standardization or as some form of industry-agreed best practice.

Our current initiatives include:

PDF/UA Technical Working Group

One of the most active PDF Association TWGs, this group meets twice weekly along with the ISO-designated editing committee of ISO TC 171 SC 2 WG 9 assigned to develop PDF/UA-2, which is PDF/UA for PDF 2.0. Its current objective is to deliver PDF/UA-2 to publication sometime in 2023.

PDF Accessibility LWG

As with other Liaison Working Groups, the PDF Accessibility LWG is open to subject matter experts who are not PDF Association members. The mission of this group is to develop a broad set of “pass” and “fail” techniques for accessible PDF consistent with those developed by W3C to support its Web Content Accessibility Guidelines (WCAG). This effort is ongoing, with publication beginning in 2022.

LaTeX Project LWG

The LaTeX Project LWG allows the developers of LaTeX, the mainstream typesetting system for academic and technical documents, to work directly with PDF technology experts to transfer understanding of complex authoring requirements to tagged PDF technology, and gather PDF subject matter expertise for developing tagged PDF support within LaTeX.

Best Practice development

Technology is the (relatively) easy part of PDF accessibility. Helping document authors to choose the right tools and use them correctly to produce accessible results… that’s harder, but the state of affairs is improving.

To help end users grapple with accessibility requirements, the PDF Association publishes vendor-neutral guidance developed by its volunteer member committees. The latest versions of this guidance include:

The Matterhorn Protocol – ISO standards cost money and PDF/UA includes a lot of arcane detail. Matterhorn provides an algorithm that distills the essence of PDF/UA into a freely-available list of all the possible ways to fail PDF/UA conformance. The 1.1 edition was posted in April 2021.

Tagged PDF Best Practice Guide: Syntax – Published in 2019, this guide is intended for developers implementing support for PDF/UA, but it’s also useful for remediators seeking to understand the finer points of tagging.

PDF/UA-1 Reference Suite – Updated to version 1.1 in September 2020, this set of PDF files models conformance to PDF/UA-1 across a variety of types of content.

IAAP certification exam development – The PDF Association assisted the International Association of Accessibility Professionals in the development of its exam materials relating to PDF technology.

Accessibility and the standardization process itself

Through the PDF Association’s role as committee manager of ISO TC 171 SC 2, its staff are examining ISO documents and processes according to the principles described in ISO’s freely-available Guide 71, and will report their findings to ISO Central Secretariat this summer.

PDF is ubiquitous; so should be accessible PDF

“As the document format of record, PDF technology has a special obligation to make it easy for authors to do the right thing and choose accessible PDF,” says Duff Johnson, PDF Association CEO and ISO’s Project Leader for ISO 14289 (PDF/UA). “There are many ways in which an industry association can work to foster attention to accessibility within its industry,” Johnson adds. “PDF, in particular, has a special obligation to meet the general document technology needs of creators and consumers worldwide. Fortunately, the industry is hard at work addressing these challenges.”

What can PDF Association members involved with PDF/UA and remediation do on GAAD? Get involved in the online conversation, follow #GAAD and explain that ISO-standardized PDF/UA-1 exists to support universal accessibility. If you are hosting an event on May 19, 2022, register it for free on the GAAD website.