2021 in PDF, and what’s coming in 2022
Excerpt: This summary recalls developments in 2021 and looks forward to PDF Association activities in 2022.
About the author: The staff of the PDF Association are dedicated to delivering the information, services and value members have come to expect.
- 1 2021 in PDF, and what’s coming in 2022
- 1.1 ISO standards development
- 1.2 PDF Errata
- 1.3 Improved online resources
- 1.4 Publications
- 1.5 PDF Days
- 1.6 3D PDF comes to the PDF Association
- 1.7 EA-PDF: Archiving email using PDF
- 1.8 SafeDocs
- 1.9 The Arlington PDF Model
- 1.10 The PDF Association on GitHub
- 1.11 New Working Groups
Of course, we all came into 2021 hoping that we’d see the end of COVID, and sadly, that has not happened. As we predicted at the start of the pandemic, however, 2021 has seen a continued acceleration in the use of digital transformation technologies in general, including PDF.
Let’s review some highlights of the past year from the PDF technology and industry perspective, and look forward to 2022.
ISO standards development
In 2021 we continued to deepen the integration of PDF Association working groups with their respective ISO committees. Although in-person ISO meetings weren’t possible, the various interest groups nonetheless advanced towards their objectives. PDF/UA-2 development even accelerated as we welcomed new faces to the work, including those focusing on STEM publishing. ISO standards development status is now tracked for easy reference. In other 2021 developments, we presented a proposal to ISO on enhanced management of normative references, and negotiated an arrangement with ANSI to allow the PDF Association to sell ISO standards directly, including a (modest) discount for PDF Association members.
Coming in 2022: Publication of the first extensions to PDF 2.0.
PDF Errata
Originally launched in early 2021 after the publication of ISO 32000-2:2020, PDF 2.0 errata is now the canonical location for industry-agreed solutions to all publicly reported errata in any PDF 2.0 based ISO standard – including PDF/A-4, PDF/X-6, PDF/VT-3 and ECMAScript for PDF 2.0. Backed by a public GitHub repository for capturing issues, the PDF Technical Working Group meets regularly to discuss reported errata and provide timely feedback to the entire PDF industry. All developers – not just members of the PDF Association – are encouraged to post their concerns and monitor the resolutions to ensure a common understanding and seamless interoperability.
Coming in 2022: With ISO TC 171 SC 2 recently agreeing to adopt the PDF Association’s errata process, the industry-agreed resolutions will be reviewed and approved by the responsible ISO committee, giving developers more certainty.
Improved online resources
In 2021 we reorganized the resources available to members and the public on pdfa.org in various ways. We added a technical index with links to all ISO standards, subsets and industry support materials. We overhauled the Feature Support pages, providing a means for members to express their support for the latest ISO standards through their listings in the product showcase on pdfa.org. We added a glossary of PDF terminology to provide end-users with an authoritative reference, and a webstore enabling the PDF Association to sell ISO standards directly, including a (modest) discount for members. A new “call to action” feature at the top of every page on pdfa.org creates new opportunities for members to drive traffic to their own products, services, and thought-leadership content. Together with the Solution Agent service for end users, these changes reflect how hard the PDF Association is working to provide marketing value to its members.
Coming in 2022: an expansion of the Feature Support concept to include all PDF subsets and to identify specific features, and a new presentations archive to help bring PDF Days and other presentations to users’ attention. We’ll also be improving the newsletter, driving more social engagement and enhancing our outreach. This year we’ll also be reaching out to every member, in person, to start a dialog about how we can help you.
Publications
With a variety of other publications continuing in development, 2021 saw new publications in several categories:
- A new PDF 2.0 Application Note on the Use of Object Metadata Streams
- Version 1.1 of the Matterhorn Protocol’s tests for PDF/UA conformance
- The initiation of industry-approved errata to ISO 32000-2:2020 and the PDF 2.0-related standards for PDF/A-4, PDF/X-6, PDF/VT-3 and ECMAScript for PDF 2.0
Coming in 2022: Publishing projects that will reach fruition in the next year include PDF 2.0 examples, PDF/VT Application Notes and the first sets of examples and documentation for formal PDF accessibility techniques developed by the PDF Accessibility LWG.
PDF Days
In 2021, much to our chagrin, COVID-19 derailed the in-person PDF Days event for the second year in a row. All sessions presented in the online-only replacements, OctoberPDFest in 2020 and this year’s PDF Days Online 2021, are now freely available as YouTube playlists.
Coming in 2022: We have started planning for a major in-person event in Europe for Q3; we look forward to announcing a location and other details in Q1 of the new year. Between now and this event, additional webinars across a range of technical topics are also planned – if you have a topic you’d like to learn more about please get in touch by emailing firstname.lastname@example.org.
3D PDF comes to the PDF Association
Having taken over PDF’s ISO standards program from the 3D PDF Consortium in 2020, in 2021 the PDF Association took on the balance of the Consortium’s core mission of promoting industry standards for 3D PDF technology. The result is the 3D PDF TWG and 3D PDF User LWG, both led by Boeing’s Stuart Galt. LWG meetings started in November 2021. We encourage PDF Association members with an interest in the manufacturing space, particularly the aerospace, automotive and AEC industries, to monitor and participate in these working groups.
End users of 3D PDF technology with an interest in contributing towards development of standards in this area should contact us to get involved with the 3D PDF LWG. Such users may be interested in the freshly-added 3D PDF showcase that includes demonstrations of 3D PDF technology as provided by PDF Association members.
Coming in 2022: 3D PDF TWG meetings begin in January. We anticipate a new Implementation Forum round based on discussions taking place in the 3D PDF User LWG. Members who wish to follow these discussions should log in to pdfa.org and join these working groups!
EA-PDF: Archiving email using PDF
After collaborating with archival, government and academic stakeholders in 2019 to develop a set of functional requirements for full-featured, interoperable archiving of email using PDF 2.0 technology, in 2021 the PDF Association contracted with the University of Illinois to develop a technical specification for interoperable capture of email to PDF.
Coming in 2022: The EA-PDF LWG, co-chaired by PDF Association CTO Peter Wyatt and Associate Dean Chris Prom of the University of Illinois, will be meeting every two weeks for the next 12-18 months to develop the EA-PDF specification.
SafeDocs
In 2021 the PDF Association continued its participation in DARPA’s fundamental research program focussing on PDF technology. CTO Peter Wyatt, the PDF Association’s Principal Investigator supporting this research, provided an update on that program’s activities and accomplishments. More recently, SafeDocs identified a long-standing ambiguity in PDF related to inline image abbreviated keys.
Coming in 2022: The PDF Association will continue its work on Phase 2 of this program. In addition to improvements to existing transitions such as the “Issue Tracker” stressful corpus, we are partnering with NASA-JPL to help transition their work on a “PDF Observatory” for both on-premises and cloud deployment.
The Arlington PDF Model
Developed under the DARPA SafeDocs program and first released in 2021 by PDF Association CTO and ISO 32000 Project Leader Peter Wyatt, the Arlington PDF Model lays the first stepping-stone to an authoritative machine-readable PDF specification. With every PDF object in the latest PDF specification encoded as a TSV file, “Arlington” is the only vendor- and implementation-independent, specification-derived, machine-readable model of the entire PDF document object model (DOM).
This resource is freely available from GitHub, and is already in use by organizations implementing preflight and other analysis software.
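To give a feel for what a TSV-encoded object definition makes possible, here is a minimal Python sketch. The sample rows and the column names (`Key`, `Type`, `Required`) are illustrative assumptions rather than content copied from the Arlington repository, and the two helper functions are hypothetical; consult the repository itself for the actual file layout.

```python
import csv
import io

# Hypothetical excerpt in the spirit of a per-object TSV definition.
# The column names and rows are assumptions for illustration only.
SAMPLE_TSV = (
    "Key\tType\tRequired\n"
    "Type\tname\tTRUE\n"
    "Pages\tdictionary\tTRUE\n"
    "Lang\tstring\tFALSE\n"
)


def load_object_model(tsv_text):
    """Parse TSV text into a mapping of key name -> row of attributes."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["Key"]: row for row in reader}


def missing_required_keys(model, pdf_dict):
    """Return the required keys (per the model) absent from pdf_dict, sorted."""
    return sorted(
        key
        for key, row in model.items()
        if row["Required"] == "TRUE" and key not in pdf_dict
    )


model = load_object_model(SAMPLE_TSV)

# A stand-in for a parsed PDF dictionary, represented as a plain Python dict.
candidate = {"Type": "Catalog", "Lang": "en-US"}
print(missing_required_keys(model, candidate))
```

Because the rules live in data rather than in code, the same small checker could in principle be pointed at any object definition, which is the kind of reuse a machine-readable model is meant to enable.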
Coming in 2022: The Arlington PDF Model will be extended to cover more object integrity relationships, information from previous PDF specifications and other sub-grammars necessary to parse PDF. The proof-of-concept applications will be improved to further demonstrate the power that a machine-readable model can provide. In addition, formalised work on the PDF “Chain of Trust” related to trustworthy parsing of incremental updates and cross-reference tables prior to parsing PDF objects will be developed with other SafeDocs researchers.
The PDF Association on GitHub
Established in 2021 and now maintained by CTO Peter Wyatt, the PDF Association’s professional GitHub presence covers a wide variety of subjects across the PDF ecosystem. It’s an essential free resource for all PDF developers, with PDF Association members involved in PDF technical working groups also having access to private repositories to support their work. Visit github.com/pdf-association for more information.
Coming in 2022: Finalized output from several PDF technical working groups, assets from the SafeDocs research program, improvements to the Arlington PDF Model, and new PDF 2.0 example files will be updated and made freely available.
New Working Groups
All PDF Association members can join and participate in any working group; any member can propose a new group as well! Non-members with expertise in respective fields are eligible to participate in Liaison Working Groups (LWGs), and should get in touch by emailing email@example.com to learn more.
In response to requests from members, several new Working Groups launched in 2021, joining the new 3D PDF TWG and LWG. These are:
LaTeX Project LWG
In 2021, the LaTeX Project’s desire to automatically generate tagged PDF from LaTeX resulted in a request for the PDF Association to host a Liaison Working Group to help contextualize and accelerate this work. Beyond the LaTeX Project LWG’s own work, its members have been contributing to tagged PDF development across the many working groups focusing on this subject.
Rich Media TWG
PDF 2.0 added a new RichMedia annotation type, but did not provide much guidance for implementers. The Rich Media TWG, created in 2021, will examine the audio and video capabilities of this annotation type, determine best practice for configuring RichMedia annotations, and identify and capture all the implicit assumptions necessary to display these annotations correctly into the distant future.