Why PDF matters to Information Governance

Excerpt: This Global Information Governance Day it’s time to focus on how better PDF = better IG. #GIGD

About the author: Peter Wyatt is the PDF Association’s CTO and an independent technology consultant with deep file format and parsing expertise, who is a developer and researcher actively working on PDF technologies …

February 17, 2022
by Peter Wyatt

February 17, 2022 marks the 10th anniversary of Global Information Governance Day (GIGD) and is an ideal time to reflect on the impact that the pandemic and the shift to hybrid and fully-remote workforces have had on record keeping and good information governance practices.

In 2010, Gartner defined Information Governance (IG) as “the specification of decision rights and an accountability framework to encourage desirable behavior in the valuation, creation, storage, use, archival and deletion of information. It includes the processes, roles, standards and metrics that ensure the effective and efficient use of information in enabling an organization to achieve its goals”. Today PDF, and PDF/A in particular, plays a critical role as the standardized digital file format of choice for storing and managing large quantities of an organization’s information.

The last few years saw seismic shifts in working-from-home, remote access, digital transformation, and other challenges to the previous “business as usual” model. As a result, oversight of information flows and processes has become more challenging. In 2022, producers, users, analysts, managers and retainers of information in modern enterprises should now be asking themselves questions such as:

  • “Do we have new sources of information that need to be managed?”
  • “Are we capturing all necessary information from our remote employees?”
  • “Are our new systems configured to make IG easy/faster/better?”
  • “Are we leaking sensitive information and don’t know it?”
  • “Are our IG practices and procedures up-to-date with our new ways of working?”

Today the modern workplace is saturated with web technology. From end-user technology (websites, Google Docs, Microsoft 365) to enterprise systems (transactional, SharePoint, Oracle) to infrastructure (VoIP, IoT, IaaS) and social media, the range of information inputs, flows, products and systems within organizations large and small continues to expand in both volume and diversity. The challenges for producers, users, analysts, managers and retainers of information in modern enterprises continue to multiply. For end users, some pretty fundamental questions are beginning to nag, such as:

  • “When does a web page become a record?”
  • “Does a screen-shot of a web-page constitute a record?”
  • “The website doesn’t look like that anymore!”

PDF persists

Accelerated commitments to web- and cloud-based technologies as part of the pandemic response did not dampen the need for digital documents: paginated, deliverable content that works everywhere, whether on a local computer or in the cloud. Invented before the web was a thing, and first marketed as a replacement for overnight document delivery services, PDF remains the medium of choice for formal documents, graphically-rich content, and any content destined for print. Indeed, PDF is really the ONLY choice; there’s simply no other general-purpose, reliable digital document format, nor any on the horizon. Good old PDF, it seems, is good enough… but is it?

Those marking Global Information Governance Day in their calendars might stop to think about how their organizations could make better use of PDF technology, an under-acknowledged backbone of digital records.

Better PDF = Better IG

Although the practice of records management and information governance generally is gaining sophistication in terms of technologies such as digital signatures, imaging, search and archiving, corporate document policies and implementation are often lacking. GDPR is raising the stakes in this area, but IG has a lot more to offer than simply avoiding liability for mishandling customer data. Organizations that incorporate IG into their business processes benefit from long-term cost reductions and increased reliability and responsiveness in addition to avoidance of GDPR fines and reputational damage.

The ISO 19005 family of standards defines PDF/A as a formal subset of PDF specifically designed to support reliable long-term preservation and archival:

“which provides a mechanism for representing electronic documents in a manner that preserves their static visual appearance over time, independent of the tools and systems used for creating, storing or rendering the files.”

When considering how to optimally manage information in PDF, and especially information that must be preserved for a long time, PDF/A is the preferred standard. If business systems cannot create PDF/A directly, there are many technologies to choose from that can convert plain PDF into PDF/A. Many PDF viewers today are also PDF/A-aware and will warn users before they accidentally make changes or edits that might disrupt reliable record keeping. PDF/A validators (such as the industry-supported veraPDF) support detailed analysis and validation of PDF/A files to ensure that, no matter what system or process created the PDF/A file, it complies with all aspects of the appropriate ISO standard. And because PDF/A is just PDF, it will happily work alongside other PDF files in document and record management systems.
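
As a small illustration of what "PDF/A-aware" can mean in practice: a PDF/A file declares the part and conformance level it claims in its XMP metadata, using the pdfaid:part and pdfaid:conformance properties. The Python sketch below reads that claim with a simple byte scan. The function name claimed_pdfa_level and the sample packet are hypothetical, and the scan assumes the metadata is stored unencoded as UTF-8 text. Reading a claim is not validation; confirming actual conformance still requires a validator such as veraPDF.

```python
import re

# Match the PDF/A identification properties in an XMP packet, written either
# as elements (<pdfaid:part>2</pdfaid:part>) or as attributes (pdfaid:part="2").
_PART = re.compile(rb'pdfaid:part(?:>\s*|\s*=\s*["\'])(\d)')
_CONF = re.compile(rb'pdfaid:conformance(?:>\s*|\s*=\s*["\'])([A-Za-z])')


def claimed_pdfa_level(data: bytes):
    """Return the PDF/A level a file claims (e.g. "2B"), or None.

    This only reads the claim found in the XMP metadata; it does not
    validate the file against ISO 19005.
    """
    part = _PART.search(data)
    if part is None:
        return None  # no PDF/A identification found
    level = part.group(1).decode("ascii")
    conf = _CONF.search(data)
    if conf is not None:
        level += conf.group(1).decode("ascii").upper()
    return level


if __name__ == "__main__":
    # Hypothetical XMP fragment standing in for the bytes of a real file.
    sample = (
        b'<rdf:Description rdf:about="" '
        b'xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">'
        b"<pdfaid:part>2</pdfaid:part>"
        b"<pdfaid:conformance>B</pdfaid:conformance>"
        b"</rdf:Description>"
    )
    print(claimed_pdfa_level(sample))  # prints 2B
```

For a real file, pass the result of reading the file in binary mode. A quick check like this can help route incoming files in a records system, with full validation reserved for files that claim PDF/A.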

Today PDF offers far more than just the reliable static visual appearance needed for record keeping. The format directly supports a wide variety of business uses, including (but not limited to):

  • Digital signature workflows;
  • Review, commenting and approval workflows;
  • Professional redaction workflows;
  • Content reuse and accessibility with rich semantics;
  • Interactive forms;
  • 3D content;
  • Movies and rich media;
  • Geospatial content;
  • As a container format enabling application-specific data to be embedded alongside PDF’s portable appearance;
  • …and lots more!

So if your newly digitally transformed processes are not using these PDF features, you are missing out on a lot of value and efficiency.

How the PDF Association helps foster best-practice in IG

Common understandings are the foundation of PDF technology. The PDF Association has served the community of digital document technology developers since 2006 by providing the meeting-place for discussion and development of Portable Document Format technology. Organizations seeking to offer PDF-related software or services gain the many technical and marketplace benefits of Full or Partner membership. PDF Association members-only Technical Working Groups meet regularly to discuss and advance development of authentication, accessibility, 3D, forms and many other features and capabilities of the Portable Document Format.

Think PDF, and think big

If you are an information governance practitioner, this Global Information Governance Day, think about how much your post-pandemic organization relies on PDF for documents, contracts, invoices, presentations, receipts, reports, case files, archives and in many other roles. Ask yourself: are we creating the best PDFs we can to meet our IG goals? How could we do better to guarantee reliability and authenticity, and to make content easier to find and reuse and more accessible for users with disabilities?

Talk to your PDF technology vendor(s). There are more of them than you think, as all major vendors, including Apple, Google and Microsoft, develop extensive PDF creation and viewing technology. And think big! PDF is extraordinarily capable and flexible. From interactive forms to document imaging, from enterprise search to assistive technology, PDF remains the general-purpose digital document format of choice worldwide for a wide variety of reasons.

So today of all days, it’s time to rethink how your organization can get more from PDF.

In noting #GIGD, Raf Hens, CTO at iText Software, said:

From our experiences with our users over the years, although the initial use case may be limited, we have noticed that requirements and expectations of documents increase over time. For example, if you don’t take data extraction into account in the initial PDF creation use case, it becomes much harder to support that requirement after the fact. That’s why we always strongly recommend taking standards like PDF/A (or PDF/UA) into consideration when generating documents. Since PDF/A documents are self-contained, they are ideal for data storage and archiving purposes.

Read other Global Information Governance Day posts from PDF Association members Orpalis on the latest trends in information governance, and LockLizard on understanding the limitations of encryption in the IG context.