Compliance, compatibility, and why some tools are more forgiving of bad PDFs

Excerpt: Unfortunately a lot of PDF files are still being made that don’t comply with the standard. The team at Global Graphics Software have developed their own rules around what Harlequin should do with non-compliant files, and invested many decades of effort in test and development to accept non-compliant files from major applications.

About the author:

Member News

May 13, 2021
by

We added support for native processing of PDF files to the Harlequin RIP® way back in 1997. When we started working on that support we somewhat naïvely assumed that we should implement the written specification and that all would be well. But it was obvious from the very first tests that we performed that we would need to do something a bit more intelligent because a large proportion of PDF files that had been supplied as suitable for production printing did not actually comply with the specification.

Launching a product that would reject many PDF files that could be accepted by other RIPs would be commercial suicide. The fact that, at the time, those other RIPs needed the PDF to be transformed into PostScript first didn’t change the business case.

Unfortunately a lot of PDF files are still being made that don’t comply with the standard, so over the almost a quarter of a century since first launching PDF support we’ve developed our own rules around what Harlequin should do with non-compliant files, and invested many decades of effort in test and development to accept non-compliant files from major applications.

The first rule that we put in place is that Harlequin is not a validation tool. A Harlequin RIP user will have PDF files to be printed, and Harlequin should render those files as long as we can have a high level of confidence that the pages will be rendered as expected.

In other words, we treat both compliance with the PDF standard and compatibility with major PDF creation tools as equally important … and supporting Harlequin RIP users in running profitable businesses as even more so!

The second rule is that silently rendering something incorrectly can be very bad, increasing costs if a reprint is required and causing a print buyer/brand to lose faith in a print service provider/converter. So Harlequin is written to require a reasonably high level of confidence that it can render the file as expected. If a developer opening up the internals of a PDF file couldn’t be sure how it was intended to be rendered then Harlequin should not be rendering it.

We’d expect most other vendors of PDF readers to apply similar logic in their products, and the evidence we’ve seen supports that expectation. The differences between how each product treats invalid PDF are the result of differences in the primary goal of each product, and therefore to the cost of output that is viewed as incorrect.

Consider a PDF viewer for general office or home use, running on a mobile device or PC. The business case for that viewer implies that the most important thing it has to do is to show as much of the information from a PDF file as possible, preferably without worrying the user with warnings or errors. It’s not usually going to be hugely important or costly if the formatting is slightly wrong. You could think of this as being at the opposite end of the scale from a RIP for production printing. In other words, the required level of confidence in accurately rendering the appearance of the page is much lower for the on-screen viewer.

You may have noticed that my description of a viewer could easily be applied to Adobe Reader or Acrobat Pro. Acrobat is also not written primarily as a validation tool, and it’s definitely not appropriate to assume that a PDF file complies with the standard just because it opens in Acrobat. Remember the Acrobat business case, and imagine what the average office user’s response would be if it would not open a significant proportion of PDF files because it flagged them as invalid!