The top 4 ways to secure your PDF documents.
You have probably already noticed that the COVID-19 pandemic has resulted in a paradigm shift in how government services are provided. Serving citizens effectively online has proved incredibly challenging for many agencies, and government officials and citizens alike have struggled to cope. The result has been the acceleration of many government digital transformation initiatives that, for years, faced significant obstacles such as limited budgets, legacy systems and bureaucratic processes. COVID-19 has shown that modernization is necessary and that the digital transformation of government agencies is essential. On the list of top 10 priorities for state CIOs in 2020, identified by the National Association of State Chief Information Officers, officials rank digital government as well as innovation and transformation through technology as pressing concerns. Digital government jumped from No. 4 in 2019 to No. 2, just behind the perennial No. 1: cybersecurity.
It’s not too surprising to see them both as pressing needs since a crucial part of digital government is being able to share and process information securely. When you consider how much of that information is contained within PDF documents, then you’ll understand why we are so passionate about properly using PDFs for secure and simple digital transformations.
For government agencies, keeping PDF documents secure is a vital step in keeping citizens’ data secure. Just consider how much of your personal information is shared between companies and government entities in the modern world; for example, your name, date of birth and personal identification numbers are stored across digital boarding passes, receipts and archives. These can be saved by companies, health-care entities, banks and more. This information can be exposed online, either unintentionally due to a misconfigured database, or maliciously due to hacking. But don’t worry, there are a number of ways that you can secure documents and data.
At iText, we take security seriously. For over 20 years, the iText PDF library has been used to create and manipulate PDF documents by banks, corporations and governments worldwide. It’s also an essential part of many digital transformation and automation projects, since the latest version of the iText library, iText 7, features a suite of add-ons to enable intelligent data extraction, secure redaction of confidential information, optical character recognition and much more. Whether you need to get data in or out of PDF documents, we’ve got it covered.
All of this means iText is virtually synonymous with PDF, and its reputation for simplicity, power and scalability means it’s seen as “the developer’s choice”. That’s not to say only developers benefit from what iText offers though. iText lies at the heart of many systems that produce and process PDFs for everyday uses, whether you’re talking online banking statements or airline boarding passes (though you might not have seen one of those lately).
There are many options for securing your documents and data, depending on your needs. The options can be layered to create the right level of security. But which ones are right for you? In this next section, we will take a look at four ways to secure your PDF documents, including optional examples using iText, and discuss the pros and cons of each to help you decide what’s right for your agency.
The first option is encryption, which is a simple way to prevent access by unauthorized users: the document is made inaccessible (encrypted) unless the intended recipients have the key or password to unlock the data.
Let’s take a closer look at both password and certificate encryption options:
Password encryption is a simple way to protect information from those without the password. The latest ISO 32000 default for this is 256-bit AES encryption. With password protection, you can create an owner password and set controls on what others can and cannot alter in the document, and then create user passwords that allow anyone who has them to open the document and use whichever permissions you have granted.
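To make the idea of password-based 256-bit AES more concrete, here is a minimal sketch using only the standard Java cryptography classes. It is an illustration of the general technique, not the PDF security handler defined in ISO 32000 and not iText code: the class name, the PBKDF2 key derivation, the iteration count and the salt/IV layout are all assumptions made for this example.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

// Illustrative only: generic password-based AES-256, not the ISO 32000 handler.
final class PasswordEncryptionSketch {
    private static final int SALT_LEN = 16;
    private static final int IV_LEN = 12;
    private static final int ITERATIONS = 100_000; // assumed work factor

    // Derives a 256-bit AES key from the password and salt with PBKDF2.
    private static SecretKeySpec deriveKey(char[] password, byte[] salt)
            throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, 256);
        byte[] key = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec).getEncoded();
        return new SecretKeySpec(key, "AES");
    }

    // Returns salt || iv || ciphertext (the ciphertext includes the GCM tag).
    static byte[] encrypt(char[] password, byte[] plaintext) {
        try {
            SecureRandom random = new SecureRandom();
            byte[] salt = new byte[SALT_LEN];
            byte[] iv = new byte[IV_LEN];
            random.nextBytes(salt);
            random.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, deriveKey(password, salt),
                    new GCMParameterSpec(128, iv));
            byte[] ct = cipher.doFinal(plaintext);
            byte[] out = new byte[SALT_LEN + IV_LEN + ct.length];
            System.arraycopy(salt, 0, out, 0, SALT_LEN);
            System.arraycopy(iv, 0, out, SALT_LEN, IV_LEN);
            System.arraycopy(ct, 0, out, SALT_LEN + IV_LEN, ct.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    // Throws IllegalStateException if the password is wrong or the data was altered.
    static byte[] decrypt(char[] password, byte[] blob) {
        try {
            byte[] salt = Arrays.copyOfRange(blob, 0, SALT_LEN);
            byte[] iv = Arrays.copyOfRange(blob, SALT_LEN, SALT_LEN + IV_LEN);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, deriveKey(password, salt),
                    new GCMParameterSpec(128, iv));
            return cipher.doFinal(blob, SALT_LEN + IV_LEN,
                    blob.length - SALT_LEN - IV_LEN);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("wrong password or corrupted data", e);
        }
    }

    public static void main(String[] args) {
        byte[] blob = encrypt("owner-pass".toCharArray(),
                "Citizen record".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(
                decrypt("owner-pass".toCharArray(), blob), StandardCharsets.UTF_8));
        try {
            decrypt("guess".toCharArray(), blob);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Note that this sketch only covers confidentiality. The owner and user permissions described above are a PDF-level feature and have no counterpart here.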
You can see a demo on our website here of using permissions to allow printing, modifying contents, copying and modifying annotations.
Instead of using a password, you can also use a public-private key pair to encrypt your documents. This is a pair of keys that are linked together by complicated mathematical algorithms. These are so complex and mysterious, in fact, that they’re referred to as “mathemagical” by cryptography experts. One key is, as it says on the tin, public and the other is private. You can use somebody’s public key to encrypt a message (or document) that only the corresponding private key can decrypt. This means that the holder of the private key can open the document and nobody else can, since the private key is meant to be kept, well, private.
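The public-private key idea can be sketched with the standard Java security classes. This is a simplified illustration under stated assumptions, not how certificate encryption is implemented inside a PDF or in iText: the class and method names are hypothetical, the key pair is generated on the spot rather than taken from a certificate, and RSA is applied directly to a short message. Real systems typically encrypt a random symmetric key this way and use that key for the document itself.

```java
import javax.crypto.Cipher;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;

// Illustrative only: RSA-OAEP on a short message with a freshly generated key pair.
final class PublicKeyEncryptionSketch {
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    // Generates a 2048-bit RSA key pair (stand-in for a recipient's certificate keys).
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("key generation failed", e);
        }
    }

    // Anyone holding the public key can encrypt; the message must be short (under ~190 bytes here).
    static byte[] encryptFor(PublicKey recipient, byte[] message) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, recipient);
            return cipher.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    // Only the matching private key can decrypt; any other key causes an exception.
    static byte[] decryptWith(PrivateKey key, byte[] ciphertext) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, key);
            return cipher.doFinal(ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }

    public static void main(String[] args) {
        KeyPair recipient = newKeyPair();
        byte[] ciphertext = encryptFor(recipient.getPublic(),
                "for your eyes only".getBytes(StandardCharsets.UTF_8));
        byte[] plaintext = decryptWith(recipient.getPrivate(), ciphertext);
        System.out.println(new String(plaintext, StandardCharsets.UTF_8));
    }
}
```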
Encryption is well-suited for documents that will be emailed or shared with a manageable number of people that you communicate with and trust not to share the password further. However, when you want to distribute a document widely or archive it, you may want to consider redaction instead. Read on for how redaction works.
The second option is redaction, which is helpful when you want to share the majority of a document’s content but keep personal or classified information from being published. It also suits archiving, where you want to ensure that future users can access the necessary data without exposing the sensitive parts. This security feature removes information from a document entirely, similar to the analog “black bar” method that was used in the days of the photocopy machine. With a PDF, simply adding a black bar over the text hides it in the rendered image of the document, but leaves the underlying text and metadata intact, making it easy for someone to access your sensitive data.
There have been a number of embarrassing redaction failures over the years, and you can read about some of them in this American Bar Association article. To prevent issues like this, iText came up with pdfSweep, an iText 7 add-on that securely removes content you define as sensitive from the document, including the metadata, making the redaction process similar to (but better than!) the analog version.
How does pdfSweep work?
It intervenes as you edit a PDF document with iText 7's document stamping and watermarking tools. After adding a digital "blackout bar" over the sensitive text, image or part of an image, pdfSweep changes the document's rendering instructions, causing the hidden content of your digital document to become impossible to extract. This works for both text and images, affording you full information security.
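The difference between drawing a bar over content and actually removing it can be shown with a deliberately tiny model. The sketch below is not pdfSweep and does not use any PDF library: the `Page` and `TextRun` types, the one-dimensional coordinates and the sample values are all hypothetical, chosen only to show why a cosmetic bar leaves text extractable while true redaction does not.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: a "page" is a list of positioned text runs plus opaque bars.
final class RedactionSketch {
    // Hypothetical stand-in for a run of text at a horizontal position.
    static final class TextRun {
        final String text;
        final int x;
        final int width;
        TextRun(String text, int x, int width) {
            this.text = text;
            this.x = x;
            this.width = width;
        }
    }

    static final class Page {
        final List<TextRun> runs = new ArrayList<>();
        final List<int[]> blackBars = new ArrayList<>(); // each entry is {x, width}
    }

    // Builds a page with a label and a made-up ID number next to it.
    static Page samplePage() {
        Page page = new Page();
        page.runs.add(new TextRun("ID number:", 0, 50));
        page.runs.add(new TextRun("000-00-0000", 60, 40));
        return page;
    }

    // What a text-extraction tool sees: every run still stored, bars ignored.
    static String extractText(Page page) {
        List<String> parts = new ArrayList<>();
        for (TextRun run : page.runs) {
            parts.add(run.text);
        }
        return String.join(" ", parts);
    }

    // Cosmetic "redaction": draws a bar but leaves the content in place.
    static void coverOnly(Page page, int x, int width) {
        page.blackBars.add(new int[] {x, width});
    }

    // True redaction: removes every run overlapping the area, then draws the bar.
    static void redact(Page page, int x, int width) {
        page.runs.removeIf(run -> run.x < x + width && x < run.x + run.width);
        page.blackBars.add(new int[] {x, width});
    }

    public static void main(String[] args) {
        Page covered = samplePage();
        coverOnly(covered, 60, 40);
        System.out.println("Covered only: " + extractText(covered));

        Page redacted = samplePage();
        redact(redacted, 60, 40);
        System.out.println("Redacted:     " + extractText(redacted));
    }
}
```

In this model both pages would look identical on screen, since each has the same bar in the same place. Only the redacted one is safe to publish.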
If you are interested in seeing an example of how pdfSweep works, you can see one on our website here.
Redaction is helpful to protect confidential information in broadly disseminated documents but will not allow you to retrieve the redacted information. If you are looking for something that keeps all information in the document, you can use encryption as we talked about before, or digital signatures, which also allow you to confirm if a document has been tampered with or changed.
A third option is digital signatures, which replace wet ink signatures on digital documents. This concept has been widely adopted and is well integrated into the PDF specification. It is similar to having a notary public’s stamp on a document, ensuring that the signatures are legitimate and the document has not been modified. The digital signature essentially captures the intention of the individual to enter into the contract: a cryptographic hash of the document is signed with the signer’s private key, and anyone can use the matching certificate to confirm the validity of the signed document. The main benefits of using digital signatures include automating and securing your digital document workflow, saving you time, money and headaches. They can be used for official documents, and to show who checked the document on a specific date and time and that the document has not been updated since it was signed.
There may be one or more signatures in a single PDF document. Digital signatures are actually unseen, hashed and encrypted metadata embedded with a certificate in the file. You can also include an optional visible representation of the signature, a signature photo and/or a signing certificate summary to allow all readers to clearly see the document has been signed.
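The three main goals of digital signatures are integrity (the document has not been changed since it was signed), authenticity (the signer is who they claim to be) and non-repudiation (the signer cannot later deny having signed).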
Digital signatures can be used to identify the latest version of a document, to protect against tampering, and to act as a signature for a document, just like wet ink. They are ideal for proving the validity of a document and ensuring that content is unaltered. They are often used for transactions and are widely accepted as secure.
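The tamper-detection property described here can be sketched with the standard Java `Signature` class. This is a minimal illustration of signing and verifying a byte array, not a PDF signature: the class and method names are hypothetical, the key pair is generated on the spot instead of coming from a certificate issued by a trusted authority, and nothing is embedded in a file.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.SignatureException;

// Illustrative only: sign and verify raw bytes with SHA-256 and RSA.
final class SignatureSketch {
    private static final String ALGORITHM = "SHA256withRSA";

    // Generates a 2048-bit RSA key pair (stand-in for a signer's certificate keys).
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("key generation failed", e);
        }
    }

    // Hashes the document and signs the hash with the signer's private key.
    static byte[] sign(PrivateKey signerKey, byte[] document) {
        try {
            Signature signer = Signature.getInstance(ALGORITHM);
            signer.initSign(signerKey);
            signer.update(document);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("signing failed", e);
        }
    }

    // Returns true only if the signature matches both the document and the public key.
    static boolean verify(PublicKey signerKey, byte[] document, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance(ALGORITHM);
            verifier.initVerify(signerKey);
            verifier.update(document);
            return verifier.verify(signature);
        } catch (SignatureException e) {
            return false; // malformed signature bytes count as invalid
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("verification could not run", e);
        }
    }

    public static void main(String[] args) {
        KeyPair signer = newKeyPair();
        byte[] contract = "Pay 100 to the bearer".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(signer.getPrivate(), contract);

        byte[] altered = "Pay 900 to the bearer".getBytes(StandardCharsets.UTF_8);
        System.out.println("Original verifies: "
                + verify(signer.getPublic(), contract, signature));
        System.out.println("Altered verifies:  "
                + verify(signer.getPublic(), altered, signature));
    }
}
```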
For a deeper dive into the subject of digital signatures in PDFs, watch our recorded digital signatures webinar.
Finally, our fourth option is one of the most interesting (and relatively unknown) features added to the PDF format: the ability to create portable collections, more commonly known as PDF portfolios, which enable multiple files to be integrated into a single, secured PDF document.
This isn’t to be confused with the “Combine File” option inside many PDF viewers and software though. This feature offers similar functionality but differs in one major respect. Simply combining files means that all the files will be converted to PDF, whereas creating a PDF portfolio preserves the files in their original file format, and you can edit or modify them in their native application without removing them from the portfolio.
The PDF specification includes features that allow the containment and characterization of arbitrary content (such as files commonly found in email attachments) within the PDF file. As noted in this article from the PDF Association, the PDF standard includes embedded-file, metadata, navigation, data protection and accessibility/reuse features in an ISO-standardized, vendor-independent specification. Just as PDF documents can be a container for other types of data, PDF portfolios are a data container format that enables you to collect many different file types together in a single file.
There are many use cases and applications where PDF portfolios could be ideal. One example is loan application requests where there are forms to fill out and read-only disclosures, or packets for new employees containing information such as health insurance forms and company policy documents in different formats.
They can also be used for non-business applications, such as an art student’s portfolio submission for college. Using a PDF portfolio, students can easily incorporate original images, photographs and videos into a single file without needing to worry about compression artifacts affecting the perception of their work. Unlike a combined PDF, where all files are converted to PDF, files contained within the PDF portfolio remain untouched and can easily be viewed with a supported application.
Crucially, a PDF digital signature applied to the portfolio covers all the files it contains, whatever their file format. Edits to the documents contained within will therefore break the signature, since it covers the whole PDF, including the PDF portfolio and its files.
So, if any of the files in the portfolio are changed or updated then a new digital signature would need to be generated for the portfolio to maintain security for the files contained within. On the other hand, that’s preferable to having to generate digital signatures for all the files it contains. Even if you don’t require the level of security provided by a digital signature, using PDF portfolios is still a great idea. You can set a password for the entire portfolio, or for individual PDFs contained within the portfolio.
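Why one signature can cover every embedded file, and why editing any one of them invalidates it, comes down to the digest the signature is computed over. The sketch below shows only that digest step on a toy container, under loud assumptions: the class name, the map of file names to bytes and the way entries are fed to the hash are invented for this example. A real PDF signature is computed over byte ranges of the PDF file itself, not over this format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: one SHA-256 digest over every file held in a container.
final class PortfolioDigestSketch {
    // Feeds each entry's name and bytes, length-prefixed, into a single digest.
    static byte[] digest(Map<String, byte[]> files) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            for (Map.Entry<String, byte[]> entry : files.entrySet()) {
                byte[] name = entry.getKey().getBytes(StandardCharsets.UTF_8);
                sha256.update(lengthPrefix(name.length));
                sha256.update(name);
                sha256.update(lengthPrefix(entry.getValue().length));
                sha256.update(entry.getValue());
            }
            return sha256.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is unavailable", e);
        }
    }

    // Encodes a length as 4 bytes so entry boundaries are unambiguous.
    private static byte[] lengthPrefix(int length) {
        return ByteBuffer.allocate(4).putInt(length).array();
    }

    public static void main(String[] args) {
        Map<String, byte[]> portfolio = new LinkedHashMap<>();
        portfolio.put("application-form.pdf", "form fields".getBytes(StandardCharsets.UTF_8));
        portfolio.put("disclosure.docx", "read-only terms".getBytes(StandardCharsets.UTF_8));
        byte[] signedDigest = digest(portfolio);

        // Editing any single embedded file changes the digest of the whole container.
        portfolio.put("disclosure.docx", "edited terms".getBytes(StandardCharsets.UTF_8));
        boolean stillMatches = Arrays.equals(signedDigest, digest(portfolio));
        System.out.println("Digest still matches after edit: " + stillMatches);
    }
}
```

Because the signature is tied to this single digest, re-signing after an edit means producing one new signature for the container rather than one per file.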
You can find an article looking at PDF portfolios in more detail, including iText 7 code examples to generate them yourself on our website.
You have a number of security options based on your needs. You can set a password or set a key/certificate to encrypt a document. You can remove the sensitive data and then save or share broadly with redaction. You can confirm that a document has not been tampered with by using digital signatures, while proving the signer is who you think they are, or you can create a secure PDF portfolio of multiple file types.
If you want to learn more, we have some resources that can give you a deeper dive into these options. You can watch the webinar we recorded with the PDF Association on “PDF Security: Encryption and Digital Signatures” for more on the first three options, and check out our blog post on PDF portfolios for more information about them. You can also register for our live webinar “Top 4 ways to secure your PDF documents for government agencies” to explore these topics further.
This content is made possible by our sponsors; it is not written by and does not necessarily reflect the views of e.Republic’s editorial staff.