How to convert (and flatten) PDF documents to images

Updated: December 31, 2021

Say you have a PDF document in your hands. Say this document needs editing and redacting. You may want to add some bits of information and obscure others. Various PDF programs can do this job for you. However, quite often, the new changes are added as layers on top of the original, so people with the right kind of expertise can glean the data from under the redaction markers.

Some time ago, I published a tutorial showing how to flatten PDF files, which basically means putting all of the changes into a single layer. Now, I want to show you another trick, and this is how to convert PDF files into images. This will create a similar effect - flattening, plus the ability to use (only) parts of information contained in the PDF documents. Our tool of the trade will be pdftocairo. In Linux. Let's commence.

Get the utility, start converting

Pdftocairo is often bundled as part of a larger set of tools called poppler-utils. There's a pretty good chance that your distro already has the utility installed, or if not, it should be available in the repositories. Once we're past this stage, the usage is quite simple - and powerful. For instance, to install pdftocairo in Debian- and Ubuntu-based distributions:

sudo apt-get install poppler-utils
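
If you're not sure whether the utility is already there, a quick check before installing can save you a step. This is only a minimal sketch; the function name have_pdftocairo is my own invention, not part of poppler-utils:

```shell
# Return success if a pdftocairo executable can be found on the PATH.
# (have_pdftocairo is a hypothetical helper name, used only for this example.)
have_pdftocairo() {
    command -v pdftocairo >/dev/null 2>&1
}

if have_pdftocairo; then
    echo "pdftocairo found at: $(command -v pdftocairo)"
else
    echo "pdftocairo not found; install poppler-utils first"
fi
```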

Convert to images

Pdftocairo has an extensive list of options. The most basic usage is:

pdftocairo -"image format" "source" "target"

For instance:

pdftocairo -png www.dedoimedo.com-crash-book.pdf crash-book-images.png

Pdftocairo will then create a series of images, one per page, and add the numeric suffix to your chosen target file name. I tested how well the program works with my Linux Kernel Crash Book, 182 pages long and with roughly 100 images in it. Not a trivial file. Pdftocairo serialized the conversion, at a speed of about 1-2 pages per second. This wasn't too fast, but the operation completed successfully and without errors.
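
With 182 pages, the output can clutter your working directory pretty quickly. Below is a small sketch of how you could send the images into a separate folder. The helper names (output_root, pdf_to_png) and the folder name are assumptions made for illustration. Also, as far as I can tell, pdftocairo treats the target as a root name and appends the page number and the extension itself, so the sketch passes a root without .png:

```shell
# Derive an output root from a PDF path: strip the directory and ".pdf".
# (output_root and pdf_to_png are hypothetical helper names.)
output_root() {
    basename "$1" .pdf
}

# Convert every page of a PDF to PNG files inside a target directory.
# Usage: pdf_to_png SOURCE.pdf [OUTPUT_DIR]
pdf_to_png() {
    src=$1
    outdir=${2:-.}
    mkdir -p "$outdir" || return 1
    # -png selects the image format; the last argument is the root name
    # that the per-page files are based on.
    pdftocairo -png "$src" "$outdir/$(output_root "$src")"
}
```

Something like pdf_to_png www.dedoimedo.com-crash-book.pdf pages should then leave one PNG per page inside the pages folder.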

The fidelity of the conversion is good. The PNG files were all of high quality, including any graphics. At this point, you can do anything you like with the images really. Now, if you don't want to convert entire (large) files, you can convert individual pages or ranges, e.g. from page 1 (first, -f) to page 19 (last, -l):

pdftocairo -svg -f 1 -l 19 www.dedoimedo.com-crash-book.pdf test.svg
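
If you script this, it may be worth checking the page numbers before handing them to pdftocairo. Here is a minimal sketch that does so, using the same -f and -l options but with PNG output. The function names (is_number, valid_range, pdf_range_to_png) and the return code 2 are my own choices, not anything pdftocairo defines:

```shell
# Succeed only if the argument is a non-empty string of digits.
is_number() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
}

# Succeed only if FIRST and LAST are numbers with 1 <= FIRST <= LAST.
valid_range() {
    is_number "$1" && is_number "$2" && [ "$1" -ge 1 ] && [ "$1" -le "$2" ]
}

# Convert pages FIRST..LAST of a PDF to PNG files.
# Usage: pdf_range_to_png SOURCE.pdf FIRST LAST ROOT
pdf_range_to_png() {
    if ! valid_range "$2" "$3"; then
        echo "invalid page range: $2-$3" >&2
        return 2
    fi
    # -f is the first page to convert, -l is the last one.
    pdftocairo -png -f "$2" -l "$3" "$1" "$4"
}
```

For example, pdf_range_to_png www.dedoimedo.com-crash-book.pdf 1 19 test would cover the same range as the command above.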

Bonus: SVG files

Now, best of all, pdftocairo also works with the SVG format. If you have PDF files that have interesting graphics embedded in them, like say logos or diagrams, you may want to recreate these as individual, scalable files. I tried, and the results were pretty good. Fast and elegant.
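
If the graphics you want sit on a handful of scattered pages, one way to get them as individual files is to convert one page at a time. This is only a sketch under a few assumptions: the function name pages_to_svg, the PDFTOCAIRO variable and the output naming are all made up for this example, and I'm assuming that for SVG the target is used as the file name as given:

```shell
# Command to run; set PDFTOCAIRO=echo to preview the commands without converting.
# (The PDFTOCAIRO variable is a convention of this sketch, not of poppler-utils.)
PDFTOCAIRO=${PDFTOCAIRO:-pdftocairo}

# Write each listed page of a PDF to its own SVG file.
# Usage: pages_to_svg SOURCE.pdf PREFIX PAGE [PAGE...]
pages_to_svg() {
    src=$1
    prefix=$2
    shift 2
    for page in "$@"; do
        # Setting -f and -l to the same number selects exactly one page.
        "$PDFTOCAIRO" -svg -f "$page" -l "$page" "$src" "$prefix-$page.svg" || return 1
    done
}
```

For instance, pages_to_svg www.dedoimedo.com-crash-book.pdf diagram 3 7 12 would aim to produce diagram-3.svg, diagram-7.svg and diagram-12.svg.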

Conclusion

Pdftocairo is a simple yet powerful tool. It can help you manage PDF files with ease, allowing you to redact documents or flatten any changes you don't necessarily want shown when you share information. Beyond that, you can also use the program to restore old low-resolution document scans by making SVG files, which could help piece together some of the original information that may not necessarily have been preserved during the scanning process to PDF. This is a separate topic, of course, and no, you can't "magic" information where it doesn't exist. But you can definitely try, and pdftocairo sure makes the experiment pretty straightforward.

Hopefully, this tutorial adds another layer of flexibility to your arsenal of privacy and data conversion tools. Pdftocairo is one of those less-known programs that just sit there, waiting to be used. Definitely a handy utility. I've only touched the basics, but you can also play with scale, resolution, crop and combine images, change transparency, and even work with password-protected PDF files. Well, that would be all for today. Happy flattening. And happy new year!

Cheers.
