29
votes

What are fast and reliable ways for converting a PDF into a (single) JPEG using the command line on Linux?

4
If you build xpdf from source, it comes with small utilities such as pdftotext, pdftojpeg, and pdftohtml. They might be distributed with some Linux distros, but they don't seem to be in the Debian I'm using. - Alan Corey
Sorry, they're in poppler-utils: pdfdetach, pdffonts, pdfimages, pdfinfo, pdfseparate, pdfsig, pdftocairo, pdftohtml, pdftoppm, pdftops, pdftotext, and pdfunite. Or build xpdf from source, I'm pretty sure. - Alan Corey

4 Answers

41
votes

You can try ImageMagick's convert utility.

On Ubuntu, you can install it with this command:

$ sudo apt-get install imagemagick

Use convert like this:

$ convert input.pdf output.jpg
# For good quality use these parameters
$ convert -density 300 -quality 100 in.pdf out.jpg

32
votes

For the life of me, over the last 5 years, I have not been able to get ImageMagick to work consistently (if at all), and I don't know why people keep recommending it. I just googled how to convert a PDF to a JPEG today, found this answer, tried convert, and it doesn't work at all for me:

$ convert in.pdf out.jpg
convert-im6.q16: not authorized `in.pdf' @ error/constitute.c/ReadImage/412.
convert-im6.q16: no images defined `out.jpg' @ error/convert.c/ConvertImageCommand/3258.
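
A likely cause of the "not authorized" error is the security policy that many Debian and Ubuntu ImageMagick packages ship, which disables the PDF coder. The sketch below only looks for such an entry; the helper name show_pdf_policy is made up for illustration, and the default path /etc/ImageMagick-6/policy.xml is an assumption that may differ on your system.

```shell
# Print any ImageMagick policy entries that mention the PDF coder.
# Assumption: the policy file is /etc/ImageMagick-6/policy.xml (typical for
# ImageMagick 6 on Debian/Ubuntu); pass another path as the first argument.
show_pdf_policy() {
    policy_file=${1:-/etc/ImageMagick-6/policy.xml}
    if [ ! -r "$policy_file" ]; then
        echo "cannot read $policy_file" >&2
        return 1
    fi
    # Matches pattern="PDF" as well as lists such as pattern="{PS,EPS,PDF,XPS}".
    grep -i 'pattern="[^"]*PDF' "$policy_file"
}
```

Running show_pdf_policy and seeing a line with rights="none" would suggest the policy, not the PDF itself, is what blocks convert.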

Then, I remembered there was another tool I use and wrote about, so I googled "linux convert pdf to jpg Gabriel Staples", clicked the first hit, and scrolled down to my answer. Here's what works perfectly for me. This is the basic command format:

pdftoppm -jpeg -r 300 input.pdf output 

The -jpeg option sets the output image format to JPEG, -r 300 sets the output resolution to 300 DPI, and the word output is the filename prefix for the page images, which are numbered and written to your current working directory. A better way, in my opinion, is to run mkdir -p images first to create an "images" directory, then set the output prefix to images/pg so that all output images are placed cleanly in the images directory you just created, with the prefix pg in front of each page number.

Therefore, here are my favorite commands:

  1. [Produces ~1MB-sized files per pg] Output in .jpg format at 300 DPI:

     mkdir -p images && pdftoppm -jpeg -r 300 mypdf.pdf images/pg
    
  2. [Produces ~2MB-sized files per pg] Output in .jpg format at highest quality (least compression) and still at 300 DPI:

     mkdir -p images && pdftoppm -jpeg -jpegopt quality=100 -r 300 mypdf.pdf images/pg
    
  3. If you need more resolution, you can try 600 DPI:

     mkdir -p images && pdftoppm -jpeg -r 600 mypdf.pdf images/pg
    
  4. ...or 1200 DPI:

     mkdir -p images && pdftoppm -jpeg -r 1200 mypdf.pdf images/pg
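
The commands above write one file per page. If only a single JPEG of one page is wanted, as the question asks, pdftoppm's -f, -l, and -singlefile options can be combined. This is a minimal sketch; the function name pdf_page_to_jpeg and its argument order are made up for illustration.

```shell
# Render one page of a PDF to exactly one JPEG, with no page-number suffix.
# Usage: pdf_page_to_jpeg INPUT.pdf OUTPUT_PREFIX [PAGE] [DPI]
pdf_page_to_jpeg() (
    pdf=$1
    prefix=$2
    page=${3:-1}    # pdftoppm counts pages from 1
    dpi=${4:-300}
    [ -f "$pdf" ] || { echo "no such file: $pdf" >&2; exit 1; }
    # -f/-l pick the first and last page to render;
    # -singlefile drops the numeric suffix, so the result is PREFIX.jpg.
    pdftoppm -jpeg -r "$dpi" -f "$page" -l "$page" -singlefile "$pdf" "$prefix"
)
```

For example, pdf_page_to_jpeg mypdf.pdf images/cover 1 300 should leave a single images/cover.jpg, provided the images directory already exists.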
    

See the references below for more details and options.

References:

  1. [my answer] Convert PDF to image with high resolution
  2. [my answer] https://askubuntu.com/questions/150100/extracting-embedded-images-from-a-pdf/1187844#1187844

Keywords: ubuntu linux convert pdf to images; pdf to jpeg; pdf to tiff; pdf2images; pdf2tiff; pdftoppm; pdftoimages; pdftotiff; pdftopng; pdf2png

8
votes

libvips can convert PDF -> JPEG quickly. It comes with most Linux distributions, it's in Homebrew on macOS, and you can download a Windows binary from the libvips site.

This will render the PDF to a JPG at the default DPI (72):

vips copy somefile.pdf somefile.jpg

You can use the dpi option to set some other rendering resolution, e.g.:

vips copy somefile.pdf[dpi=600] somefile.jpg

You can pick out pages like this:

vips copy somefile.pdf[dpi=600,page=12] somefile.jpg

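Or render five pages starting from page=3 like this (libvips numbers pages from zero, so this starts at the fourth page, and the pages are joined top to bottom into one tall image):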

vips copy somefile.pdf[dpi=600,page=3,n=5] somefile.jpg

The docs for pdfload have all the options.
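
One shell detail worth noting: the square brackets in the filename suffix are glob characters, so an unquoted argument can be rewritten by the shell if a matching file exists (and zsh rejects it outright when nothing matches). A small wrapper that quotes the suffix might look like the sketch below; the name vips_page_to_jpeg and its defaults are made up for illustration.

```shell
# Render one PDF page with libvips, quoting the [options] suffix so the
# shell never treats the brackets as a glob pattern.
# Usage: vips_page_to_jpeg INPUT.pdf OUTPUT.jpg [PAGE] [DPI]
vips_page_to_jpeg() (
    pdf=$1
    out=$2
    page=${3:-0}    # libvips numbers pages from 0
    dpi=${4:-300}
    vips copy "${pdf}[dpi=${dpi},page=${page}]" "$out"
)
```

For example, vips_page_to_jpeg somefile.pdf somefile.jpg 12 600 runs the same command as the page=12 example above, with the argument quoted.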

With this benchmark image, I see:

$ /usr/bin/time -f %M:%e convert -density 300 r8.pdf[3] x.jpg
276220:2.17
$ /usr/bin/time -f %M:%e pdftoppm -jpeg -r 300 -f 3 -l 3 r8.pdf x.jpg
91160:1.24
$ /usr/bin/time -f %M:%e vips copy r8.pdf[page=3,dpi=300] x.jpg
149572:0.53

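So, on this test at least, libvips is about 4x faster than ImageMagick's convert and needs roughly half the memory. Compared with pdftoppm, it is about 2.3x faster but uses about 1.6x the memory.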
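
For reference, the ratios can be recomputed from the three measurements above. With GNU time, %M is the peak resident memory in KB and %e is the elapsed time in seconds. The function name bench_ratios is just for illustration.

```shell
# Ratios relative to vips, using the figures printed by /usr/bin/time above.
bench_ratios() {
    awk 'BEGIN {
        printf "convert  vs vips: %.1fx time, %.1fx memory\n", 2.17 / 0.53, 276220 / 149572
        printf "pdftoppm vs vips: %.1fx time, %.1fx memory\n", 1.24 / 0.53, 91160 / 149572
    }'
}
bench_ratios
```

A memory ratio below 1 means the tool used less memory than vips on this file.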

3
votes

ImageMagick's convert seems to do a good job:

convert file.pdf test.jpg

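and in case multiple files were generated (one per page, named test-0.jpg, test-1.jpg, and so on), list them in page order followed by a single -append:

convert test-0.jpg test-1.jpg ... -append one.jpg

to generate a single file in which all pages are stacked top to bottom.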
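
With ten or more pages, a plain glob such as test-*.jpg sorts test-10.jpg before test-2.jpg, so the pages would be joined out of order. The sketch below lists the files in numeric order first and then passes them to convert with ImageMagick's -append operator (single dash). It assumes GNU ls (for -v) and file names without newlines; the function name append_pages is made up for illustration.

```shell
# Stack PREFIX-0.jpg, PREFIX-1.jpg, ... top to bottom into one image.
# Usage: append_pages PREFIX OUTPUT.jpg
append_pages() (
    prefix=$1
    out=$2
    # ls -v (GNU) sorts by embedded number, so page 2 comes before page 10.
    pages=$(ls -v -- "$prefix"-*.jpg) || exit 1
    # Split the listing on newlines only, and do not re-expand globs.
    IFS='
'
    set -f
    # $pages is unquoted on purpose so each file becomes its own argument.
    convert $pages -append "$out"
)
```

For example, append_pages test one.jpg joins every test-N.jpg in the current directory into one.jpg.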