0
votes

I converted a PDF to a TIFF image with ImageMagick, and the file size went from 500 KB to 4.6 MB.

The problem is that the resulting TIFF image is not good: some of the text is hard to read.

This is the simple command I ran on the command line:

convert \
pph.pdf \
pph-psd.tiff

PDF Scanned Image : PDF Scanned Image

Tiff Image : TIFF IMAGE

Why does this happen, and how can I convert a scanned PDF to a high-resolution TIFF that is best suited for OCR?


2 Answers

2
votes

This happens because ImageMagick is a raster image processor, and it has rasterised your PDF on its default 72 dpi grid, which is too coarse for your needs. You need to set a higher density before the input file, so that it applies when the PDF is rasterised:

convert -density 288 input.pdf -compress lzw result.tiff
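To see why the default is too coarse, here is a rough sketch that compares the pixel width of an 8.5 inch wide page at 72 dpi with a higher density, then runs the conversion. The file names and the 300 dpi value are my assumptions (300 dpi is a commonly suggested starting point for OCR, not something taken from the question), so adjust them to your scan:

```shell
# Sketch only: the file names and the 300 dpi target are assumptions.
input="pph.pdf"
output="pph-300dpi.tiff"
density=300

# Pixel width of an 8.5 in wide page at the default and at the chosen density.
default_width=$(( 72 * 85 / 10 ))
target_width=$(( density * 85 / 10 ))
echo "8.5 in page width: ${default_width} px at 72 dpi, ${target_width} px at ${density} dpi"

# Run the conversion only if ImageMagick and the input file are available.
if command -v convert >/dev/null 2>&1 && [ -f "$input" ]; then
    # -density comes before the input so it is used while rasterising the PDF.
    convert -density "$density" "$input" -compress lzw "$output"
fi
```

A higher density gives more pixels per character, at the cost of a larger file; LZW compression is lossless, so it does not blur the text.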

You may be better off installing Poppler tools and using its pdfimages tool to extract the images.
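A minimal sketch of the pdfimages route, assuming Poppler is installed and the scan is stored in the PDF as embedded images. The input name is taken from the question and the output prefix `pph-page` is a made-up placeholder:

```shell
# Sketch only: the output prefix is a hypothetical name.
input="pph.pdf"
prefix="pph-page"

if command -v pdfimages >/dev/null 2>&1 && [ -f "$input" ]; then
    # List the embedded images first to see their pixel size and resolution.
    pdfimages -list "$input"
    # Write each embedded image out as a TIFF file named after the prefix.
    pdfimages -tiff "$input" "$prefix"
    status="extracted"
else
    echo "pdfimages or $input not found; skipping extraction"
    status="skipped"
fi
```

Because this copies the scanned images out of the PDF instead of re-rendering the page, the result keeps whatever resolution the scanner originally produced.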

-2
votes

If you want, you can try Coolutils TotalPDFConverter, which worked for me.