1
votes

I am using Ghostscript to convert PDF files to TIFF and later to JPG. How can I increase the quality of the output files without creating files of ~50 MB per page?

Here are my current parameters:

-q -sPAPERSIZE=a4 -dSAFER -dNOPAUSE -dTextAlphaBits=4 -sDEVICE=tiff24nc -sOutputFile=Output.tif -r300 Input.pdf -c quit

And here is an example from a converted TIFF:

[image: excerpt from the converted TIFF]

Don't convert to TIFF and then to another image format. Convert directly to JPG. - Kevin Brown
I would like to create the JPG from the PDF later. I need the TIFF for Tesseract (OCR) and the JPG to send to my clients over a REST service. The picture shown is part of the TIFF document. - Cazzador
It depends on what you perceive to be the quality problem. 300 dpi is rather low resolution. If you want anti-aliasing (which you clearly do, since you have set TextAlphaBits), then use DownScaleFactor instead and render the pages at 300 * DownScaleFactor. The anti-aliasing result is better than with the old AlphaBits parameters, though it takes longer. Or explain what you mean by 'quality'. - KenS
Thanks for the tip, KenS, this helps me convert the files with better quality. - Cazzador
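
For illustration, KenS's suggestion could be sketched roughly as follows. This is only a sketch under assumptions: the factor of 2 is an arbitrary choice, the file names are the ones from the question, and it assumes the installed Ghostscript's tiff24nc device accepts -dDownScaleFactor as the comment implies (if it does not, the tiffscaled24 device may be worth trying). TextAlphaBits is dropped because the downscaling provides the anti-aliasing.

```shell
#!/bin/sh
# Target output resolution and integer downscale factor (both illustrative).
target_dpi=300
factor=2

# Render internally at target_dpi * factor; Ghostscript scales the page back down.
render_dpi=$((target_dpi * factor))

# Build the argument list. Input.pdf and Output.tif are the names from the question.
set -- -q -dSAFER -dNOPAUSE -dBATCH \
    -sDEVICE=tiff24nc \
    "-r${render_dpi}" "-dDownScaleFactor=${factor}" \
    -sOutputFile=Output.tif Input.pdf

# Print the command; run it only if gs is installed and the input file exists.
echo "gs $*"
if command -v gs >/dev/null 2>&1 && [ -f Input.pdf ]; then
    gs "$@"
fi
```

Higher factors give smoother edges at the cost of rendering time and memory, so it is probably best to start small and compare the results.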

1 Answer

0
votes

You could mitigate the worst of the TIFF size issues by adding either -sCompression=pack or -sCompression=lzw to your command line; then you could raise the resolution.
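
As a rough sketch of what that might look like (Compression is a string parameter in Ghostscript's TIFF devices, so it is passed with -s; the file names are taken from the question, and lzw is just one of the two options mentioned):

```shell
#!/bin/sh
# Lossless TIFF compression scheme; "pack" (PackBits) is the other option mentioned.
compression=lzw

# Same device and resolution as in the question, plus compression.
set -- -q -dSAFER -dNOPAUSE -dBATCH \
    -sDEVICE=tiff24nc \
    "-sCompression=${compression}" \
    -r300 \
    -sOutputFile=Output.tif Input.pdf

# Print the command; run it only if gs is installed and the input file exists.
echo "gs $*"
if command -v gs >/dev/null 2>&1 && [ -f Input.pdf ]; then
    gs "$@"
    ls -l Output.tif   # check the resulting file size
fi
```

How much this saves depends on the page content; scanned pages usually compress far less than pages with large flat areas.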

If the aim of the TIFF is to produce the best clarity for OCR'ing, I'd question the utility of using text anti-aliasing, though.

Finally, another option to consider would be to use PNG, which is also lossless. The png16m Ghostscript device should give results comparable to tiff24nc.
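
A minimal sketch of the PNG variant, under a couple of assumptions: the output name Output-%03d.png is made up for this example (PNG holds one page per file, so the %03d placeholder lets Ghostscript write one numbered file per page), and 300 dpi is simply carried over from the question.

```shell
#!/bin/sh
# png16m writes 24-bit RGB PNG, one file per page.
# The output name pattern is hypothetical; %03d is replaced by the page number.
set -- -q -dSAFER -dNOPAUSE -dBATCH \
    -sDEVICE=png16m \
    -r300 \
    -sOutputFile=Output-%03d.png Input.pdf

# Print the command; run it only if gs is installed and the input file exists.
echo "gs $*"
if command -v gs >/dev/null 2>&1 && [ -f Input.pdf ]; then
    gs "$@"
fi
```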