4
votes

I have a PDF file that I would like to optimize. I am receiving the file from an outside source so I don't have the means to recreate it from the beginning.

When I open the file in Acrobat and query the resources, it says that the fonts in the file take up 90%+ of the space. If I save the file as PostScript and then convert the PostScript file to an optimized PDF, the result is significantly smaller (upwards of 80% smaller) and the fonts are still embedded.

I am trying to recreate these results with Ghostscript. I have tried various permutations of options with pswrite and pdfwrite, but when I do the initial conversion from PDF to PostScript, the text gets converted to an image. When I convert back to PDF the font references are gone, so I end up with a PDF file that has 'imaged' text rather than actual fonts.

The file contains 22 embedded custom Type 1 fonts, which I have copies of. I have added the fonts to the Ghostscript search path and confirmed that Ghostscript can find them with:

gs \
 -I/home/nauc01 \
 -sFONTPATH=/home/nauc01/fonts/Type1 \
 -o 3783QP.pdf \
 -sDEVICE=pdfwrite \
 -g5950x8420 \
 -c "200 700 moveto" \
 -c "/3783QP findfont 60 scalefont setfont" \
 -c "(TESTING !!!!!!) show showpage"

The resulting file has the font correctly embedded.

I have also tried using Ghostscript to go from PDF to PDF like this:

gs \
 -sDEVICE=pdfwrite \
 -dNOPAUSE \
 -I/home/nauc01 \
 -dBATCH \
 -dCompatibilityLevel=1.4 \
 -dPDFSETTINGS=/printer \
 -dCompressFonts=true \
 -dSubsetFonts=true \
 -sOutputFile=output.pdf \
  input.pdf

but the output is usually larger than the input, and I can't view the file in anything but Ghostscript (Adobe Reader gives "Object label badly formatted").

I can't provide the original files because they contain confidential information, but I will try to answer any questions about them.

Any ideas? Thanks in advance.

If Acrobat does what you need, I don't understand the desire to recreate it with Ghostscript. Surely Acrobat can do batch conversions. - luser droog
@luserdroog I need to run this in a *nix environment. I only have Acrobat for Windows. I believe Acrobat is available for *nix, but I was hoping I wouldn't have to purchase something for this job, as it will be a temporary solution. I may end up having to purchase it if all else fails. Thanks for the reply. - user791194

2 Answers

2
votes

Don't use pswrite. As you've discovered, it renders text to images. Instead, use the ps2write device, which retains fonts and text.
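
As a rough sketch of the PDF-to-PostScript step with ps2write: the function name `pdf_to_ps`, the `GS` override variable, and the file names are placeholders chosen for illustration, not anything Ghostscript defines. Only the device name and the standard `-dNOPAUSE`, `-dBATCH` and `-sOutputFile` switches come from Ghostscript itself.

```shell
# Convert a PDF to PostScript with the ps2write device, which keeps text as text.
# pdf_to_ps and the GS variable are hypothetical names used for illustration.
pdf_to_ps() {
  input=$1
  output=$2
  # ${GS:-gs} falls back to "gs" unless GS names another Ghostscript binary.
  "${GS:-gs}" \
    -dNOPAUSE -dBATCH \
    -sDEVICE=ps2write \
    -sOutputFile="$output" \
    "$input"
}
```

It would be called as `pdf_to_ps input.pdf output.ps`. Whether the fonts survive intact will depend on the Ghostscript version and on the input file.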

You don't say which version of Ghostscript you are using but I would recommend you use a recent one.

One point: Ghostscript isn't 'optimising' the PDF the way Acrobat does; it's re-creating it. The original PDF is fully interpreted to produce a sequence of operations that mark the page, and pdfwrite (or ps2write) then writes a new file containing only those operations.

If you choose to subset fonts, then only the required glyphs will be included. If the original PDF contains extraneous information (Adobe Illustrator, for example, usually embeds a complete copy of the .ai file) then this will be discarded. This may result in a smaller file, or it may not.

Note that pdfwrite does not support compressed xref and some other later features at present, so some files may well get bigger.

I would personally not go via ps2write, since this just adds another layer of processing and discarding of information. I would just use pdfwrite to create a new PDF file. If you find files for which this does not work (using current code), then you should raise a bug report at http://bugs.ghostscript.com so that someone can address the problem.
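
A minimal sketch of that direct PDF-to-PDF route, assuming a reasonably recent Ghostscript on the PATH. The function name `rewrite_pdf`, the `GS` override variable, and the file names are hypothetical; the switches are the same pdfwrite options used in the question, with the `-d` prefixes in place.

```shell
# Re-create a PDF with pdfwrite, subsetting and compressing embedded fonts.
# rewrite_pdf and the GS variable are hypothetical names used for illustration.
rewrite_pdf() {
  input=$1
  output=$2
  # ${GS:-gs} falls back to "gs" unless GS names another Ghostscript binary.
  "${GS:-gs}" \
    -dNOPAUSE -dBATCH \
    -sDEVICE=pdfwrite \
    -dSubsetFonts=true \
    -dCompressFonts=true \
    -sOutputFile="$output" \
    "$input"
}
```

It would be called as `rewrite_pdf input.pdf output.pdf`. As noted above, the result is a re-created file, so it may come out smaller or larger than the original.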

0
votes

You might want to try the Multivalent Compress tool. It has an (experimental) option to subset embedded fonts that might make your PDF much smaller. It also contains a lot of switches that allow for better compression, sometimes at the cost of quality (JPEG compression of bitmaps, for example).

Unfortunately, the most recent version of Multivalent no longer includes the tools. Search for Multivalent20060102.jar; that version still includes them. To run Compress:

java -classpath /path/to/Multivalent20060102.jar tool.pdf.Compress [options] <pdf file>