I have a PDF file that I would like to optimize. I am receiving the file from an outside source so I don't have the means to recreate it from the beginning.
When I open the file in Acrobat and query the resources, it says that the fonts in the file take up 90%+ of the space. If I save the file as postscript and then save the postscript file to an optimized PDF, the file is significantly smaller (upwards of 80% smaller) and the fonts are still embedded.
I am trying to recreate these results with ghostscript. I have tried various permutations of options with pswrite and pdfwrite but what happens is when I do the initial conversion from PDF to Postscript, the text gets converted to an image. When I convert back to PDF the font references are gone so I end up with a PDF file that has 'imaged' text rather than actual fonts.
The file contains 22 embedded custom Type1 fonts which I have. I have added the fonts to the ghostscript search path and proved that ghostscript can find them with:
gs \
-I/home/nauc01
-sFONTPATH=/home/nauc01/fonts/Type1 \
-o 3783QP.pdf \
-sDEVICE=pdfwrite \
-g5950x8420 \
-c "200 700 moveto" \
-c "/3783QP findfont 60 scalefont setfont" \
-c "(TESTING !!!!!!) show showpage"
The resulting file has the font correctly embedded.
I have also tried using ghostscript to go from PDF to PDF like this:
gs \
-sDEVICE=pdfwrite \
-sNOPAUSE \
-I/home/nauc01 \
-dBATCH \
-dCompatibilityLevel=1.4 \
-dPDFSETTINGS=/printer \
-CompressFonts=true \
-dSubsetFonts=true \
-sOutputFile=output.pdf \
input.pdf
but the output is usually larger than the input and I can't view the file in anything but ghostscript (adobe reader gives "Object label badly formatted").
I can't provide the original file because they contain confidential information but I will try to answer any questions that need to be answered regarding them.
Any ideas? Thanks in advance.