1
votes

Text is missing after converting this PDF to an image (PNG or JPG), but no error is logged.

With ImageMagick: convert -density 150 -quality 100 "d:/t/pdf/fp.pdf" -alpha Remove "d:/t/pdf/5/fp.png"

With Ghostscript (tested with versions 9.23 and 9.25): gswin64 -dSAFER -dBATCH -dNOPAUSE -r300 -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -sDEVICE=jpeg -sOutputFile=D:\t\pdf\123.jpg D:\t\pdf\fp.pdf

Does anyone know the cause and how to fix it? Thanks.

PDF File for testing


2
I am using ImageMagick 6.9.10.12 Q16 on Mac OS X with Ghostscript 9.24 and JPEG 90 (9c). Your PDF has transparency, and JPG does not support transparency. This command works fine for me: convert -density 150 fp.pdf -background white -alpha background -quality 100 fp.jpg. I do not know Windows that well, but it appears that you are using Unix / in place of Windows \ for your paths. What version of ImageMagick are you using? What is your version of libjpeg? It could be your version of JPEG. Have you tried saving to PNG? - fmw42
I'm testing this conversion on Windows 10 x64 Professional with ImageMagick-7.0.8-Q16 and CORE_RL_jpeg_.dll 1.5.2. - Jack.Shang
First, in IM 7, you should use magick rather than convert, and not magick convert either. Second, check your JPEG delegate version with magick -list format and look at the right of the line starting with JPEG/JPG. Is it 90 or something else? Next, IM 7 is less forgiving about command syntax order: the proper syntax is to set the density before reading the PDF and the quality after reading it. Try magick -density 150 fp.pdf -background white -alpha background -quality 100 fp.jpg. Does that work? If not, then try saving to PNG. Does that work? - fmw42
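The option ordering described in the comment above can be sketched as a small Python helper that assembles the argument list. The function name build_magick_args and its parameters are hypothetical, chosen only for illustration; the flags themselves are the ones quoted in the comment. This is a sketch of the ordering, not a guaranteed fix for the missing text.

```python
from typing import List


def build_magick_args(pdf_path: str, out_path: str,
                      density: int = 150, quality: int = 100) -> List[str]:
    """Assemble an ImageMagick 7 command line (hypothetical helper).

    Input settings such as -density go BEFORE the PDF is read;
    output settings such as -quality go AFTER it.
    """
    return [
        "magick",
        "-density", str(density),   # applied when the PDF is rasterized
        pdf_path,
        "-background", "white",     # fill colour for transparent areas
        "-alpha", "background",     # as given in the comment above
        "-quality", str(quality),   # applied when the output is written
        out_path,
    ]
```

The resulting list can be passed to subprocess.run(build_magick_args("fp.pdf", "fp.jpg"), check=True) on a machine where magick is on the PATH.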

2 Answers

2
votes

There are two CIDFonts (STSong-Light and AdobeKaitiStd-Regular) used but not embedded. This means that a substitute font must be used. When run through Ghostscript this produces the following transcript:

GPL Ghostscript GIT PRERELEASE 9.26 (2018-09-13)
Copyright (C) 2018 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 2.
Page 1
Can't find CID font "AdobeKaitiStd-Regular".
Attempting to substitute CID font /Adobe-GB1 for /AdobeKaitiStd-Regular, see doc/Use.htm#CIDFontSubstitution.
The substitute CID font "Adobe-GB1" is not provided either. attempting to use fallback CIDFont.See doc/Use.htm#CIDFontSubstitution.
Loading a TT font from %rom%Resource/CIDFSubst/DroidSansFallback.ttf to emulate a CID font Adobe-GB1 ... Done.
Can't find CID font "AdobeKaitiStd-Regular".
Attempting to substitute CID font /Adobe-GB1 for /AdobeKaitiStd-Regular, see doc/Use.htm#CIDFontSubstitution.
Can't find CID font "AdobeKaitiStd-Regular".
Attempting to substitute CID font /Adobe-GB1 for /AdobeKaitiStd-Regular, see doc/Use.htm#CIDFontSubstitution.
Loading NimbusSans-Regular font from %rom%Resource/Font/NimbusSans-Regular... 7135536 5791889 4867288 3488798 3 done.
Can't find CID font "STSong-Light".
Attempting to substitute CID font /Adobe-GB1 for /STSong-Light, see doc/Use.htm#CIDFontSubstitution.
Loading NimbusMonoPS-Regular font from %rom%Resource/Font/NimbusMonoPS-Regular... 10713600 9353422 4987912 3610458 3 done.
   **** Error: Executing Do inside a text block, attempting to recover
               Output may be incorrect.
>>showpage, press <return> to continue<<

So you can see two fonts being substituted, and then a more concrete problem: your PDF file executes an image operator inside a text block, which is illegal. For me, however, the output appears to be correct.

[EDIT] There is some odd behaviour here. I downloaded the 64-bit release code last night and tried that, and I do see the error. The back channel transcript contains this:

Can't find CID font "AdobeKaitiStd-Regular".
Attempting to substitute CID font /Adobe-GB1 for /AdobeKaitiStd-Regular, see doc/Use.htm#CIDFontSubstitution.
Loading NimbusSans-Regular font from %rom%Resource/Font/NimbusSans-Regular... 7720460 6369217 2670672 1276767 3 done.
   **** Error: can't process embedded font stream,
        attempting to load the font using its name.
               Output may be incorrect.
   **** Error reading a content stream. The page may be incomplete.
               Output may be incorrect.
Loading NimbusMonoPS-Regular font from %rom%Resource/Font/NimbusMonoPS-Regular... 11808228 10439970 2690872 1310356 3 done.
   **** Error: Executing Do inside a text block, attempting to recover
               Output may be incorrect.
   **** Error: File did not complete the page properly and may be damaged.
               Output may be incorrect.
Page 2

The key part is "can't process embedded font stream". That's why your text is going missing.

When I run the same command line using the current HEAD of our Git repository I don't see this error, and the file runs to completion. So it looks like this was a bug which has been fixed.

You have two options:

1) If you don't mind building the code, clone our Git repository, open the Visual Studio solution file, allow VS to update it to your version, then build Ghostscript. Use that binary.

2) As you've already discovered, don't use SAFER. I should caution you that this is a potentially dangerous setup. As long as you are processing files which you created yourself you should be fine, but please don't use this setup to process random files from untrusted sources. You could be laying yourself open to attack.

[edit 2]

And here's a third option. With 9.25 we started shipping the Resource files with Windows, just as we do with Linux. I suspect that if you add -I"c:/program files/gs/gs9.25/Resource/Init" to the beginning of your command line, it will work even when -dSAFER is true.
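As a rough sketch of that third option, the helper below assembles a Ghostscript argument list with the -I search path placed first and -dSAFER kept on. The function name build_gs_args and its parameters are hypothetical; the switches and the Resource/Init path are the ones quoted in this thread, and the path must match the local installation. Whether this restores the missing text is untested here.

```python
from typing import List, Optional


def build_gs_args(pdf_path: str, output_pattern: str,
                  resource_dir: Optional[str] = None,
                  safer: bool = True, resolution: int = 150,
                  exe: str = "gswin64") -> List[str]:
    """Assemble a Ghostscript command line (hypothetical helper)."""
    args = [exe]
    if resource_dir:
        # -I adds a directory to Ghostscript's search path; it is placed
        # first, as suggested in the answer.
        args.append("-I" + resource_dir)
    if safer:
        args.append("-dSAFER")
    args += [
        "-dBATCH", "-dNOPAUSE",
        "-r{}".format(resolution),
        "-sDEVICE=jpeg",
        "-sOutputFile=" + output_pattern,  # e.g. fp%03d.jpg, one file per page
        pdf_path,
    ]
    return args
```

Passing the list to subprocess.run avoids shell quoting problems with the space in "program files".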

BTW, it's useful to quote the messages from the back channel when you have a problem. They may not tell you much, but they contain useful information for PostScript developers.
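One way to collect those messages when Ghostscript is driven from a script is to capture its output and pull out the lines marked "**** Error", as in the transcripts above. This is a minimal sketch: the names extract_gs_errors and run_and_report are hypothetical, and it assumes the messages arrive on stdout or stderr of the process.

```python
import re
import subprocess
from typing import List

# In the transcripts above, problems are flagged with a "**** Error" prefix.
ERROR_RE = re.compile(r"^\s*\*{4}\s*(Error.*)$")


def extract_gs_errors(transcript: str) -> List[str]:
    """Return the text of every '**** Error...' line in a transcript."""
    errors = []
    for line in transcript.splitlines():
        match = ERROR_RE.match(line)
        if match:
            errors.append(match.group(1).strip())
    return errors


def run_and_report(args: List[str]) -> List[str]:
    """Run a Ghostscript command line and return its error lines."""
    proc = subprocess.run(args, capture_output=True, text=True)
    # Scan both streams so nothing is lost if the messages move.
    return extract_gs_errors(proc.stdout + "\n" + proc.stderr)
```

Only the first line of each message is kept; continuation lines such as "Output may be incorrect." are dropped, so post the full transcript when asking for help.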

0
votes

The missing text came back when I removed the parameter -dSAFER. I don't understand why; I can't find the reason in the Ghostscript documentation.

This is my final command line:

gswin64 -dBATCH -dNOPAUSE -r150 -sDEVICE=jpeg -sOutputFile=D:\t\pdf\6\fp%03d.jpg D:\t\pdf\fp.pdf