I am generating one PNG image per page from a PDF file, and the conversion itself works.
The problem is that no matter what settings I use, Ghostscript messes up the font spacing on some pages, so that in the resulting PNGs a single word looks like two or three words.
This matters because I use these files in Evernote, and the broken spacing breaks search. A search for "Providers" returns nothing because in the PNG file the word appears as "Pro vid e rs" (likewise, "Users" appears as "Use rs").
Screenshot showing the original text of the source PDF on the left and the generated PNG on the right: http://dl.dropbox.com/u/13267240/ScreenClip.png
I am new to Ghostscript and am at a loss as to why this is happening.
Here is the command line I am using (in Python):
cmd = "gswin%sc " % (SYS_PROCESSOR_ARCH) + "-q -dNOPAUSE -dBATCH -dPDFFitPage=true -sDEVICE=png16m -r%s " % (PNG_RES) + "-sOutputFile=" + '"%s\\%s-pg-%%d.%s" "%s"' % (outputdir, outputFileNamePrefix, suffix, pdfSourceFile)
Or, as evaluated at runtime:
gswin64c -q -dNOPAUSE -dBATCH -dPDFFitPage=true -sDEVICE=png16m -r300X300 -sOutputFile="C:\EPTK-TMP\02-01-Introduction-pg-%d.png" "C:\EPTK-TMP\02-01-Introduction.pdf"
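For anyone reproducing this, here is a minimal sketch of the same invocation built as an argument list rather than one concatenated string, which avoids the manual quoting and backslash escaping. It assumes Python 3; the helper names build_gs_command and render_pdf_pages are hypothetical and not part of my script, and the Ghostscript switches are exactly the ones shown above. It only tidies how the command is assembled and is not expected to change the spacing in the output.

```python
import ntpath  # Windows path rules, so the result is the same on any OS
import subprocess


def build_gs_command(arch, resolution, output_dir, prefix, suffix, pdf_source):
    """Return the Ghostscript argument list for one-image-per-page output.

    Hypothetical helper: the parameters mirror the variables used in the
    original one-line command (SYS_PROCESSOR_ARCH, PNG_RES, and so on).
    """
    # "%%d" becomes a literal "%d", which Ghostscript replaces with the page number.
    output_name = "%s-pg-%%d.%s" % (prefix, suffix)
    output_pattern = ntpath.join(output_dir, output_name)
    return [
        "gswin%sc" % arch,          # e.g. gswin64c
        "-q",
        "-dNOPAUSE",
        "-dBATCH",
        "-dPDFFitPage=true",
        "-sDEVICE=png16m",
        "-r%s" % resolution,        # e.g. 300X300, as in the command above
        "-sOutputFile=%s" % output_pattern,
        pdf_source,
    ]


def render_pdf_pages(*args):
    """Run Ghostscript with the list form; no shell quoting is needed."""
    subprocess.run(build_gs_command(*args), check=True)
```

Calling build_gs_command("64", "300X300", r"C:\EPTK-TMP", "02-01-Introduction", "png", r"C:\EPTK-TMP\02-01-Introduction.pdf") yields the same arguments as the evaluated command above, minus the surrounding double quotes, which the list form makes unnecessary.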