1
votes

I have a system that generates large quantities of PostScript files that each contain multiple, multi-page documents. I want to write a script that takes these large PostScript documents and outputs multiple PDF documents from each.

For example, one PostScript file contains 200 letters to customers, each 10 pages long, so the file has 2000 pages in total. From this one PS file I want to output 200 ten-page PDFs, one for each customer.

I'm thinking Ghostscript is the way to go for this level of document manipulation, but I'm not sure of the best approach. Is there a function in Ghostscript to take 'pages 1-10' of the input PS file? Or do I have to output the entire PS file as 2000 separate PS files (one per page) and then combine them back together again?

Or is there a much simpler way of achieving my goal with something other than Ghostscript?

Many Thanks,
Ben

3 Answers

1
votes

Well, you might first convert the PS into one large PDF (for example by having Ghostscript write to its pdfwrite device), and then "cut" the page ranges you need from the big PDF using pdftk, which would be quite fast.

2
votes

Technically this will be possible in the next release of Ghostscript, or using the HEAD code in the Git repository. It is now possible to switch devices when using pdfwrite which will cause the device to close and complete the current PDF file. Switching back again will start a new one.

Combine this with a BeginPage and/or EndPage procedure in the page device dictionary, and you should be able to do something like what you want.

Caveat: I haven't tried any of this, and it will take some PostScript programming to get it to work.

Because of the nature of PostScript, there is no way to extract the 'N'th page from a file, so there is no way to specify a range of pages.

As lsemi suggests, you could first convert to one large PDF file and then extract the ranges you want. Ghostscript can do the extraction itself with the -dFirstPage and -dLastPage switches (unlike PostScript, it is possible to extract a specific page from a PDF file).
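As a rough sketch of that second approach, the functions below run Ghostscript once per page range against the intermediate PDF. The function names (`page_ranges`, `split_pdf`) and the `letter-N-M.pdf` output names are made up for illustration, and the sketch assumes every letter has the same number of pages; I haven't run it against a real 2000-page file.

```shell
#!/bin/sh
# Print one "first last" pair per output document.
# $1 = total pages in the source PDF, $2 = pages per letter.
page_ranges() {
    total=$1
    per=$2
    first=1
    while [ "$first" -le "$total" ]; do
        last=$((first + per - 1))
        # Clamp the final range if total is not a multiple of per.
        if [ "$last" -gt "$total" ]; then last=$total; fi
        echo "$first $last"
        first=$((last + 1))
    done
}

# Run Ghostscript once per range (output file names are hypothetical).
# $1 = input PDF, $2 = total pages, $3 = pages per letter.
split_pdf() {
    input=$1
    page_ranges "$2" "$3" | while read -r first last; do
        # </dev/null keeps gs from touching the loop's stdin.
        gs -q -o "letter-${first}-${last}.pdf" -sDEVICE=pdfwrite \
           -dFirstPage="$first" -dLastPage="$last" "$input" </dev/null
    done
}
```

For the file in the question this would be called as `split_pdf 2000p.pdf 2000 10`. Note that Ghostscript re-opens the large PDF for every range, so this is likely slower than a tool that only copies pages.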

0
votes

Create the complete PDF file first with the help of Ghostscript:

 gs \
   -o 2000p.pdf \
   -sDEVICE=pdfwrite \
   -dPDFSETTINGS=/prepress \
    2000p.ps

Use pdftk to extract PDF files with 10 pages each:

 # 200 iterations: i = 0, 10, ..., 1990 gives page ranges 1-10, 11-20, ..., 1991-2000
 for i in $(seq 0 10 1990); do
     start=$(( i + 1 ))
     end=$(( start + 9 ))
     pdftk \
        2000p.pdf \
        cat ${start}-${end} \
        output pages---${start}..${end}.pdf
 done
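If the total page count is not known in advance, it can be read from the `NumberOfPages:` line that `pdftk ... dump_data` prints. The following is only a sketch: the helper names `page_count` and `split_by_count` and the output file names are hypothetical, and it still assumes a fixed number of pages per letter.

```shell
#!/bin/sh
# Read `pdftk <file> dump_data` output on stdin and print the page count.
page_count() {
    awk '/^NumberOfPages:/ { print $2 }'
}

# Split a PDF into consecutive chunks of a fixed size (hypothetical helper).
# $1 = input PDF, $2 = pages per chunk.
split_by_count() {
    input=$1
    per=$2
    total=$(pdftk "$input" dump_data | page_count)
    start=1
    while [ "$start" -le "$total" ]; do
        end=$((start + per - 1))
        # Clamp the last chunk if total is not a multiple of per.
        if [ "$end" -gt "$total" ]; then end=$total; fi
        pdftk "$input" cat "${start}-${end}" output "pages-${start}-${end}.pdf"
        start=$((end + 1))
    done
}
```

With the example file this would be run as `split_by_count 2000p.pdf 10`.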

To test this, you can have Ghostscript generate a 2000-page sample PDF for you by first creating a sample PostScript file named '2000p.ps' with these contents:

 %!PS
 /H1 {/Helvetica findfont 48 scalefont setfont .2 .2 1 setrgbcolor} def
 /pageframe  {1 0 0 setrgbcolor 2 setlinewidth 10 10 575 822 rectstroke} def
 /gopageno {H1 300 700 moveto } def
 1 1 2000 {pageframe gopageno
    4 string cvs
    dup stringwidth pop
    -1 mul 0 rmoveto
    show
    showpage} for

and then run this command:

 gs -o 2000p.pdf -sDEVICE=pdfwrite -g5950x8420 2000p.ps