60
votes

I'm developing an electronic invoicing system, and one of its features is generating PDFs of the invoices and mailing them. We have multiple invoice templates and will create more later, so we decided to use HTML templates: generate an HTML document, then convert it to PDF. But we're facing a problem with wkhtmltopdf: as far as I know (I've been Googling for days to find a solution), we cannot simply use HTML as the header/footer and also show page numbers in it.

In a bug report (or something like it) ( http://code.google.com/p/wkhtmltopdf/issues/detail?id=140 ) I read that this combination is achievable with JavaScript. But no other information on how to do it can be found on that page or elsewhere.

Of course, it doesn't have to be JavaScript: if some CSS magic works with wkhtmltopdf, that would be just as awesome, as would any other hackish solution.

Thanks!

8 Answers

32
votes

To show the page number and total pages you can use this JavaScript snippet in your footer or header code:

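  // wkhtmltopdf passes the values in the query string of the header/footer URL
  var pdfInfo = {};
  var x = document.location.search.substring(1).split('&');
  // Indexed loop, so only the array elements themselves are visited
  for (var i = 0; i < x.length; i++) { var z = x[i].split('=',2); pdfInfo[z[0]] = unescape(z[1]); }
  function getPdfInfo() {
    // Fall back to 1 when a value is missing
    var page = pdfInfo.page || 1;
    var pageCount = pdfInfo.topage || 1;
    document.getElementById('pdfkit_page_current').textContent = page;
    document.getElementById('pdfkit_page_count').textContent = pageCount;
  }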

Then call getPdfInfo from the page's onload handler.

Here pdfkit_page_current and pdfkit_page_count are the ids of the two elements that show the numbers.

Snippet taken from here

86
votes

Actually it's much simpler than with the code snippet. You can add the following argument on the command line: --footer-center [page]/[topage].

As richard mentioned, further variables are listed in the Footers and Headers section of the documentation.
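
If wkhtmltopdf is launched from a script rather than typed by hand, the argument can be passed as part of an argument list. The following is only a sketch: the helper name and the file names invoice.html and invoice.pdf are made up for illustration, and only the --footer-center option and the [page]/[topage] placeholders come from the answer above.

```javascript
// Sketch: build the wkhtmltopdf argument list for a text footer with page numbers.
// buildWkhtmltopdfArgs and the file names are hypothetical.
function buildWkhtmltopdfArgs(inputHtml, outputPdf) {
  return [
    '--footer-center', '[page]/[topage]', // placeholders are replaced by wkhtmltopdf on each page
    inputHtml,
    outputPdf
  ];
}

var args = buildWkhtmltopdfArgs('invoice.html', 'invoice.pdf');
// Show the equivalent command line.
console.log(['wkhtmltopdf'].concat(args).join(' '));
```

In Node.js the array could be handed to child_process.execFile('wkhtmltopdf', args, callback), which bypasses the shell, so the square brackets need no quoting.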

43
votes

Among a few other parameters, the page number and total page number are passed to the footer HTML as query params, as outlined in the official docs:

... the [page number] arguments are sent to the header/footer html documents in GET fashion.

Source: http://wkhtmltopdf.org/usage/wkhtmltopdf.txt

So the solution is to retrieve these parameters with a bit of JS and render them into the HTML template. Here is a complete working example of a footer HTML:

<!doctype html>
<html>
<head>
    <meta charset="utf-8">
    <script>
        function substitutePdfVariables() {

            function getParameterByName(name) {
                var match = RegExp('[?&]' + name + '=([^&]*)').exec(window.location.search);
                return match && decodeURIComponent(match[1].replace(/\+/g, ' '));
            }

            function substitute(name) {
                var value = getParameterByName(name);
                var elements = document.getElementsByClassName(name);

                for (var i = 0; elements && i < elements.length; i++) {
                    elements[i].textContent = value;
                }
            }

            ['frompage', 'topage', 'page', 'webpage', 'section', 'subsection', 'subsubsection']
                .forEach(function(param) {
                    substitute(param);
                });
        }
    </script>
</head>
<body onload="substitutePdfVariables()">
    <p>Page <span class="page"></span> of <span class="topage"></span></p>
</body>
</html>

substitutePdfVariables() is called in body onload. We then get each supported variable from the query string and replace the content in all elements with a matching class name.
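
To try the parsing step outside wkhtmltopdf, the query-string handling can be pulled into a function that takes the string as an argument. This is a minimal sketch: parsePdfVariables is a name chosen here, and the sample query string is invented (real values are supplied by wkhtmltopdf). It is written in ES5 style on the assumption that the WebKit bundled with wkhtmltopdf may not support newer syntax.

```javascript
// Sketch: parse a header/footer query string into a plain object.
function parsePdfVariables(search) {
  var vars = {};
  var pairs = search.replace(/^\?/, '').split('&');
  for (var i = 0; i < pairs.length; i++) {
    if (!pairs[i]) { continue; } // skip empty segments
    var eq = pairs[i].indexOf('=');
    var key = eq === -1 ? pairs[i] : pairs[i].slice(0, eq);
    var raw = eq === -1 ? '' : pairs[i].slice(eq + 1);
    // '+' may stand for a space in query strings, so convert it before decoding
    vars[decodeURIComponent(key)] = decodeURIComponent(raw.replace(/\+/g, ' '));
  }
  return vars;
}

// Hypothetical query string for illustration only.
var sample = '?page=3&topage=12&section=Line%20items';
var parsed = parsePdfVariables(sample);
console.log('Page ' + parsed.page + ' of ' + parsed.topage); // prints "Page 3 of 12"
```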

21
votes

From the wkhtmltopdf documentation (http://madalgo.au.dk/~jakobt/wkhtmltoxdoc/wkhtmltopdf-0.9.9-doc.html) under the heading "Footers and Headers" there is a code snippet to achieve page numbering:

<html><head><script>
function subst() {
  var vars={};
  var x=document.location.search.substring(1).split('&');
  for(var i in x) {var z=x[i].split('=',2);vars[z[0]] = unescape(z[1]);}
  var x=['frompage','topage','page','webpage','section','subsection','subsubsection'];
  for(var i in x) {
    var y = document.getElementsByClassName(x[i]);
    for(var j=0; j<y.length; ++j) y[j].textContent = vars[x[i]];
  }
}
</script></head><body style="border:0; margin: 0;" onload="subst()">
<table style="border-bottom: 1px solid black; width: 100%">
  <tr>
    <td class="section"></td>
    <td style="text-align:right">
      Page <span class="page"></span> of <span class="topage"></span>
    </td>
  </tr>
</table>
</body></html>

Besides page numbers, other variables are available for substitution in headers and footers.

2
votes

A safe approach, even if you are using XHTML (for example, with thymeleaf). The only difference from the other solutions is that the script is wrapped in /*<![CDATA[*/ and /*]]>*/ markers.

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8"/>
    <script>
        /*<![CDATA[*/
        function subst() {
            var vars = {};
            var query_strings_from_url = document.location.search.substring(1).split('&');
            for (var query_string in query_strings_from_url) {
                if (query_strings_from_url.hasOwnProperty(query_string)) {
                    var temp_var = query_strings_from_url[query_string].split('=', 2);
                    vars[temp_var[0]] = decodeURI(temp_var[1]);
                }
            }
            var css_selector_classes = ['page', 'topage'];
            for (var css_class in css_selector_classes) {
                if (css_selector_classes.hasOwnProperty(css_class)) {
                    var element = document.getElementsByClassName(css_selector_classes[css_class]);
                    for (var j = 0; j < element.length; ++j) {
                        element[j].textContent = vars[css_selector_classes[css_class]];
                    }
                }
            }
        }
        /*]]>*/
    </script>
</head>
<body onload="subst()">
    <div class="page-counter">Page <span class="page"></span> of <span class="topage"></span></div>
</body>
</html>

Last note: if using thymeleaf, replace <script> with <script th:inline="javascript">.

2
votes

My example shows how to hide some text on a particular page. In this case the text is shown from page 2 onwards:

<span id='pageNumber'>{#pageNum}</span>
<span id='pageNumber2' style="float:right; font-size: 10pt; font-family: 'Myriad ProM', MyriadPro;"><strong>${siniestro.numeroReclamo}</strong></span>
<script>
    // #pageNumber only carries the page number, so it is always hidden.
    var elem = document.getElementById('pageNumber');
    elem.style.display = 'none';
    // On page 1 hide the text as well; it stays visible from page 2 onwards.
    if (parseInt(elem.innerHTML, 10) <= 1) {
        document.getElementById('pageNumber2').style.display = 'none';
    }
</script>
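
The snippet above gets the page number from a template placeholder. If no such placeholder is available, the same decision could be made from the page query parameter that wkhtmltopdf passes to header/footer documents. This is a sketch under that assumption; shouldShowFromPage is a hypothetical helper, and the id pageNumber2 is reused from the snippet above.

```javascript
// Sketch: decide from the query string whether an element should be visible.
function shouldShowFromPage(search, firstVisiblePage) {
  // [?&] before "page=" keeps "topage=" and "frompage=" from matching.
  var match = /[?&]page=(\d+)/.exec(search);
  var page = match ? parseInt(match[1], 10) : 1; // treat a missing value as page 1
  return page >= firstVisiblePage;
}

// Inside a header/footer document this could drive the display style.
if (typeof document !== 'undefined') {
  var el = document.getElementById('pageNumber2');
  if (el && !shouldShowFromPage(window.location.search, 2)) {
    el.style.display = 'none';
  }
}
```
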
0
votes

Right From the wkhtmltopdf Docs

Updated for 0.12.6.

Footers And Headers:
Headers and footers can be added to the document by the --header-* and --footer-* arguments respectively. In the header and footer text strings supplied to e.g. --header-left, the following variables will be substituted.

  • [page] Replaced by the number of the page currently being printed
  • [frompage] Replaced by the number of the first page to be printed
  • [topage] Replaced by the number of the last page to be printed
  • [webpage] Replaced by the URL of the page being printed
  • [section] Replaced by the name of the current section
  • [subsection] Replaced by the name of the current subsection
  • [date] Replaced by the current date in system local format
  • [isodate] Replaced by the current date in ISO 8601 extended format
  • [time] Replaced by the current time in system local format
  • [title] Replaced by the title of the current page object
  • [doctitle] Replaced by the title of the output document
  • [sitepage] Replaced by the number of the page in the current site being converted
  • [sitepages] Replaced by the number of pages in the current site being converted

As an example, specifying --header-right "Page [page] of [topage]" will result in the text "Page x of y", where x is the number of the current page and y is the number of the last page, appearing in the upper right corner of the document.
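
The effect of the substitution can be illustrated with a few lines of JavaScript. This only mimics the behaviour described above and is not how wkhtmltopdf implements it; substitutePlaceholders and the sample values are invented for the illustration.

```javascript
// Illustration only: replace [name] placeholders in a header/footer text string.
function substitutePlaceholders(template, values) {
  return template.replace(/\[(\w+)\]/g, function (whole, name) {
    // Leave placeholders without a known value untouched.
    return Object.prototype.hasOwnProperty.call(values, name)
      ? String(values[name])
      : whole;
  });
}

console.log(substitutePlaceholders('Page [page] of [topage]', { page: 3, topage: 12 }));
// prints "Page 3 of 12"
```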

Headers and footers can also be supplied with HTML documents. As an example one could specify --header-html header.html, and use the following content in header.html:

<!DOCTYPE html>   
<html>
  <head><script>
    function subst() {
      var vars = {};
      var query_strings_from_url = document.location.search.substring(1).split('&');
      for (var query_string in query_strings_from_url) {
        if (query_strings_from_url.hasOwnProperty(query_string)) {
          var temp_var = query_strings_from_url[query_string].split('=', 2);
          vars[temp_var[0]] = decodeURI(temp_var[1]);
        }
      }
      var css_selector_classes = ['page', 'frompage', 'topage', 'webpage', 'section', 'subsection', 'date', 'isodate', 'time', 'title', 'doctitle', 'sitepage', 'sitepages'];
      for (var css_class in css_selector_classes) {
        if (css_selector_classes.hasOwnProperty(css_class)) {
            var element = document.getElementsByClassName(css_selector_classes[css_class]);
            for (var j = 0; j < element.length; ++j) {
                element[j].textContent = vars[css_selector_classes[css_class]];
            }
        }
      }   
    }
  </script></head>
  <body style="border:0; margin: 0;" onload="subst()">   
    <table style="border-bottom: 1px solid black; width: 100%">
      <tr>
        <td class="section"></td>
        <td style="text-align:right">
          Page <span class="page"></span> of <span class="topage"></span>
        </td>
      </tr>   
    </table>
  </body>
</html>

ProTip

If you are not using certain information like the webpage, section, subsection, subsubsection, then you should remove them. We are generating fairly large PDFs and were running into a segmentation fault at ~1,000 pages.

After a thorough investigation, it came down to removing those unused variables. Now we can generate 7,000+ page PDFs without seeing the segmentation fault.
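
Assuming the tip refers to trimming the list of class names that the script substitutes, a trimmed version could look like the sketch below. The function name substituteUsed is hypothetical, and the document object is passed in as a parameter only so the function can be exercised outside a browser.

```javascript
// Sketch: substitute only the variables the template actually uses.
function substituteUsed(doc, search, usedNames) {
  var vars = {};
  var pairs = search.replace(/^\?/, '').split('&');
  for (var i = 0; i < pairs.length; i++) {
    var kv = pairs[i].split('=');
    if (kv.length < 2) { continue; } // ignore segments without a value
    vars[kv[0]] = decodeURIComponent(kv.slice(1).join('='));
  }
  // Only the listed class names are looked up and filled in.
  for (var n = 0; n < usedNames.length; n++) {
    var elements = doc.getElementsByClassName(usedNames[n]);
    for (var j = 0; j < elements.length; j++) {
      elements[j].textContent = vars[usedNames[n]];
    }
  }
}

// In header.html it could be called as:
// <body onload="substituteUsed(document, window.location.search, ['page', 'topage'])">
```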

-3
votes

The way it SHOULD be done (that is, if wkhtmltopdf supported it) would be using proper CSS Paged Media: http://www.w3.org/TR/css3-gcpm/

I'm looking into what it will take now.