8
votes

Does anyone know whether it is possible to export HTML to PDF using the screenshot feature in the Selenium Firefox WebDriver? I have a webpage with print-specific CSS that I need to download automatically. I understand that the screenshot feature captures the page as an image, but I am looking for a scalable PDF file that is suitable for printing.

3 Answers

5
votes

Selenium saves screenshots as PNG, and PNG and PDF are different formats, so Selenium cannot save an image of your HTML page directly as a PDF.

However, you could take the PNG screenshot that Selenium produces and embed it in a PDF.

Check this answer. Basically, you will need a PDF library (such as iText) and code along these lines:

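import java.io.File;
import java.io.FileOutputStream;
import org.apache.commons.io.FileUtils;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;
import com.itextpdf.text.Document;   // iText 5 package names assumed
import com.itextpdf.text.Image;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.pdf.PdfWriter;

// Take screenshot
driver.get("http://www.yourwebpage.com");
File screenshot = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
FileUtils.copyFile(screenshot, new File("screenshot.png"));

// Create the PDF
Document document = new Document(PageSize.A4, 20, 20, 20, 20);
PdfWriter.getInstance(document, new FileOutputStream("my_web.pdf"));
document.open();
// Load the file written above from disk; getResource() would search the classpath instead
Image image = Image.getInstance("screenshot.png");
document.add(image);
document.close();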

Hope it helps!

EDIT

Since web pages can be very tall, you will probably need to check the library's documentation to decide how to scale and position the image in the PDF file.
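To get a feel for the sizing problem, here is a rough sketch of the arithmetic in Python. It assumes an A4 page of 595 x 842 points with the 20-point margins used above, and that the screenshot is scaled to fill the usable page width. The function name `fit_to_page_width` is made up for illustration and is not part of any library.

```python
import math

# A4 page size in PDF points (1 point = 1/72 inch), rounded.
A4_WIDTH_PT, A4_HEIGHT_PT = 595.0, 842.0


def fit_to_page_width(img_w_px, img_h_px, margin_pt=20.0):
    """Return (scale, scaled_height_pt, pages_needed) for a screenshot.

    The image is scaled so its width fills the usable page width;
    pages_needed is how many pages the scaled image would span
    if it were sliced vertically.
    """
    usable_w = A4_WIDTH_PT - 2 * margin_pt
    usable_h = A4_HEIGHT_PT - 2 * margin_pt
    scale = usable_w / img_w_px
    scaled_h = img_h_px * scale
    pages = math.ceil(scaled_h / usable_h)
    return scale, scaled_h, pages
```

If the page count comes out greater than one, the image either has to be shrunk further (and becomes hard to read) or cut into page-sized slices before being added to the document.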

1
votes

A quick and easy way is to build an HTML file and embed the images as base64 data. You can then use any converter to get the document as a PDF.

An example with Python:

from selenium import webdriver

driver = webdriver.Firefox()
driver.get("https://www.google.co.uk")

# Write an HTML file that embeds the screenshot as a base64 data URI
with open(r"C:\temp\captures.html", "w") as f:
    f.write('<!DOCTYPE html><html><head></head><body style="width: 600px">')
    f.write('<img src="data:image/png;base64,')
    f.write(driver.get_screenshot_as_base64())
    f.write('">')
    f.write("</body></html>")

driver.quit()
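
If you need several captures in one document, the HTML-building step can be separated from the browser. This is only a sketch: `build_capture_html` is a hypothetical helper, and it expects a list of base64 strings such as those returned by `driver.get_screenshot_as_base64()`.

```python
def build_capture_html(screenshots_b64, width_px=600):
    """Return an HTML document embedding each base64 PNG as a data URI."""
    parts = [
        "<!DOCTYPE html><html><head></head>",
        '<body style="width: %dpx">' % width_px,
    ]
    for b64 in screenshots_b64:
        # Each screenshot becomes one inline <img>; no external files needed.
        parts.append('<img src="data:image/png;base64,%s">' % b64)
    parts.append("</body></html>")
    return "".join(parts)
```

The returned string can be written to a file and passed to whichever HTML-to-PDF converter you prefer.
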
1
votes

WebDriver does not support an "Export as PDF" function.
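
This may no longer hold on newer setups. Selenium 4 and later expose a `print_page()` method on the Python driver, which returns the printed page as a base64-encoded PDF. A minimal sketch, assuming Selenium 4+ and a browser/driver pair that implements the print command; `save_page_as_pdf` is a hypothetical helper name, and I have not checked how every browser handles print stylesheets here:

```python
import base64


def save_page_as_pdf(driver, pdf_path):
    """Print the driver's current page to PDF and write it to pdf_path."""
    # print_page() returns the PDF as a base64-encoded string (Selenium 4+).
    pdf_b64 = driver.print_page()
    with open(pdf_path, "wb") as f:
        f.write(base64.b64decode(pdf_b64))
    return pdf_path


# Usage (requires a real browser, so it is left commented out):
# from selenium import webdriver
# driver = webdriver.Firefox()
# driver.get("http://www.yourwebpage.com")
# save_page_as_pdf(driver, "my_web.pdf")
# driver.quit()
```

Because this goes through the browser's print path rather than a screenshot, the output should be a real paginated PDF instead of an embedded image.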

If you are not tied to Firefox and WebDriver, PhantomJS could be an alternative. PhantomJS is a headless browser that can render pages to PDF, and it is scripted directly in JavaScript.

Example: http://phantomjs.org/screen-capture.html