I'm looking for a way to save a full web page with Selenium and Python using a headless browser. The saved page should be identical to how the page appears when opened in a browser, just like the browser's "Save as..." feature.

I tried this code snippet by Andersson (https://stackoverflow.com/a/42900364) and it works fine, but I want to use a headless browser instead. Is this possible?

import ahk
from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary

# Path to the local Firefox executable
firefox = FirefoxBinary("C:\\Program Files (x86)\\Mozilla Firefox\\firefox.exe")

driver = webdriver.Firefox(firefox_binary=firefox)
driver.get("http://www.yahoo.com")

# Drive the native "Save As" dialog with AutoHotkey (requires a visible window)
ahk.start()
ahk.ready()
ahk.execute("Send,^s")
ahk.execute("WinWaitActive, Save As,,2")
ahk.execute("WinActivate, Save As")
ahk.execute("Send, C:\\path\\to\\file.htm")
ahk.execute("Send, {Enter}")

1 Answer

You can try the following code.
It uses headless Chrome in this example.

import io
from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("--headless")
# Change the chromedriver path accordingly.
# This is the Selenium 3 call style; Selenium 4 takes service=Service(path) instead.
driver = webdriver.Chrome("driver/chromedriver.exe", options=options)
driver.get("https://stackoverflow.com")
driver.implicitly_wait(10)
html = driver.page_source
# The "with" block closes the file automatically, so no explicit close() is needed
with io.open(driver.title + ".html", "w", encoding="utf-8") as f:
    f.write(html)
driver.quit()
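
One caveat with using driver.title as the file name: a page title can contain characters such as | or : that Windows does not accept in file names, in which case opening the file fails. Below is a minimal sketch of a helper that cleans the title first. The name safe_filename and its parameters are hypothetical and are not part of Selenium.

```python
import re

# Characters that Windows rejects in file names, plus ASCII control characters.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_filename(title, extension=".html", fallback="page", max_length=100):
    """Turn a page title into a file name that is usable on Windows and POSIX."""
    # Replace every forbidden character with an underscore.
    name = _INVALID_CHARS.sub("_", title)
    # Collapse whitespace runs and drop leading/trailing spaces and dots,
    # which Windows does not allow at the end of a name.
    name = re.sub(r"\s+", " ", name).strip(" .")
    # Fall back to a fixed name when nothing usable is left.
    if not name:
        name = fallback
    # Truncate long titles, then append the extension.
    return name[:max_length].rstrip(" .") + extension
```

In the code above, the file could then be opened with io.open(safe_filename(driver.title), "w", encoding="utf-8").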

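After successful execution, the HTML file is saved in the directory the script is run from. Note that page_source contains only the HTML of the current DOM; unlike the browser's "Save As" (complete page) feature, it does not download linked images, stylesheets, or scripts.
Note: Change the chromedriver path to match your setup.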
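
If the goal is a saved copy that also carries images and stylesheets, one option in headless Chrome is to ask the browser for an MHTML snapshot through the Chrome DevTools Protocol. This is only a sketch under some assumptions: the driver must be Chromium-based (so that it provides execute_cdp_cmd), Page.captureSnapshot is marked experimental in the protocol, and the result is a single .mhtml file rather than the .htm file plus resource folder that "Save As" produces. The function name save_as_mhtml is hypothetical.

```python
def save_as_mhtml(driver, path):
    """Write the current page, with its embedded resources, to an MHTML file."""
    # Page.captureSnapshot is a Chrome DevTools Protocol command; Selenium's
    # Chromium-based drivers expose it through execute_cdp_cmd.
    snapshot = driver.execute_cdp_cmd("Page.captureSnapshot", {"format": "mhtml"})
    # newline="" writes the snapshot's line endings unchanged.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(snapshot["data"])
    return path
```

With the driver from the answer, the call would be save_as_mhtml(driver, "page.mhtml"), placed before driver.quit(). The resulting file can be opened directly in Chrome.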