0
votes

I am writing a script to mass-download PDF files in a loop. The Firefox-based Selenium WebDriver got stuck after the first download, so I decided to try Chrome.

With Chrome the loop now works, but I cannot download the PDF files. Chrome just displays them with pdf_viewer.js in the browser window.

I tried different options such as "plugins.plugins_list": [{"enabled": False, "name": "Chrome PDF Viewer"}], "download.extensions_to_open": "applications/pdf" and "plugins.always_open_pdf_externally": True. Nothing works.

#!/usr/bin/python

downPath = "/home/user/xxx/"

Y = ["2017", "2018", "2019"]

def main():
    options = webdriver.ChromeOptions()
    profile = {
        "plugins.plugins_list": [{"enabled": False, "name": "Chrome PDF Viewer"}], 
        "download.default_directory": downPath , 
        "download.extensions_to_open": "applications/pdf",
        "plugins.always_open_pdf_externally": True,
        "download.prompt_for_download": False,
        "safebrowsing.enabled": True
        }
    options.add_experimental_option("prefs", profile)
    browser = webdriver.Chrome(options=options)

    driver = webdriver.Chrome()

    driver.get('login-url')
    #LOGIN-Stuff ... 
    time.sleep(4)

    for y in Y:
        print(y)
        for x in range(53):
            url = "https://j.world/files/{0}/filename_{0}-{1:02d}.pdf".format(y, x)
            print(url)
            driver.get(url)
            time.sleep(4)
    driver.quit()

if __name__ == "__main__":
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    import time
    import sys
    import os
    main()

I use Python 3.7.4 with python-selenium 3.141.0-1 from the Arch Linux repository.

2 Answers

0
votes

You can download the files with the requests library, or with curl, wget, etc., instead of

driver.get(url)
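A minimal sketch of that idea, assuming the site accepts the same cookies the logged-in browser holds. The helper names build_urls, copy_cookies and download_pdf are hypothetical; session is expected to be a requests.Session and driver the Selenium driver from the question.

```python
import os
from urllib.parse import urlsplit


def build_urls(years, weeks=range(53)):
    """Yield the PDF URLs, using the URL pattern from the question."""
    for y in years:
        for x in weeks:
            yield "https://j.world/files/{0}/filename_{0}-{1:02d}.pdf".format(y, x)


def copy_cookies(driver, session):
    """Copy the cookies of a logged-in Selenium driver into a requests.Session.

    Assumption: the login state lives in cookies; other auth schemes need more work.
    """
    for cookie in driver.get_cookies():
        session.cookies.set(
            cookie["name"], cookie["value"], domain=cookie.get("domain", "")
        )


def download_pdf(session, url, dest_dir):
    """Stream one PDF into dest_dir; return the saved path, or None on a non-200 reply."""
    response = session.get(url, stream=True, timeout=30)
    if response.status_code != 200:
        response.close()
        return None
    path = os.path.join(dest_dir, os.path.basename(urlsplit(url).path))
    with open(path, "wb") as fh:
        for chunk in response.iter_content(chunk_size=8192):
            fh.write(chunk)
    return path
```

After the login step, one could create session = requests.Session(), call copy_cookies(driver, session) once, and then call download_pdf(session, url, downPath) for every url from build_urls(Y), so the browser is only needed for the login.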
0
votes

I overlooked that the code still contained fragments of my attempts with Firefox.

browser = webdriver.Chrome(options=options)

driver = webdriver.Chrome()

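are obviously wrong: the options were passed to browser, but the script drove the separately created driver, which had no download preferences at all.

I wrote driver = webdriver.Chrome(options=options) and everything works.

Excuse my carelessness.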
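For reference, a sketch of the corrected setup with a single driver that actually receives the options. The function names build_prefs and make_driver are hypothetical, and the preference set is trimmed to three keys from the question on the assumption that they are sufficient; the original script kept all of its preferences.

```python
def build_prefs(download_dir):
    """Chrome preferences that save PDFs to disk instead of opening the viewer."""
    return {
        "download.default_directory": download_dir,
        "download.prompt_for_download": False,
        "plugins.always_open_pdf_externally": True,
    }


def make_driver(download_dir):
    """Create one Chrome driver and pass the options to that same driver."""
    # Imported here so build_prefs can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", build_prefs(download_dir))
    return webdriver.Chrome(options=options)
```

In main(), driver = make_driver(downPath) would then replace both the browser and the driver lines.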