0
votes

I'm using Scrapy to crawl all the links and Selenium to scrape the pages. Selenium scraped most of the pages but skipped a few because they took too long to load.

I tried timeout(), but it didn't seem to work. Then I tried execute_script:

driver.execute_script("return document.readyState == 'complete';")

That also didn't seem to work, so I tried expected_conditions:

WebDriverWait.until(expected_conditions.execute_script("return document.readyState == 'complete';"))

but that didn't seem to work either.

I'm using the Firefox browser, and PhantomJS for headless runs. I also tried ChromeDriver, which I installed with brew cask install chromedriver, but I'm facing this error:

raise WebDriverException("Can not connect to the Service %s" % self.path)
selenium.common.exceptions.WebDriverException: Message: Can not connect to the Service chromedriver

So I went back to PhantomJS.

Thank you!


3 Answers

0
votes

Use the sleep function to delay your code after requesting the page, so that the page has time to load before you read it.
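The sleep approach could look roughly like the sketch below. The function names, the default delays, and the idea of sleeping in short steps while polling document.readyState are illustrative assumptions, not something from the question; driver is assumed to be an already created Selenium WebDriver. Note that readyState only reports that the document itself has loaded, so content filled in later by JavaScript may still need a longer wait.

```python
import time


def load_with_fixed_delay(driver, url, delay_seconds=5.0):
    """Open url, then pause for a fixed time before reading the page."""
    driver.get(url)
    time.sleep(delay_seconds)  # blunt: always waits the full delay
    return driver.page_source


def load_when_ready(driver, url, timeout=30.0, poll_interval=0.5):
    """Open url, then sleep in short steps until the document has loaded."""
    driver.get(url)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # readyState is "complete" once the document and its resources loaded
        if driver.execute_script("return document.readyState") == "complete":
            return driver.page_source
        time.sleep(poll_interval)
    raise TimeoutError(f"{url} was not ready after {timeout} seconds")
```

The second variant avoids waiting longer than needed on fast pages while still giving slow pages up to timeout seconds.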

0
votes

I had this problem before. I used a while loop with a try and except inside it. The loop keeps retrying the work you want to finish. If the page is not loaded yet, the code raises and lands in the except block, which just passes. Once the try block executes successfully, a break at the end of the try block lets you come out of the loop. This worked 100% of the time for me.
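A minimal sketch of that retry loop is below. The helper name retry_until_success, the attempt limit, and the pause between attempts are assumptions added here so the loop cannot spin forever on a page that never loads; the answer itself simply passes in the except block.

```python
import time


def retry_until_success(action, max_attempts=20, delay_seconds=1.0):
    """Call action() in a loop until it stops raising, then return its result."""
    attempts = 0
    while True:
        attempts += 1
        try:
            result = action()
            break  # the work succeeded, so leave the loop
        except Exception:
            # Catching Exception is broad; in real code you would probably
            # narrow this to the error your scraping step raises while the
            # page is still loading.
            if attempts >= max_attempts:
                raise  # give up and surface the last error
            time.sleep(delay_seconds)  # short pause before the next try
    return result
```

Here action would be whatever scraping step fails while the page is still loading, wrapped in a function or lambda.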

0
votes
raise WebDriverException("Can not connect to the Service %s" % self.path)
selenium.common.exceptions.WebDriverException: Message: Can not connect to the Service chromedriver

This is raised because your program is unable to connect to the service through the given chromedriver executable. This can happen because of a version mismatch between Chrome and chromedriver, or because the executable file is not available.

You can resolve it as follows:

  • Check the version of the Chrome browser installed on your system (Chrome settings > About Chrome), then download the matching chromedriver here: https://chromedriver.chromium.org/downloads

  • You can store it anywhere, but it's simplest to keep it in the same directory as your code. Unzip it, copy it to that directory, and you're good to go with chromedriver.

  • Uncomment driver = webdriver.Chrome(), or use driver = webdriver.Chrome(executable_path=r'your path here') if chromedriver is not in the same directory as your program.
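To rule out the "executable not available" cause before starting the browser, a small check along these lines may help. The helper name find_chromedriver is hypothetical, and using the current working directory as "the same directory as your program" is an assumption. The commented lines show how the result might be passed to Selenium; newer Selenium 4 releases expect a Service object rather than executable_path.

```python
import os
import shutil


def find_chromedriver(explicit_path=None):
    """Return a usable chromedriver path, or None if none is found."""
    candidates = []
    if explicit_path:
        candidates.append(explicit_path)
    # Assumption: the program is run from the directory that holds the driver.
    candidates.append(os.path.join(os.getcwd(), "chromedriver"))
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    # Fall back to PATH, where a package manager usually installs it.
    return shutil.which("chromedriver")


# Possible usage (not run here, needs Selenium and Chrome installed):
#   from selenium import webdriver
#   from selenium.webdriver.chrome.service import Service
#   path = find_chromedriver(r"your path here")
#   if path is None:
#       raise SystemExit("chromedriver not found")
#   driver = webdriver.Chrome(executable_path=path)   # older Selenium
#   driver = webdriver.Chrome(service=Service(path))  # Selenium 4 style
```

A None result points to a missing or non-executable file; a valid path that still fails to connect makes a version mismatch the more likely cause.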