I am using Selenium Grid in Docker to scrape a website. With a single Chrome node the grid works, but when I scale to more than one Chrome node and run the Scrapy spider again, it stops working: it hangs for a while and then fails with a long error message.

import scrapy
from selenium import webdriver


class ProductSpider(scrapy.Spider):
    name = "product_spider"
    start_urls = ['https://google.com']

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        options = webdriver.ChromeOptions()
        options.add_argument('--headless')

        # Pass the options along; otherwise --headless is never sent to the node.
        self.driver = webdriver.Remote(
            command_executor='http://localhost:5000/wd/hub',
            desired_capabilities=options.to_capabilities())

    def parse(self, response):
        # driver.get() returns None, so read the result from the driver afterwards.
        self.driver.get(response.url)
        print(self.driver.title, '/////////////')

Then I opened a Python shell and typed the code in line by line:

Python 3.6.5 (default, Apr  1 2018, 05:46:30) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from selenium import webdriver
>>> from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
>>> options = webdriver.ChromeOptions()
>>> options.add_argument('--headless')
>>> driver = webdriver.Remote(command_executor='http://localhost:5000/wd/hub',
...             desired_capabilities=DesiredCapabilities.CHROME)

As you can see, it hangs at webdriver.Remote: the cursor just blinks for a long time, and then a long error message is shown. I think the problem is in the webdriver.Remote(command_executor='http://localhost:5000/wd/hub', desired_capabilities=DesiredCapabilities.CHROME) call.

Can anyone suggest a solution? Note that it works when the grid has a single Chrome node and fails only when I scale to more than one Chrome node.

This is the error message that appears after a long wait:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/vicky/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 156, in __init__
    self.start_session(capabilities, browser_profile)
  File "/home/vicky/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 251, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/home/vicky/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 320, in execute
    self.error_handler.check_response(response)
  File "/home/vicky/.local/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: Error forwarding the new session Error forwarding the request Connect to 172.18.0.8:5555 [/172.18.0.8] failed: Connection timed out (Connection timed out)
Stacktrace:
    at org.openqa.grid.web.servlet.handler.RequestHandler.process (RequestHandler.java:117)
    at org.openqa.grid.web.servlet.DriverServlet.process (DriverServlet.java:84)
    at org.openqa.grid.web.servlet.DriverServlet.doPost (DriverServlet.java:68)
    at javax.servlet.http.HttpServlet.service (HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service (HttpServlet.java:790)
    at org.seleniumhq.jetty9.servlet.ServletHolder.handle (ServletHolder.java:860)
    at org.seleniumhq.jetty9.servlet.ServletHandler.doHandle (ServletHandler.java:535)
    at org.seleniumhq.jetty9.server.handler.ScopedHandler.nextHandle (ScopedHandler.java:188)
    at org.seleniumhq.jetty9.server.session.SessionHandler.doHandle (SessionHandler.java:1595)
    at org.seleniumhq.jetty9.server.handler.ScopedHandler.nextHandle (ScopedHandler.java:188)
    at org.seleniumhq.jetty9.server.handler.ContextHandler.doHandle (ContextHandler.java:1253)
    at org.seleniumhq.jetty9.server.handler.ScopedHandler.nextScope (ScopedHandler.java:168)
    at org.seleniumhq.jetty9.servlet.ServletHandler.doScope (ServletHandler.java:473)
    at org.seleniumhq.jetty9.server.session.SessionHandler.doScope (SessionHandler.java:1564)
    at org.seleniumhq.jetty9.server.handler.ScopedHandler.nextScope (ScopedHandler.java:166)
    at org.seleniumhq.jetty9.server.handler.ContextHandler.doScope (ContextHandler.java:1155)
    at org.seleniumhq.jetty9.server.handler.ScopedHandler.handle (ScopedHandler.java:141)
    at org.seleniumhq.jetty9.server.handler.HandlerWrapper.handle (HandlerWrapper.java:132)
    at org.seleniumhq.jetty9.server.Server.handle (Server.java:530)
    at org.seleniumhq.jetty9.server.HttpChannel.handle (HttpChannel.java:347)
    at org.seleniumhq.jetty9.server.HttpConnection.onFillable (HttpConnection.java:256)
    at org.seleniumhq.jetty9.io.AbstractConnection$ReadCallback.succeeded (AbstractConnection.java:279)
    at org.seleniumhq.jetty9.io.FillInterest.fillable (FillInterest.java:102)
    at org.seleniumhq.jetty9.io.ChannelEndPoint$2.run (ChannelEndPoint.java:124)
    at org.seleniumhq.jetty9.util.thread.strategy.EatWhatYouKill.doProduce (EatWhatYouKill.java:247)
    at org.seleniumhq.jetty9.util.thread.strategy.EatWhatYouKill.produce (EatWhatYouKill.java:140)
    at org.seleniumhq.jetty9.util.thread.strategy.EatWhatYouKill.run (EatWhatYouKill.java:131)
    at org.seleniumhq.jetty9.util.thread.ReservedThreadExecutor$ReservedThread.run (ReservedThreadExecutor.java:382)
    at org.seleniumhq.jetty9.util.thread.QueuedThreadPool.runJob (QueuedThreadPool.java:708)
    at org.seleniumhq.jetty9.util.thread.QueuedThreadPool$2.run (QueuedThreadPool.java:626)
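
The message itself names the address the hub failed to reach (172.18.0.8:5555), so one way to narrow things down is to test that address directly. The following is only a minimal sketch using the standard library: the helper names `node_address` and `can_connect` are made up for illustration, and it assumes the check is run from somewhere that shares the hub's network (for example inside the hub container), since 172.18.x.x is a private address.

```python
import re
import socket

# Matches the "Connect to <ip>:<port>" fragment seen in the hub's error text.
NODE_ADDR = re.compile(r"Connect to (\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def node_address(message):
    """Return (host, port) parsed from a hub error message, or None (hypothetical helper)."""
    match = NODE_ADDR.search(message)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False


message = ("Error forwarding the new session Error forwarding the request "
           "Connect to 172.18.0.8:5555 [/172.18.0.8] failed: Connection timed out")
print(node_address(message))  # ('172.18.0.8', 5555)
```

Calling `can_connect(*node_address(message))` from the hub's side would then show whether the node's port is reachable at all. If it is not, the problem is more likely in how the nodes are registered or networked than in the Python client code.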

I have also attached a screenshot of the Selenium Grid console taken while multiple nodes were running.

1 Answer

It looks like you're starting up new Selenium nodes with Firefox but your tests specifically look for Chrome.
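
To illustrate the mismatch, here is a deliberately simplified sketch of matching a request against the nodes by `browserName` only. The node list and the `matching_nodes` helper are hypothetical (the real grid also considers fields such as version and platform), so read the actual browser names off your grid console.

```python
def matching_nodes(requested, nodes):
    """Return the nodes whose advertised browserName equals the requested one."""
    wanted = requested.get("browserName")
    return [node for node in nodes if node.get("browserName") == wanted]


# Hypothetical node list, as one might read it off the grid console.
nodes = [
    {"host": "node-1", "browserName": "firefox"},
    {"host": "node-2", "browserName": "firefox"},
]

# DesiredCapabilities.CHROME asks for browserName "chrome".
requested = {"browserName": "chrome"}

print(matching_nodes(requested, nodes))  # [] because no node offers Chrome
```

If the scaled-up nodes really are Firefox nodes, either request Firefox from the client or scale the Chrome node image instead.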

I'd recommend using Zalenium to set up your Selenium Grid: https://github.com/zalando/zalenium