I am trying to scrape https://investing.com/ to get technical data for any stock. For "Moving Averages:" and "Technical Indicators:", I would like to get the number of buys and the number of sells for several periods:
- 5 hours
- Daily
- Weekly
Here is a screenshot of the data I want to get: https://i.ibb.co/mHpM0Yw/Capture-d-e-cran-2019-08-14-a-00-15-45.png
The URL is https://investing.com/equities/credit-agricole-technical
When you open the page in a browser, the period is set to "hourly", and you have to click another period to get the corresponding data. The DOM is updated after an XHR (XMLHttpRequest) request.
I would like to scrape the page after the DOM has been updated.
Mechanize
I tried to scrape the page with Mechanize by clicking on "Weekly" and then reading the updated DOM, but I got an error.
Here is my code:
require 'mechanize'

def mechanize_scraper(url)
  agent = Mechanize.new
  puts agent.user_agent_alias = 'Mac Safari'
  page = agent.get(url)
  # Find the "Weekly" tab link and try to follow it
  link = page.link_with(text: 'Weekly')
  new_page = link.click
end

url = "https://investing.com/equities/credit-agricole-technical"
mechanize_scraper(url)
Here is the error:
Mechanize::UnsupportedSchemeError (Mechanize::UnsupportedSchemeError)
When I inspect the DOM, the link's "href" attribute is javascript:void(0); so there is no real URL to follow:
<li pairid="407" data-period="week" class="">
<a href="javascript:void(0);">Weekly</a>
</li>
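As far as I understand, Mechanize raises UnsupportedSchemeError because it does not execute JavaScript and cannot fetch a URL whose scheme is "javascript". Here is a minimal sketch of that check using only Ruby's standard URI library. The helper name followable_href? is hypothetical (it is not part of Mechanize), and treating only http/https and relative links as followable is my own simplifying assumption:

```ruby
require 'uri'

# Hypothetical helper (not part of Mechanize): returns true when an href
# looks like something an HTTP client could actually request.
def followable_href?(href)
  scheme = URI.parse(href).scheme
  # A nil scheme means a relative link, which resolves against the page URL.
  scheme.nil? || %w[http https].include?(scheme.downcase)
rescue URI::InvalidURIError
  # Anything that is not a parseable URI cannot be followed either.
  false
end

puts followable_href?('javascript:void(0);')
puts followable_href?('/equities/credit-agricole-technical')
```

The "Weekly" link falls into the first case, so following it the way a normal link is followed cannot work; the data has to come from whatever request the page's JavaScript sends.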
So after several attempts and a lot of Google searching, I moved on to Watir.
Watir
Here is my code:
require 'watir'

def watir_scraper(url)
  Watir.default_timeout = 10
  browser = Watir::Browser.new
  browser.goto(url)
  # Click the "Weekly" tab; this is the call that raises the error below
  link = browser.link(text: /weekly/).click
  pp link
end

url = "https://investing.com/equities/credit-agricole-technical"
watir_scraper(url)
Here is the error:
40: from app.rb:47:in `<main>'
39: from app.rb:32:in `watir_scraper'
38: from /Users/remicarette/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/watir-6.16.5/lib/watir/elements/element.rb:145:in `click'
37: from /Users/remicarette/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/watir-6.16.5/lib/watir/elements/element.rb:789:in `element_call'
36: from /Users/remicarette/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/watir-6.16.5/lib/watir/elements/element.rb:154:in `block in click'
35: from /Users/remicarette/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/selenium-webdriver-3.142.3/lib/selenium/webdriver/common/element.rb:74:in `click'
34: from /Users/remicarette/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/selenium-webdriver-3.142.3/lib/selenium/webdriver/remote/w3c/bridge.rb:371:in `click_element'
33: from /Users/remicarette/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/selenium-webdriver-3.142.3/lib/selenium/webdriver/remote/w3c/bridge.rb:567:in `execute'
32: from /Users/remicarette/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/selenium-webdriver-3.142.3/lib/selenium/webdriver/remote/bridge.rb:167:in `execute'
31: from /Users/remicarette/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/selenium-webdriver-3.142.3/lib/selenium/webdriver/remote/http/common.rb:64:in `call'
30: from /Users/remicarette/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/selenium-webdriver-3.142.3/lib/selenium/webdriver/remote/http/default.rb:114:in `request'
29: from /Users/remicarette/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/selenium-webdriver-3.142.3/lib/selenium/webdriver/remote/http/common.rb:88:in `create_response'
28: from /Users/remicarette/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/selenium-webdriver-3.142.3/lib/selenium/webdriver/remote/http/common.rb:88:in `new'
27: from /Users/remicarette/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/selenium-webdriver-3.142.3/lib/selenium/webdriver/remote/response.rb:34:in `initialize'
26: from /Users/remicarette/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/selenium-webdriver-3.142.3/lib/selenium/webdriver/remote/response.rb:72:in `assert_ok'
25: from 25 libsystem_pthread.dylib 0x00007fff5aaa440d thread_start + 13
24: from 24 libsystem_pthread.dylib 0x00007fff5aaa8249 _pthread_start + 66
23: from 23 libsystem_pthread.dylib 0x00007fff5aaa52eb _pthread_body + 126
22: from 22 chromedriver 0x000000010b434e67 chromedriver + 3673703
21: from 21 chromedriver 0x000000010b416014 chromedriver + 3547156
20: from 20 chromedriver 0x000000010b3e0f07 chromedriver + 3329799
19: from 19 chromedriver 0x000000010b3f91b8 chromedriver + 3428792
18: from 18 chromedriver 0x000000010b3cd069 chromedriver + 3248233
17: from 17 chromedriver 0x000000010b3f86d8 chromedriver + 3426008
16: from 16 chromedriver 0x000000010b3f8940 chromedriver + 3426624
15: from 15 chromedriver 0x000000010b3ecc1f chromedriver + 3378207
14: from 14 chromedriver 0x000000010b0ce8a5 chromedriver + 108709
13: from 13 chromedriver 0x000000010b0cd7e2 chromedriver + 104418
12: from 12 chromedriver 0x000000010b0f1bf3 chromedriver + 252915
11: from 11 chromedriver 0x000000010b0fba37 chromedriver + 293431
10: from 10 chromedriver 0x000000010b0f1c4e chromedriver + 253006
9: from 9 chromedriver 0x000000010b0cfa66 chromedriver + 113254
8: from 8 chromedriver 0x000000010b0f1a72 chromedriver + 252530
7: from 7 chromedriver 0x000000010b0cfe66 chromedriver + 114278
6: from 6 chromedriver 0x000000010b0d63fb chromedriver + 140283
5: from 5 chromedriver 0x000000010b0d71a9 chromedriver + 143785
4: from 4 chromedriver 0x000000010b0d8d19 chromedriver + 150809
3: from 3 chromedriver 0x000000010b0da569 chromedriver + 157033
2: from 2 chromedriver 0x000000010b15fcef chromedriver + 703727
1: from 1 chromedriver 0x000000010b3bf133 chromedriver + 3191091
0x000000010b42f129 chromedriver + 3649833: element click intercepted: Element ... is not clickable at point (544, 704). Other element would receive the click: ... (Selenium::WebDriver::Error::ElementClickInterceptedError)
(Session info: chrome=76.0.3809.100)
I hope this is enough to understand my issue. I would like to know whether I can scrape this data with Mechanize or Watir. If not, which tools can do the job?
Thanks a lot!