Confused about scrapy and Xpath

Question

I am trying to scrape some data from the following website: https://xrpcharts.ripple.com/

The data I am interested in is Total XRP, which you can see immediately below or to the side (depending on your browser) of the circle diagram. I first inspected the element I am interested in and saw that it is inside a <div class="stat">, in <span ng-bind="totalXRP | number:2" class="ng-binding">99,993,056,930.18</span>.

The number 99,993,056,930.18 is what I am interested in.

So I started in a scrapy shell and wrote:

fetch("https://xrpcharts.ripple.com")

I then used Chrome to copy the XPath by right-clicking on that element in the HTML inspector. The result Chrome gave me was:

/html/body/div[5]/div[3]/div/div/div[2]/div[3]/ul/li[1]/div/span

Then I used that XPath to extract the text:

response.xpath('/html/body/div[5]/div[3]/div/div/div[2]/div[3]/ul/li[1]/div/span/text()').extract()

but this gave me an empty list []. I really do not understand what I am doing wrong here. I think I am making an obvious mistake but I don't see it. Thanks in advance!

alecxe · Accepted Answer · 2017-12-11T19:14:29

The bottom line is: you cannot expect the page you see in the browser to be the same page Scrapy would download and have available to work with. Scrapy is not a browser.

This page is quite dynamic and complex and is constructed with the help of multiple asynchronous requests bringing in both the logic and the data. There is also JavaScript executed in the browser that plays an important role in forming and supporting the HTML document object tree.

Scrapy does none of this: what you get when you do fetch() is just the initial "bare bones" HTML page, without any of the "dynamic content".
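To make the difference concrete, here is a minimal sketch using only the Python standard library, so it runs without Scrapy or a network connection. The markup in RAW_HTML is made up for illustration and is an assumption, not the actual response from the site; it only mimics the common pattern where an Angular ng-bind span is empty in the downloaded HTML and gets filled in later by JavaScript. The names BoundSpanText and bound_span_texts are hypothetical helpers.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only; NOT the site's actual response.
# It stands in for the "bare bones" page a plain HTTP download would return.
RAW_HTML = """
<html><body>
  <div class="stat">
    <span ng-bind="totalXRP | number:2" class="ng-binding"></span>
  </div>
</body></html>
"""

# The same fragment after a browser has run the page's JavaScript
# (the number is the one quoted in the question).
RENDERED_HTML = RAW_HTML.replace(
    'class="ng-binding"></span>',
    'class="ng-binding">99,993,056,930.18</span>',
)


class BoundSpanText(HTMLParser):
    """Collect the text inside every <span> that carries an ng-bind attribute."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "span" and "ng-bind" in dict(attrs):
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._inside = False

    def handle_data(self, data):
        # Ignore whitespace-only text nodes.
        if self._inside and data.strip():
            self.texts.append(data.strip())


def bound_span_texts(html):
    """Return the non-empty text of all ng-bind spans in the given HTML."""
    parser = BoundSpanText()
    parser.feed(html)
    return parser.texts


print(bound_span_texts(RAW_HTML))       # the un-rendered download
print(bound_span_texts(RENDERED_HTML))  # what the browser's inspector shows
```

In the Scrapy shell, a quick way to check which of the two situations you are in is to search response.text for the value you expect, or to call view(response) to open the downloaded page in a browser.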
