
I'm scraping this URL with Scrapy: http://quotes.toscrape.com/

it works great when I do:

response.xpath("//meta[@itemprop='keywords']/@content").extract()
response.xpath("//meta[@itemprop='keywords'][1]/@content").extract_first()

but when I try to get the second meta from that list of metas by index:

response.xpath("//meta[@itemprop='keywords'][2]/@content").extract_first()

it returns nothing.

What am I missing?

Thanks!


1 Answer


You need to wrap the expression before the index in parentheses:

Instead of:

"//meta[@itemprop='keywords'][2]/@content"

It should be:

"(//meta[@itemprop='keywords'])[2]/@content"

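This is needed because // is short for /descendant-or-self::node()/, so the [2] predicate binds to the meta step and is evaluated relative to each parent: it selects a meta that is the second matching child of its own parent. Each quote block on this page appears to contain only one keywords meta, so nothing matches. With parentheses, the whole path is evaluated into a single node-set first, and [2] then picks the second node of that set in document order.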

You can test this:

$ scrapy shell "http://quotes.toscrape.com/"
In [1]: response.xpath("//meta[@itemprop='keywords'][2]/@content").extract_first()

In [2]: response.xpath("(//meta[@itemprop='keywords'])[2]/@content").extract_first()
Out[2]: 'abilities,choices'
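
If you want to see the difference without hitting the site, here is a minimal offline sketch. It assumes lxml is installed (Scrapy's selectors are built on it) and uses a made-up, cut-down page in which each quote div holds exactly one keywords meta. The markup and the first content value are illustrative, not copied from the real page.

```python
from lxml import html

# Hypothetical, simplified markup: one keywords <meta> per quote <div>.
PAGE = """
<html><body>
  <div class="quote"><meta itemprop="keywords" content="first,tags"></div>
  <div class="quote"><meta itemprop="keywords" content="abilities,choices"></div>
</body></html>
"""

doc = html.fromstring(PAGE)

# [2] is evaluated per parent: "a meta that is the 2nd matching child
# of its parent". No div has two such metas, so the result is empty.
per_parent = doc.xpath("//meta[@itemprop='keywords'][2]/@content")

# Parentheses build the whole node-set first; [2] then picks its 2nd member.
whole_set = doc.xpath("(//meta[@itemprop='keywords'])[2]/@content")

print(per_parent)  # []
print(whole_set)   # ['abilities,choices']
```

In Scrapy itself, another option is to skip the XPath index and index the Python list instead, e.g. response.xpath("//meta[@itemprop='keywords']/@content").extract()[1].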