1
votes

I'm attempting to scrape political endorsement data from Wikipedia tables (a pretty generic scraping task), and the usual process of running rvest on the CSS path identified by SelectorGadget is failing.

The wiki page is here, and the CSS path .jquery-tablesorter:nth-child(11) td seems to select the right part of the page (screenshot: the right part of the wikitable is selected).

Armed with the CSS, I would normally just use rvest to access these data directly, as follows:

"https://en.wikipedia.org/wiki/Endorsements_for_the_Republican_Party_presidential_primaries,_2012" %>% 
   html %>% 
   html_nodes(".jquery-tablesorter:nth-child(11) td")

but this returns:

list()
attr(,"class")
[1] "XMLNodeSet"

Do you have any ideas?

2
What part of the page are you actually trying to get? - Ciarán Tobin
The table, from the "Former President" column to "Notes" - tomw

2 Answers

3
votes

This might help:

library(rvest)

URL <- "https://en.wikipedia.org/wiki/Endorsements_for_the_Republican_Party_presidential_primaries,_2012"
tab <- URL %>%
  read_html() %>%
  html_node("table.wikitable:nth-child(11)") %>%
  html_table()

This code stores the table you requested as a data frame in the variable tab.

> View(tab)

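If the :nth-child(11) position ever shifts as the article is edited, a less positional variant (not from the original answer) is to pull every wikitable and pick the one you need by index; the index 4 below is an assumption you would confirm by inspecting the list:

library(rvest)

URL <- "https://en.wikipedia.org/wiki/Endorsements_for_the_Republican_Party_presidential_primaries,_2012"

# Collect every wikitable on the page, then keep the endorsements table by
# position. The [[4]] index is an assumption; check length(tables) and inspect
# a few entries to confirm which table you want.
tables <- URL %>% read_html() %>% html_nodes("table.wikitable")
tab <- tables[[4]] %>% html_table(fill = TRUE)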

1
votes

I find that if I use the XPath suggestion from Chrome, it works.

Chrome suggests an XPath of //*[@id="mw-content-text"]/table[4]

I can then run it as follows:

library(rvest)

URL <- "https://en.wikipedia.org/wiki/Endorsements_for_the_Republican_Party_presidential_primaries,_2012"
tab <- URL %>%
  read_html() %>%
  html_node(xpath = '//*[@id="mw-content-text"]/table[4]') %>%
  html_table()
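
As a quick sanity check (not part of the original answer), you can confirm the scrape produced the expected data frame before working with it:

dim(tab)    # number of rows and columns in the scraped table
head(tab)   # first few endorsement rows, with column names taken from the table header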