rvest Error in open.connection(x, "rb") : Timeout was reached

Question

I'm trying to scrape the content from http://google.com, but I get the following error message.

library(rvest)  
html("http://google.com")

Error in open.connection(x, "rb") : Timeout was reached
In addition: Warning message:
'html' is deprecated.
Use 'read_html' instead.
See help("Deprecated")

Since I'm on a company network, this may be caused by a firewall or proxy. I tried using set_config, but it did not work.

Have you also tried the read_html command, since the error message says html is deprecated? This might not solve your problem, but maybe the output is more helpful. — drmariod
Yes, the message is: Error in open.connection(x, "rb") : Timeout was reached In addition: Warning message: closing unused connection 3 (google.com) — user3267649
Actually, this code works fine on my home network, but when I run it on the company network, the error comes up. — user3267649
Seems not reproducible as a code issue, this returns a result for me. If you figured out what was going on with the network and how to work around it you could post that answer. — Sam Firke
Same issue for me. Apparently, from the network I am using, Google asks for proof of not being a bot, and the page of course times out when the scraper runs. — Dambo

user799188 · Accepted Answer · 2017-03-03T01:46:33

I encountered the same Error in open.connection(x, "rb") : Timeout was reached issue when working behind a proxy on the office network.

Here's what worked for me:

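library(rvest)
url <- "http://google.com"
# Fetch the page with base R's download.file() and save it locally
download.file(url, destfile = "scrapedpage.html", quiet = TRUE)
# Parse the saved file instead of letting read_html() open the connection
content <- read_html("scrapedpage.html")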

Credit : https://stackoverflow.com/a/38463559
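If you need this for more than one page, the same download-then-parse idea can be wrapped in a small helper. This is only a sketch: the function name fetch_via_download is made up for illustration, it writes to a temporary file rather than scrapedpage.html, and it assumes download.file() can reach the URL on your network, as it did in my case.

```r
library(rvest)

# Hypothetical helper (not part of rvest): download a page to a local file
# with base R's download.file(), then parse that file with read_html().
fetch_via_download <- function(url, destfile = tempfile(fileext = ".html")) {
  tryCatch(
    download.file(url, destfile = destfile, quiet = TRUE),
    error = function(e) {
      # Re-raise with the URL included so failures are easier to trace
      stop("Download failed for ", url, ": ", conditionMessage(e), call. = FALSE)
    }
  )
  read_html(destfile)
}

# Example call (needs network access):
# page <- fetch_via_download("http://google.com")
```

The tryCatch only adds the URL to the error message. If download.file() itself times out, the underlying proxy or firewall problem still has to be sorted out with whoever runs the network.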
