5
votes

I am scraping data in R from this page, http://finviz.com/screener.ashx?v=111&f=earningsdate_nextdays5, which displays popup ads. Those ads interfere with my script, so I'd like to enable the adblocker extension: https://chrome.google.com/webstore/detail/adblock/gighmmpiobklfepjocnamgkkbiglidom

I'm working with code in the RSelenium package documentation here, https://cran.r-project.org/web/packages/RSelenium/RSelenium.pdf

I found the profile directory by opening a Chrome browser and navigating to chrome://version/. This is my usual profile, which has the adblocker extension enabled.

However, when RSelenium opens Chrome, the adblocker is not loaded. I looked at this page, http://scottcsims.com/wordpress/?p=450, and the author suggests using the add_extension method, which doesn't appear to be implemented in RSelenium.

Any idea on how I can get the adblocker enabled in the browser that R opens?

My code so far is below. Please note that this was done on a Mac, and your username will be different from mine, so be sure to change the first argument in getChromeProfile to match what you find under Profile Path on this page: chrome://version/

require(RSelenium)
RSelenium::startServer()
cprof <- getChromeProfile("/Users/<username>/Library/Application Support/Google/Chrome/", "Profile 1")
remDr <- remoteDriver(browserName = "chrome", extraCapabilities = cprof)
remDr$open()
appURL <- "http://finviz.com/screener.ashx?v=111&f=earningsdate_nextdays5"
remDr$navigate(appURL)
1
Should it be cprof <- getChromeProfile("/Users/<username>/Library/Application Support/Google/Chrome", "Profile 1")? This would open the profile Profile 1 from the default Chrome profile directory on a Mac. - jdharrison
Hi JD, that also opens Chrome, but the extensions do not load with it. - mks212
To clarify if your profile is /Users/<username>/Library/Application Support/Google/Chrome/Profile 1 you would use getChromeProfile("/Users/<username>/Library/Application Support/Google/Chrome", "Profile 1") . If your profile was in /Users/<username>/Library/Application Support/Google/Chrome/Default you would use getChromeProfile("/Users/<username>/Library/Application Support/Google/Chrome", "Default"). - jdharrison
I have updated my sample code to reflect cprof <- getChromeProfile("/Users/<username>/Library/Application Support/Google/Chrome/", "Profile 1"). However, only some of the extensions load, and the adblocker is not one of them. Profile 1 is the only profile in the directory, which makes sense because I am the only person who uses this computer. - mks212
The default profile is called Default, not Profile 1 (at least on Windows). Why not make a separate profile for Selenium? Your code works on Windows with the correct path to a profile with AdBlock installed. I couldn't say whether it works on a Mac. - jdharrison
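The path convention described in the comments above (Chrome's user data directory as the first argument, the profile folder name as the second) can be sketched in base R. This is only an illustration: split_profile_path is a hypothetical helper, not part of RSelenium, and it assumes you copied the full Profile Path from chrome://version/.

```r
# Hypothetical helper (not part of RSelenium): split the full "Profile Path"
# shown on chrome://version/ into the two arguments getChromeProfile() expects.
split_profile_path <- function(profile_path) {
  # Drop any trailing slashes so the last path component is the profile folder
  profile_path <- sub("/+$", "", profile_path)
  list(
    dataDir    = dirname(profile_path),   # e.g. ".../Google/Chrome"
    profileDir = basename(profile_path)   # e.g. "Profile 1" or "Default"
  )
}

parts <- split_profile_path(
  "/Users/<username>/Library/Application Support/Google/Chrome/Profile 1/"
)
# The two pieces could then be passed on, for example:
# cprof <- getChromeProfile(parts$dataDir, parts$profileDir)
```

Splitting the path this way avoids the trailing-slash and wrong-folder mix-ups discussed above, but it does not by itself guarantee that extensions in that profile will load.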

1 Answer

0
votes

This isn't the AdBlock extension specifically, since I like AdGuard better, but this is what I always use:

Download adguard FROM A BROWSER OTHER THAN CHROME: https://www.crx4chrome.com/go.php?d=4687&i=158&p=31932&s=1&l=https%3A%2F%2Fclients2.googleusercontent.com%2Fcrx%2Fblobs%2FQwAAAHF3InbmK-wFIemaY3I3BCPa0e33dMYlYToYq-WCs1jSyPlSXnr3dNv-HTinVL8eTmtbBlPjwi-hJEL5_ZnPfXkYphLdiwB7LVwS3slKcj15AMZSmuWuPGYPZfS0woRX9brTIZ8faUYQCg%2Fextension_3_0_13_0.crx

Example download filepath: /Users/admin/Downloads/extension_3_0_13_0.crx

R Code:

library(RSelenium) #install_github("ropensci/RSelenium")
cprof <- list(chromeOptions = 
                list(extensions = 
                       list(base64enc::base64encode("/Users/admin/Downloads/extension_3_0_13_0.crx"))
                ))


rD <- rsDriver(port = 4444L, extraCapabilities = cprof, browser = "chrome", chromever = "latest")

# if you get a "port already in use" error, or need to free the port:
#rm(rD)
#rm(remDr)
#gc() #then try again

#set timeout preferences with chrome client
remDr <- rD$client
remDr$setTimeout(type = 'page load', milliseconds = 120000)
remDr$setTimeout(type = 'implicit', milliseconds = 120000)
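
A wrong .crx path is an easy mistake to make, so here is a small sketch that checks the file before building the same capabilities list as above. build_extension_caps is a hypothetical helper of my own, not an RSelenium function, and it assumes the base64enc package is installed. Newer ChromeDriver versions may expect the key goog:chromeOptions rather than chromeOptions, so treat the key name as an assumption too.

```r
# Hypothetical helper (not part of RSelenium): build the extraCapabilities
# list for a packed extension, failing early if the .crx file is missing.
build_extension_caps <- function(crx_path) {
  if (!file.exists(crx_path)) {
    stop("Extension file not found: ", crx_path)
  }
  list(chromeOptions = list(
    # Each packed extension is passed as a base64-encoded string
    extensions = list(base64enc::base64encode(crx_path))
  ))
}

# Usage (the path is the example download location from above):
# cprof <- build_extension_caps("/Users/admin/Downloads/extension_3_0_13_0.crx")
# rD <- rsDriver(port = 4444L, extraCapabilities = cprof, browser = "chrome")
```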