
I have some links that I would like to check with an if statement. Processing should stop as soon as it sees a link that already appears in another dataset.

Suppose I have the following:

linkToStopAt_1 = "https://www.fotocasa.es/es/comprar/vivienda/alella/calefaccion-parking-jardin-terraza-trastero-piscina/164324318/d?from=list"

linkToStopAt_2 = "https://www.fotocasa.es/es/comprar/vivienda/alella/calefaccion-parking-jardin-terraza-trastero-piscina/164313177/d?from=list"

linkToStopAt_3 = "https://www.fotocasa.es/es/comprar/vivienda/alella/calefaccion-parking-jardin-terraza-piscina/164295760/d?from=list"

Along with:

listOfLink = c("https://www.fotocasa.es/es/comprar/vivienda/alella/calefaccion-parking-jardin-terraza-trastero-piscina/164348201/d?from=list", 
"https://www.fotocasa.es/es/comprar/vivienda/alella/calefaccion-parking-jardin-terraza-trastero-piscina/164336155/d?from=list", 
"https://www.fotocasa.es/es/comprar/vivienda/alella/aire-acondicionado-terraza-no-amueblado/164327028/d?from=list", 
"https://www.fotocasa.es/es/comprar/vivienda/alella/aire-acondicionado-terraza-no-amueblado/164326907/d?from=list", 
"https://www.fotocasa.es/es/comprar/vivienda/alella/calefaccion-parking-jardin-terraza-piscina/164295760/d?from=list"
)

I am looking for a more compact version of the following:

  if(linkToStopAt_1 %in% listOfLink || linkToStopAt_2 %in% listOfLink || linkToStopAt_3 %in% listOfLink){
    print(paste("something here"))
  }

So, if one of the linkToStopAt_N values occurs in listOfLink, then we stop / print something. However, I want to expand the OR condition to N links. The problem I face is that I am applying a function over a set of links and I want the function to break at the first link it already has. Some links may have been removed, so the code currently re-collects all of the data because it has not "seen" that link before. In fact it has seen it; the link has just been removed. For instance, the following link, linkToStopAt_1:

https://www.fotocasa.es/es/comprar/vivienda/alella/calefaccion-parking-jardin-terraza-trastero-piscina/164324318/d?from=list

It now redirects to the URL "https://www.fotocasa.es/es/comprar/viviendas/alella/todas-las-zonas/l?propertyNotFound". So, if the code sees propertyNotFound, it should skip that link and go to the next one.


1 Answer


We can use any() with mget(). mget() is useful when many 'linkToStopAt' objects have been created in the global environment: it returns their values compactly as a list, which unlist() then turns into a vector. We then check whether any of the 'linkToStopAt' values is %in% 'listOfLink'.

if (any(unlist(mget(ls(pattern = "^linkToStopAt_"))) %in% listOfLink)) {
  # stop / print something here
}
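If the stop links can be kept in a single character vector rather than in separate numbered variables, mget() is not needed at all. The following is a minimal sketch of that idea; the names stopLinks and newLinks and the shortened example.com URLs are placeholders for illustration, not part of the original data.

```r
# Hypothetical: all stop links stored in one vector instead of linkToStopAt_1..N
stopLinks <- c(
  "https://example.com/164324318",
  "https://example.com/164313177",
  "https://example.com/164295760"
)

# Hypothetical stand-in for listOfLink
newLinks <- c(
  "https://example.com/164348201",
  "https://example.com/164336155",
  "https://example.com/164295760"
)

# TRUE if at least one stop link occurs in newLinks
hasKnownLink <- any(stopLinks %in% newLinks)

# Position in newLinks of the first already-seen link (NA if there is none)
firstKnown <- match(TRUE, newLinks %in% stopLinks)

if (hasKnownLink) {
  print("something here")
}
```

This scales to any number of stop links without changing the condition, and firstKnown tells us where in newLinks to stop.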

If the links come from a data.frame column:

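# Check only the first 10 distinct links; head() avoids NA padding when there are fewer than 10
if (any(head(unique(as.character(currentData$linkURL)), 10) %in% listOfLink)) {
  # stop / print something here
}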
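The question also asks for the loop to stop at the first known link and to skip links that now resolve to a propertyNotFound page. Below is a rough sketch of that control flow. It assumes a hypothetical helper resolveUrl() that returns the final URL after redirects; here it is stubbed so the example runs without network access, and knownLinks, newLinks and the example.com URLs are likewise placeholders.

```r
# Hypothetical stub: in real use this would follow redirects and return the final URL
resolveUrl <- function(link) {
  if (grepl("164324318", link, fixed = TRUE)) {
    "https://example.com/l?propertyNotFound"
  } else {
    link
  }
}

# Hypothetical: links that were already collected in an earlier run
knownLinks <- c("https://example.com/164313177")

# Hypothetical: links found in the current run, newest first
newLinks <- c(
  "https://example.com/164348201",
  "https://example.com/164324318",  # removed listing in this sketch
  "https://example.com/164313177",  # already collected, so the loop stops here
  "https://example.com/164295760"
)

collected <- character(0)
for (link in newLinks) {
  # Stop at the first link we already have
  if (link %in% knownLinks) break

  # Skip listings that have been removed
  finalUrl <- resolveUrl(link)
  if (grepl("propertyNotFound", finalUrl, fixed = TRUE)) next

  collected <- c(collected, link)
}
```

Checking for the known link before resolving the URL means no request is made for links that are already stored.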