url <- "ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR105/056/SRR10503056/SRR10503056.fastq.gz" 
for (i in 1:20){
  RCurl::getURL(url, ftp.use.epsv = FALSE, dirlistonly = TRUE)
}

Error in function (type, msg, asError = TRUE) : Recv failure: Connection reset by peer

Why does this happen? Is a wait timer needed between curl calls to avoid the error, or is the failure coming from the server side?
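For reference, the wait-timer variant I am asking about would look like the sketch below (the 0.5-second pause is an arbitrary guess on my part, not a documented server limit):

url <- "ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR105/056/SRR10503056/SRR10503056.fastq.gz"
for (i in 1:20) {
  RCurl::getURL(url, ftp.use.epsv = FALSE, dirlistonly = TRUE)
  Sys.sleep(0.5)  # pause between requests so the server (hopefully) does not reset the connection
}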

> sessionInfo()

R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=nb_NO.UTF-8
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=nb_NO.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=nb_NO.UTF-8       LC_NAME=C                  LC_ADDRESS=C
[10] LC_TELEPHONE=C             LC_MEASUREMENT=nb_NO.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

RCurl version: 1.98-1.3

As noted in this thread: stackoverflow.com/questions/16338668/… Sys.sleep(0.2) does indeed make the loop more stable, but it still failed with pauses up to Sys.sleep(0.5). With Sys.sleep(1.0) I have not managed to make it crash yet; I will test a bigger loop to make sure. If that holds, I could catch the error, run Sys.sleep(1.0), and try again. I would still like some more information if anyone has a more detailed answer. – Roler

1 Answer


OK, I have a solution that has not failed for me: a tryCatch wrapper with a maximum-attempt counter (by default 5 attempts with a wait time of 1 second between retries), plus a general wait time of 0.05 seconds before every URL request.

Let me know if anyone has a safer idea:

safe.url <- function(url, attempt = 1, max.attempts = 5) {
  tryCatch(
    expr = {
      Sys.sleep(0.05)  # small pause before every request to avoid hammering the server
      RCurl::getURL(url, ftp.use.epsv = FALSE, dirlistonly = TRUE)
    },
    error = function(e) {
      # Give up after max.attempts failed tries
      if (attempt >= max.attempts)
        stop("Server is not responding to download data, wait 30 seconds and try again!")
      Sys.sleep(1)  # wait a second, then retry recursively
      safe.url(url, attempt = attempt + 1, max.attempts = max.attempts)
    })
}

url <- "ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR105/056/SRR10503056/SRR10503056.fastq.gz" 
for (i in 1:100){
  safe.url(url)
}
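If the resets turn out to be load-related on the server side, an exponential backoff might be safer than a fixed 1-second wait. A sketch of that variant (the name safe.url.backoff and the 2^attempt schedule are my own choices, not tested against this server):

safe.url.backoff <- function(url, attempt = 1, max.attempts = 5) {
  tryCatch(
    expr = {
      Sys.sleep(0.05)
      RCurl::getURL(url, ftp.use.epsv = FALSE, dirlistonly = TRUE)
    },
    error = function(e) {
      if (attempt >= max.attempts)
        stop("Server is not responding to download data, wait 30 seconds and try again!")
      Sys.sleep(2 ^ attempt)  # back off: 2, 4, 8, ... seconds between retries
      safe.url.backoff(url, attempt = attempt + 1, max.attempts = max.attempts)
    })
}

The recursion is bounded by max.attempts either way, so the worst case stays predictable.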