How to download a csv file using PhantomJS

Question

When I'm browsing website A in a normal browser (Chrome) and click a link on website A, Chrome immediately downloads a report in the form of a CSV file.

When I checked the server response headers, I got the following:

Cache-Control:private,max-age=31536000
Connection:Keep-Alive
Content-Disposition:attachment; filename="report.csv"
Content-Encoding:gzip
Content-Language:de-DE
Content-Type:text/csv; charset=UTF-8
Date:Wed, 22 Jul 2015 12:44:30 GMT
Expires:Thu, 21 Jul 2016 12:44:30 GMT
Keep-Alive:timeout=15, max=75
Pragma:cache
Server:Apache
Transfer-Encoding:chunked
Vary:Accept-Encoding

Now I want to download and parse this file using PhantomJS. I set the page's onResourceReceived listener to see whether PhantomJS receives/downloads the file.

clientRequests.phantomPage.onResourceReceived = function(response) {
    console.log('Response (#' + response.id + ', stage "' + response.stage + '"): ' + JSON.stringify(response));
};
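
A listener like this can also check the headers to spot a download instead of only logging them. A minimal sketch, assuming the `response` object passed to `onResourceReceived` carries `headers` as an array of `{name, value}` pairs; the helper name `getAttachmentFilename` is made up for illustration and is not part of PhantomJS:

```javascript
// Hypothetical helper: returns the filename announced by a response's
// Content-Disposition header when it marks an attachment.
// Returns '' for an attachment without a filename, and null otherwise.
function getAttachmentFilename(response) {
    var headers = response.headers || [];
    for (var i = 0; i < headers.length; i++) {
        // Header names are case-insensitive.
        if (headers[i].name.toLowerCase() !== 'content-disposition') {
            continue;
        }
        var value = headers[i].value;
        if (!/^\s*attachment/i.test(value)) {
            return null; // e.g. "inline": not a download
        }
        var match = /filename="?([^";]+)"?/i.exec(value);
        return match ? match[1] : '';
    }
    return null; // no Content-Disposition header at all
}
```

Calling `getAttachmentFilename(response)` inside the listener only tells you that the resource is a download; it does not give access to the file's content.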

When I make PhantomJS request the file (this is page.open('URL OF THE FILE')), I can see in the PhantomJS log that the file is downloaded. Here are the logs:

"contentType": "text/csv; charset=UTF-8",
    "headers": {
        "name": "Date",
        "value": "Wed, 22 Jul 2015 12:57:41 GMT"
    },
    "name": "Content-Disposition",
    "value": "attachment; filename=\"report.csv\"",
    "status":200,"statusText":"OK"

I received the file and its content, but how do I access the file data? When I print the current PhantomJS page object, I get the HTML of page A, which I don't want. I want the CSV file, which I need to parse using JavaScript.

possible duplicate of downloading a file that comes as an attachment in a POST request response in PhantomJs — Artjom B.
Wtf man, if I were telling my coworkers to upvote my every post, I would have more than 600 points after these few years on StackOverflow and other networks. I was also surprised when I saw 3 upvotes in one hour, but that is good, not bad. If you investigate this problem, many people are facing the same issue, and here I want to see if anyone has found a good solution. — MrD
After writing my comment I looked at your post history and found it unlikely that voting fraud is at play here. Still, I find it strange that you received 3 upvotes in less than 10 minutes in such low-vote tags as [phantomjs] and [casperjs]. It might be because of [http], but I somehow doubt it. — Artjom B.
Regarding the duplicate, I grabbed the wrong link, but it still contains a viable answer to your question, although it is wrapped in CasperJS code. I'm talking about page.onFileDownload of the PhantomJS fork. — Artjom B.
After days and days of investigation, this is almost impossible to do with PhantomJS. There are some solutions, but they are not very elegant. After spending just 3 hours on CasperJS I did it, so use CasperJS, and not only because of this problem: CasperJS is just more intuitive and easier to work with. — MrD

Matthew Lock · Accepted Answer · 2015-08-10T02:20:52

I found a solution for PhantomJS. Reading through this discussion, I found a jsfiddle that downloads a URL via jQuery's ajax method and encodes the file as base64.

The file I wanted to download was plain text (CSV), so I removed the encoding functions. My target page also already included jQuery, so I didn't need to inject jQuery into the target page.

My code assumes you have already opened the page you want to download the file from using PhantomJS, and that the page has jQuery in it. In my case I first had to log in to the site in order to get the download link.

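var fs = require('fs');

// Assumes `this` is the PhantomJS page that is already open (and logged in).
var page = this;

// Runs inside the page context, so the request reuses the page's session.
var result = page.evaluate(function() {
    var out;
    $.ajax({
        'async' : false,      // synchronous, so `out` is set before returning
        'url' : 'fullurltodownload.csv',
        'dataType' : 'text',  // keep the response as raw CSV text
        'success' : function(data, status, xhr) {
            out = data;
        }
    });
    return out;
});

// Write the CSV text to disk ('w' = open for writing).
fs.write('mydownloadedfile.csv', result, 'w');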
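
Since the goal in the question was to parse the CSV in JavaScript, the downloaded string can also be split into rows directly. A minimal sketch, assuming comma-separated values with optional double-quoted fields (a quote inside a quoted field is escaped by doubling it); `parseCsv` is a hypothetical helper, not part of PhantomJS or jQuery:

```javascript
// Hypothetical helper: parses CSV text into an array of rows,
// where each row is an array of field strings.
function parseCsv(text, separator) {
    separator = separator || ','; // assumption: comma unless told otherwise
    var rows = [];
    var row = [];
    var field = '';
    var inQuotes = false;

    for (var i = 0; i < text.length; i++) {
        var ch = text.charAt(i);
        if (inQuotes) {
            if (ch === '"' && text.charAt(i + 1) === '"') {
                field += '"'; // doubled quote inside a quoted field
                i++;
            } else if (ch === '"') {
                inQuotes = false; // closing quote
            } else {
                field += ch;
            }
        } else if (ch === '"') {
            inQuotes = true; // opening quote
        } else if (ch === separator) {
            row.push(field);
            field = '';
        } else if (ch === '\n' || ch === '\r') {
            if (ch === '\r' && text.charAt(i + 1) === '\n') {
                i++; // treat \r\n as a single line break
            }
            row.push(field);
            rows.push(row);
            row = [];
            field = '';
        } else {
            field += ch;
        }
    }
    // Flush the last row when the text does not end with a line break.
    if (field !== '' || row.length > 0) {
        row.push(field);
        rows.push(row);
    }
    return rows;
}
```

With the code above this could be called as `var rows = parseCsv(result);`. If the report turns out to use a different separator such as `;`, it can be passed as the second argument.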
