When I browse website A in a normal browser (Chrome) and click a link on it, Chrome immediately downloads a report as a CSV file.
When I check the server's response headers, I see the following:
Cache-Control:private,max-age=31536000
Connection:Keep-Alive
Content-Disposition:attachment; filename="report.csv"
Content-Encoding:gzip
Content-Language:de-DE
Content-Type:text/csv; charset=UTF-8
Date:Wed, 22 Jul 2015 12:44:30 GMT
Expires:Thu, 21 Jul 2016 12:44:30 GMT
Keep-Alive:timeout=15, max=75
Pragma:cache
Server:Apache
Transfer-Encoding:chunked
Vary:Accept-Encoding
Now I want to download and parse this file using PhantomJS. I set the page's onResourceReceived listener to see whether PhantomJS receives/downloads the file:
clientRequests.phantomPage.onResourceReceived = function(response) {
    console.log('Response (#' + response.id + ', stage "' + response.stage + '"): ' + JSON.stringify(response));
};
When I make PhantomJS request the file (via page.open('URL OF THE FILE')), I can see in the PhantomJS log that the file is downloaded. Here are the logs:
"contentType": "text/csv; charset=UTF-8",
"headers": {
"name": "Date",
"value": "Wed, 22 Jul 2015 12:57:41 GMT"
},
"name": "Content-Disposition",
"value": "attachment; filename=\"report.csv\"",
"status":200,"statusText":"OK"
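Going by the logged response shape above, one way to spot the CSV response inside the listener might be to inspect its headers. This is only a sketch: `findHeader` and `isCsvAttachment` are hypothetical helper names, and it assumes `response.headers` is an array of `{name, value}` objects, as the log suggests. As far as I can tell, `onResourceReceived` only exposes this metadata and not the response body.

```javascript
// Hypothetical helper: case-insensitive lookup in a [{name, value}, ...] header list.
function findHeader(headers, name) {
  var list = headers || [];
  var wanted = name.toLowerCase();
  for (var i = 0; i < list.length; i++) {
    if (String(list[i].name).toLowerCase() === wanted) {
      return list[i].value;
    }
  }
  return null;
}

// Hypothetical helper: true when the response looks like a CSV served as an attachment.
// Assumes the response object carries `contentType` and `headers`, as in the log.
function isCsvAttachment(response) {
  var disposition = findHeader(response.headers, 'Content-Disposition') || '';
  var type = response.contentType || findHeader(response.headers, 'Content-Type') || '';
  return /^\s*attachment/i.test(disposition) && /^\s*text\/csv/i.test(type);
}
```

Inside the listener this could be used as `if (response.stage === 'end' && isCsvAttachment(response)) { ... }` to note which request delivered the report.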
I received the file and its content, but how do I access the file data? When I print the current PhantomJS page object, I get the HTML of page A, which I don't want. I want the CSV file, which I need to parse using JavaScript.
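For illustration, one approach that might work is to request the CSV from inside the page context with a synchronous XMLHttpRequest and return the body through `page.evaluate`. This is a sketch under assumptions, not a confirmed solution: `fetchTextInPage` and `parseCsv` are hypothetical names, `page` is assumed to be a PhantomJS page that has already loaded page A, and the CSV URL is assumed to be same-origin with page A (or web security is disabled). The parser is a minimal hand-written one, not a library API.

```javascript
// Hypothetical helper: fetch `url` as text from inside the page context.
// The callback runs in the page, so it only sees the argument passed in.
function fetchTextInPage(page, url) {
  return page.evaluate(function (csvUrl) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', csvUrl, false); // synchronous, so evaluate() can return the body
    xhr.send(null);
    return xhr.status === 200 ? xhr.responseText : null;
  }, url);
}

// Minimal CSV parser: handles quoted fields, doubled quotes and CRLF/LF line ends.
// `delimiter` defaults to ','; an export for a de-DE locale may use ';' instead.
function parseCsv(text, delimiter) {
  var sep = delimiter || ',';
  var rows = [], row = [], field = '', inQuotes = false;
  for (var i = 0; i < text.length; i++) {
    var ch = text.charAt(i);
    if (inQuotes) {
      if (ch === '"') {
        if (text.charAt(i + 1) === '"') { field += '"'; i++; } // escaped quote
        else { inQuotes = false; }
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === sep) {
      row.push(field); field = '';
    } else if (ch === '\n' || ch === '\r') {
      if (ch === '\r' && text.charAt(i + 1) === '\n') { i++; } // treat CRLF as one break
      row.push(field); field = '';
      rows.push(row); row = [];
    } else {
      field += ch;
    }
  }
  // Flush a final row that is not terminated by a line break.
  if (field !== '' || row.length > 0) { row.push(field); rows.push(row); }
  return rows;
}
```

After page A has loaded, `parseCsv(fetchTextInPage(page, 'URL OF THE FILE'))` would then give an array of rows, each an array of string fields.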
page.onFileDownload of the PhantomJS fork. – Artjom B.