I wrote some PHP code that downloads an XML file, sent via mail from Google AdWords, and puts the data into a MySQL database.
One of the functions downloads the file to a webspace, but the size of the result differs from that of the file I downloaded manually via Chrome (1.8 MB vs. 1.6 MB). Visually, there is no difference between the two.
The manually downloaded file can be processed, but the file downloaded via curl can't be processed by SimpleXML.
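To narrow down why SimpleXML rejects the curl download, a small diagnostic along the following lines might help. This is only a sketch: the helper name diagnoseXmlFile is hypothetical and not part of my code, and it only checks for a gzip signature and collects the libxml parse errors.

```php
<?php
// Hypothetical helper: collect hints about why a file fails to parse as XML.
function diagnoseXmlFile(string $path): array
{
    $report = array();

    $data = file_get_contents($path);
    if ($data === false) {
        return array('cannot read file');
    }

    // Gzip streams start with the magic bytes 0x1f 0x8b.
    if (strncmp($data, "\x1f\x8b", 2) === 0) {
        $report[] = 'file looks gzip-compressed';
    }

    // Collect libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();
    if (simplexml_load_string($data) === false) {
        foreach (libxml_get_errors() as $error) {
            $report[] = sprintf('line %d: %s', $error->line, trim($error->message));
        }
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $report;
}
```

An empty array means the file parsed without errors; otherwise the entries point at the first things to look at.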
Here is the code of the download function:
function downloadUrlToFile( $url, $outFileName ) {
    if ( is_file( $url ) ) {
        copy( $url, $outFileName );
    } else {
        $options = array(
            CURLOPT_FILE => fopen( $outFileName, 'w' ),
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_HEADER => false,
            CURLOPT_TIMEOUT => 28800,
            CURLOPT_URL => $url,
            CURLOPT_HTTPHEADER => array(
                'Host hostname',
                'User-Agent Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:42.0) Gecko/20100101 Firefox/42.0',
                'Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
                'Accept-Language en-US,en;q=0.5',
                'Accept-Encoding gzip, deflate',
                'Connection keep-alive',
            )
        );
        $ch = curl_init();
        curl_setopt_array( $ch, $options );
        curl_exec( $ch );
    }
}
Edit:
Here is the new code, which does not work either. I added the missing colons (:) to the headers, but now the file is almost empty (2 KB).
function downloadUrlToFile( $url, $outFileName ) {
    if ( is_file( $url ) ) {
        copy( $url, $outFileName );
    } else {
        $options = array(
            CURLOPT_FILE => fopen( $outFileName, 'w' ),
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_HEADER => false,
            CURLOPT_TIMEOUT => 28800,
            CURLOPT_URL => $url,
            CURLOPT_HTTPHEADER => array(
                'Host: '.gethostname(),
                'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:42.0) Gecko/20100101 Firefox/42.0',
                'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
                'Accept-Language: en-US,en;q=0.5',
                'Accept-Encoding: gzip, deflate',
                'Connection: keep-alive',
            )
        );
        $ch = curl_init();
        curl_setopt_array( $ch, $options );
        curl_exec( $ch );
    }
}
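For comparison, here is a variant that may be worth trying. It is a sketch under assumptions, not a confirmed fix. gethostname() returns the name of the local machine, so the Host header above no longer matches the host in the URL, which could explain the 2 KB response. Setting Accept-Encoding by hand asks for a compressed body without telling curl to decompress it, whereas CURLOPT_ENCODING lets curl handle both. The function name downloadUrlToFileChecked is hypothetical, and whether compression is really behind the size difference is unverified.

```php
<?php
// Hypothetical hardened variant of downloadUrlToFile(); returns true on success.
function downloadUrlToFileChecked(string $url, string $outFileName): bool
{
    // Local paths are copied directly, as in the original function.
    if (is_file($url)) {
        return copy($url, $outFileName);
    }

    // Binary mode avoids newline translation on Windows.
    $fp = fopen($outFileName, 'wb');
    if ($fp === false) {
        return false;
    }

    $ch = curl_init();
    curl_setopt_array($ch, array(
        CURLOPT_URL            => $url,
        CURLOPT_FILE           => $fp,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_HEADER         => false,
        CURLOPT_TIMEOUT        => 28800,
        // Empty string: curl advertises the encodings it supports and decodes the body itself.
        CURLOPT_ENCODING       => '',
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:42.0) Gecko/20100101 Firefox/42.0',
        // No Host header here: curl derives it from the URL.
        CURLOPT_HTTPHEADER     => array(
            'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language: en-US,en;q=0.5',
        ),
    ));

    $ok     = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($ok === false) {
        error_log('curl error: ' . curl_error($ch));
    }

    // Flush and release the output file before anything else reads it.
    fclose($fp);

    return $ok !== false && $status >= 200 && $status < 300;
}
```

Checking the return value and the HTTP status at least shows whether the 2 KB file is an error page rather than the report.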