I'm unable to use the Perl module LWP::Parallel::UserAgent with HTTPS websites. This is the code I use:

#!/usr/bin/perl

use LWP::Parallel::UserAgent qw(:CALLBACK);
use HTTP::Request; 

my $BrowserName = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.85 Safari/537.36";

my $pua = LWP::Parallel::UserAgent->new();
$pua->agent( $BrowserName );
$pua->nonblock('true');
$pua->in_order  (1);
$pua->duplicates(0);
$pua->timeout   (10);
$pua->redirect  (2);
$pua->remember_failures ( 1 );

my $url = "https://www.squeezemind.it";

my $res = $pua->register( HTTP::Request->new('GET', $url), \&gestione_risposta, 4096 );

my $entries = $pua->wait();

# Read the responses
foreach (keys %$entries) {
  my $res = $entries->{$_}->response;
  print "\n\nAnswer for '",$res->request->url, "' was ", $res->code,": ", $res->message;  
}


sub gestione_risposta {

  my($html_content, $response, $protocol, $entry) = @_;

  if( !$response->is_success || $response->code != 200 ) { return C_ENDCON; }  

  if( length($html_content) ) {    
    # Bla Bla
  }

  return undef; 

}

It works well for HTTP, but if you set $url to an HTTPS website it fails.

For https://www.squeezemind.it:

Error code: 500 Message: Can't locate object method "get_cipher" via package "IO::Socket::INET" at /usr/share/perl5/LWP/Protocol/https.pm line 119

For https://www.stackoverflow.com:

Error code: 402 Message: Unexpected EOF while reading response

The system is up to date. Suggestions?

Thank you!

Net::Curl::Multi would be so much faster (but you lose the familiar LWP interface and portability to Windows). - ikegami
"The system is up to date." - whatever that means exactly for the versions of the software you are using. Please add the versions of LWP, LWP::Protocol::https, IO::Socket::SSL, Net::SSLeay and openssl you are using to the question - see here for a script to do this. Also please run your code with perl -MIO::Socket::SSL=debug9 program.pl and add the output to your question. And please also try it with a plain LWP::UserAgent instead of LWP::Parallel::UserAgent. - Steffen Ullrich
My suggestion would be to use a user agent designed for parallelism to begin with, such as Mojo::UserAgent or Net::Async::HTTP. - Grinnz
Re "The system is up to date.": that's not true. The call to get_cipher isn't on line 119 of the latest version of LWP::Protocol::https. - ikegami
I installed all modules using cpan on an up-to-date Ubuntu. I don't know whether cpan installs outdated modules. These are the installed versions: LWP 6.36, LWP::Protocol::https 6.07, IO::Socket::SSL 2.056, Net::SSLeay 1.84. - user3817605

1 Answer

From your code:

$pua->nonblock('true');

Looking at the code of LWP::Parallel::UserAgent, non-blocking support for HTTPS appears to be totally broken. HTTPS support is implemented in LWP::Parallel::Protocol::https, which derives from LWP::Parallel::Protocol::http for making the actual connection. The relevant code in sub _connect:

    unless ($nonblock) {
      # perform good ol' blocking behavior
      #
      # this method inherited from LWP::Protocol::http
      $socket = $self->_new_socket($host, $port, $timeout);
      # currently empty function in LWP::Protocol::http
      # $self->_check_sock($request, $socket);
    } else {
      # new non-blocking behavior
      ...
      $socket =
        IO::Socket::INET->new(Proto => 'tcp', # Timeout => $timeout,
                              $self->_extra_sock_opts ($host, $port));

One can see that in the (default) blocking case the code uses the functionality of LWP::Protocol::http, but in the non-blocking case it creates an IO::Socket::INET directly, and not an IO::Socket::SSL as HTTPS requires. LWP::Protocol::https (which is used later) expects an SSL socket and tries to call get_cipher on it. This results in the error you see:

Can't locate object method "get_cipher" via package "IO::Socket::INET" at /usr/share/perl5/LWP/Protocol/https.pm line 119
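
The same message can be reproduced without any network access. A minimal sketch, using only the core IO::Socket::INET module: the class has no get_cipher method (IO::Socket::SSL is the one that provides it), so calling it dies exactly as above. The variable names are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# IO::Socket::INET is a plain TCP socket class and does not define get_cipher.
my $has_method = IO::Socket::INET->can('get_cipher') ? 1 : 0;
print "IO::Socket::INET can get_cipher: $has_method\n";

# Calling the missing method dies; capture the message with eval.
my $error = '';
eval { IO::Socket::INET->get_cipher; 1 } or $error = $@;
print "Error: $error";
```

This is why the error names the package IO::Socket::INET: the object handed to the HTTPS code is a plain TCP socket.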

Without non-blocking mode, the code seems to work.
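
A minimal sketch of that change, assuming the rest of the setup stays as in the question and that LWP::Parallel::UserAgent is installed. Note that the string 'true' is a true value in Perl, so the original call did switch non-blocking mode on. Whether blocking mode is reliable for a given site is not guaranteed, so treat this as something to try rather than a fix.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Parallel::UserAgent;
use HTTP::Request;

my $pua = LWP::Parallel::UserAgent->new();
$pua->nonblock(0);   # blocking connect; any true value selects the non-blocking path
$pua->timeout(10);
$pua->redirect(2);

# register() returns an error response if the request could not be queued
if (my $err = $pua->register(HTTP::Request->new(GET => 'https://www.squeezemind.it'))) {
    print STDERR $err->error_as_HTML;
}

my $entries = $pua->wait();
for my $key (keys %$entries) {
    my $res = $entries->{$key}->response;
    print "Answer for '", $res->request->url, "' was ", $res->code, ": ", $res->message, "\n";
}
```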

As for HTTPS support in this module in general, see its README.SSL:

 ** DISCLAIMER: https support is pretty buggy as of now. i haven't **
 ** had time to test much of it, so feel free to see if it works   **
 ** for you, but don't expect it to :-)  

In other words: you should probably use a different module to get reliable support for HTTPS.
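
As one possible alternative, here is a rough sketch using Mojo::UserAgent (suggested in the comments) to fetch several HTTPS URLs concurrently. It assumes a reasonably recent Mojolicious with promise support and IO::Socket::SSL installed for TLS; the URL list and the timeout and redirect values are only illustrative, mirroring the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Mojo::UserAgent;
use Mojo::Promise;

my @urls = ('https://www.squeezemind.it', 'https://www.stackoverflow.com');

my $ua = Mojo::UserAgent->new;
$ua->connect_timeout(10)->max_redirects(2);

# Start one non-blocking GET per URL; each returns a promise.
my @promises = map {
    my $url = $_;
    $ua->get_p($url)->then(
        sub {
            my ($tx) = @_;
            my $res = $tx->result;
            print "Answer for '$url' was ", $res->code, ": ", $res->message, "\n";
        },
        sub {
            my ($err) = @_;    # connection-level failure (DNS, TLS, timeout, ...)
            print "Request for '$url' failed: $err\n";
        }
    );
} @urls;

# Run the event loop until every request has finished.
Mojo::Promise->all(@promises)->wait;
```

Each request has its own failure handler, so one unreachable host does not stop the others from being reported.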