1
votes

I have a TIdHTTP (Indy) component that retrieves a web page. I parse the returned HTML text and display the gathered data in a Delphi form.

The code works fine, but when more than 10 records are found, the website shows a link that calls a Next() JavaScript function, which loads the next 10 records, and so on.

Is there something I can do with TIdHTTP to execute the Next() function?

The code I'm using to retrieve the html text is as follows:

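procedure TForm1.ObtemStringsCorreio(aParamEntrada: string; var aRetorno: TStringList);
var
    _ParamList : TStringList;
begin
    _ParamList := TStringList.Create;
    // Enter the try block right after Create so the list is freed
    // even if one of the calls below raises an exception.
    try
        // Form fields sent with the search POST
        _ParamList.Add('cepEntrada=' + aParamEntrada);
        _ParamList.Add('tipoCep=ALL');
        _ParamList.Add('cepTemp=');
        _ParamList.Add('metodo=buscarCep');

        // POST the form and keep the returned HTML for parsing
        aRetorno.Text := idhtp1.Post(cEngineCorreios, _ParamList);
        // Assigning Text replaces the memo's previous content
        mmo1.Text := aRetorno.Text;
    finally
        _ParamList.Free;
    end;
end;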
1
It seems like you're trying to scrape a website for content. Are you sure that there are no APIs available that would allow you to access the data directly? Scraping is a method of last resort, generally discouraged by site owners, and it is naturally fragile. - J...
Thank you for the answer. I have already read those posts. They use an unofficial "Correios" database, and "Correios" does not offer web services. - Leo Bruno
You cannot execute JavaScript with Indy. You would have to download and parse the HTML and JavaScript to figure out what kind of HTTP request the script generates when the link is clicked (or just analyze the HTML/JavaScript yourself and hard-code the behavior into your code). You can then send that new HTTP request using Indy to retrieve the next batch of data. - Remy Lebeau
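To illustrate the approach in the comment above, here is a minimal sketch of replaying the paging request with Indy. It assumes the site's Next() script simply re-submits the same form with an extra paging field. The field name 'pagina', the unit name and the FetchPage helper are all hypothetical; the real field names have to be read from the page's script or from the browser's network log.

```delphi
unit CorreiosPaging;

interface

uses
  Classes, SysUtils, IdHTTP;

// Hypothetical helper: posts the search form for one page of results.
function FetchPage(AHttp: TIdHTTP; const AUrl, ACep: string;
  APage: Integer): string;

implementation

function FetchPage(AHttp: TIdHTTP; const AUrl, ACep: string;
  APage: Integer): string;
var
  Params: TStringList;
begin
  Params := TStringList.Create;
  try
    // Same fields as the original search request
    Params.Add('cepEntrada=' + ACep);
    Params.Add('tipoCep=ALL');
    Params.Add('cepTemp=');
    Params.Add('metodo=buscarCep');
    // ASSUMPTION: 'pagina' is a placeholder for whatever field(s)
    // the site's Next() function actually sends.
    Params.Add('pagina=' + IntToStr(APage));
    // TIdHTTP.Post with a TStrings source sends the fields form-encoded
    Result := AHttp.Post(AUrl, Params);
  finally
    Params.Free;
  end;
end;

end.
```

The caller would invoke FetchPage with increasing page numbers until a response contains no more records.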

1 Answer

3
votes

Indy is a communications library. It has no means of executing client-side script. You will need to use another library for that.

A headless browser would be the ideal solution. A more heavyweight solution would be to embed a browser in a hidden form and have it do the work. You could use TWebBrowser, Chromium, etc. for this purpose.
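As a rough sketch of the embedded-browser route, the unit below uses a TWebBrowser that has already navigated to the results page. It calls the page's script through late-bound OLE automation and then reads back the rendered HTML. The unit and routine names are hypothetical, and the script text 'Next()' is taken from the question, so its exact name and casing should be checked against the page. execScript is part of the older Internet Explorer document model and may not be available in newer document modes, so treat this as an assumption to verify.

```delphi
unit BrowserPaging;

interface

uses
  SHDocVw;

// Hypothetical helpers; call them only after the page has loaded,
// for example from the browser's OnDocumentComplete event.
procedure CallNextOnPage(ABrowser: TWebBrowser);
function GetRenderedHtml(ABrowser: TWebBrowser): string;

implementation

procedure CallNextOnPage(ABrowser: TWebBrowser);
var
  Doc, Win: OleVariant;
begin
  // Do nothing until the document is fully loaded
  if ABrowser.ReadyState <> READYSTATE_COMPLETE then
    Exit;
  if ABrowser.Document = nil then
    Exit;
  Doc := ABrowser.Document;
  Win := Doc.parentWindow;
  // Run the page's own paging function inside the browser
  Win.execScript('Next()', 'JavaScript');
end;

function GetRenderedHtml(ABrowser: TWebBrowser): string;
var
  Doc: OleVariant;
begin
  Result := '';
  if ABrowser.Document = nil then
    Exit;
  Doc := ABrowser.Document;
  // HTML of the page as currently rendered, including script changes
  Result := Doc.documentElement.outerHTML;
end;

end.
```

The next batch of records loads asynchronously, so GetRenderedHtml should be called only once the page has finished updating (for instance on the next OnDocumentComplete, or after polling for the new rows), not immediately after CallNextOnPage.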