0
votes

I am trying to search for the right method to submit HTML form with the data from excel cell and retrieve part of the result back.

HTML form URL is http://www2.stat.gov.lt:8777/imones/sektor.html:

<form action="sektor.chk_sekt" method="POST">
<br>
<b>Ūkio subjekto kodas: </b>
<input type="text" name="imone01" size="9" maxlength="9">
<br>
<br>
<input type="submit" value="  OK  ">
</form>

Example data strings to submit are 303305024, 300983557, the value to be extracted from the response on the page http://www2.stat.gov.lt:8777/imones/sektor.chk_sekt is the line:

<BR>
<B>Veiklos rūšis pagal EVRK red. 2: </B>
479100 - Užsakomasis pardavimas paštu arba internetu
<BR>

The values from each cell in column A should be submitted within loop, and the retrieved results should be filled into corresponding cells in column B.

I have reviewed several similar questions but they seem to be using some different format of form and doesn't suit in this case.

1

1 Answers

0
votes

Here is the example using XHR and RegEx to retrieve the data you need:

Option Explicit

Sub RetriveData()

    Dim i As Long

    For i = 1 To Cells.Rows.Count
        If Cells(i, 1).Value = "" Then Exit For
        Cells(i, 2).Value = GetData(Cells(i, 1).Value)
    Next

End Sub

Function GetData(sCompany As String) As String

    Dim sContent As String

    With CreateObject("MSXML2.XMLHTTP")
        .Open "POST", "http://www2.stat.gov.lt:8777/imones/sektor.chk_sekt", False
        .Send "imone01=" & sCompany
        sContent = .ResponseText
    End With
    With CreateObject("VBScript.RegExp")
        .Global = True
        .MultiLine = True
        .IgnoreCase = True
        .Pattern = "<head>[\s\S]*?</head>|(?!<br>)<[^>]*>|[\r\n\t]*"
        sContent = .Replace(sContent, "")
        .Pattern = "<BR>Veiklos r\u016B\u0161is pagal EVRK red. 2: (.*?)<BR>"
        With .Execute(sContent)
            If .Count = 1 Then GetData = .Item(0).SubMatches(0) Else GetData = "n/a"
        End With
    End With

End Function

The output is as follows for me:

output