0
votes

Using Excel VBA, I wish to scrape values from two Drop-down lists. One is filled with states' names, and another with cities.

I can scrape the states names, but when I try to scrape the cities names I get nothing. The cities list is filled accordingly to state selected.

How can I list every city on second drop-down list for each state in the first list?

This query gives me only state names and the default value of the second list:

Sub ScrapDropDown()
    Const URL As String = "http://idebescola.inep.gov.br/ideb/consulta-publica"
    Dim XMLPage As New MSXML2.XMLHTTP60
    Dim HTMLDoc As New MSHTML.HTMLDocument
    XMLPage.Open "GET", URL, False
    XMLPage.send
    HTMLDoc.body.innerHTML = XMLPage.responseText
    Set HTMLDocment = HTMLDoc.getElementById("pkCodEstado")
    For i = 1 To HTMLDocment.Length - 1
        Set HTMLpkCodMunicipio = HTMLDoc.getElementById("pkCodMunicipio")
        For Each HTMLMun In HTMLpkCodMunicipio.getElementsByTagName("option")
            Debug.Print i & "-" & HTMLDocment(i).Value & "-" & HTMLDocment(i).innerText & "-" & HTMLMun.Value & "-" & HTMLMun.innerText
        Next HTMLMun
    Next i
End Sub

Part of HTML with drop list i want to scrape (three dots I removed another unwanted lists), where selected (on site) a state from the first list, without selecting id="pkCodMunicipio" have only one option

<form method="post" name="frm" class="classForm" id="frm">
<label for="pkCodEntidade">Por Código</label>
<div class="divRequired">
</div>
<input name="pkCodEntidade" id="pkCodEntidade" placeholder="Código da Escola" title="Por Código" class="onlynumbers" maxlength="8" tabindex="15" type="text" value="">
<hr>
<label id="lbl">Por área de interesse</label>
<div id="lblDivRequired" class="divRequired" style="display: ;">
</div>
<select name="pkCodEstado" id="pkCodEstado" tabindex="16">
<option value="">UF</option>
<option value="12">ACRE</option>
<option value="27">ALAGOAS</option>
<option value="16">AMAPÁ</option>
<option value="13">AMAZONAS</option>
<option value="29">BAHIA</option>
<option value="23">CEARÁ</option>
<option value="53">DISTRITO FEDERAL</option>
<option value="32">ESPÍRITO SANTO</option>
<option value="52">GOIÁS</option>
<option value="21">MARANHÃO</option>
<option value="51">MATO GROSSO</option>
<option value="50">MATO GROSSO DO SUL</option>
<option value="31">MINAS GERAIS</option>
<option value="15">PARÁ</option>
<option value="25">PARAÍBA</option>
<option value="41">PARANÁ</option>
<option value="26">PERNAMBUCO</option>
<option value="22">PIAUÍ</option>
<option value="33">RIO DE JANEIRO</option>
<option value="24">RIO GRANDE DO NORTE</option>
<option value="43">RIO GRANDE DO SUL</option>
<option value="11">RONDÔNIA</option>
<option value="14">RORAIMA</option>
<option value="42">SANTA CATARINA</option>
<option value="35">SÃO PAULO</option>
<option value="28">SERGIPE</option>
<option value="17">TOCANTINS</option>
</select>
<select name="pkCodMunicipio" id="pkCodMunicipio" tabindex="17">
<option value="">Municípios</option>
<option value="1400050">ALTO ALEGRE</option>
<option value="1400027">AMAJARI</option>
<option value="1400100">BOA VISTA</option>
<option value="1400159">BONFIM</option>
<option value="1400175">CANTA</option>
<option value="1400209">CARACARAI</option>
<option value="1400233">CAROEBE</option>
<option value="1400282">IRACEMA</option>
<option value="1400308">MUCAJAI</option>
<option value="1400407">NORMANDIA</option>
<option value="1400456">PACARAIMA</option>
<option value="1400472">RORAINOPOLIS</option>
<option value="1400506">SAO JOAO DA BALIZA</option>
<option value="1400605">SAO LUIZ</option>
<option value="1400704">UIRAMUTA</option>
</select>
...
<button name="btnSearch" class="btnDefault btn btn-warning" title="Buscar" type="submit" id="btnSearch" onclick="void(0);">Buscar</button>
</div>
<input type="hidden" name="undefined" value="undefined">
</form>
1

1 Answers

0
votes

You can use a css selector combination. The below use an id (#) selector to target the parent select tag element in descendant combination with option element selector to get all the child option tag elements.

Dim nodeList As Object, i As Long
Set nodeList = HTMLDoc.querySelectorAll("#pkCodEstado option")
For i = 0 To nodeList.Length-1
    Debug.Print nodeList.item(i).innerText
Next

i is already declared at the top so you don't actually need to declare again. You should use Option Explicit at the top off all modules and thus declare all your variables. You have a number of undeclared variables in your code.