0
votes

I'm trying to get weather data from this website:

https://www.ilmeteo.it/meteo/Magenta/previsioni-orarie?refresh_ce

with the code:

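    // googlefirst3, html and array3b are fields declared elsewhere in the class.
    try {
        int i = 0;
        if (googlefirst3.startsWith("http")) {
            Document document = Jsoup.connect("https://www.ilmeteo.it/meteo/Magenta/previsioni-orarie?refresh_ce")
                    .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11 Firefox/19.0")
                    .timeout(0) // 0 means no timeout
                    .get();
            Elements temp = document.select("tr");

            // Full HTML of the fetched page, kept for inspection while debugging.
            String verifica = document.html();
            for (Element movielist : temp) {
                i++;
                // Rows without any <td> (e.g. header rows) would otherwise cause a NullPointerException.
                Element firstCell = movielist.getElementsByTag("td").first();
                if (firstCell == null) {
                    continue;
                }
                html = (i + "|||" + firstCell.html());
                array3b[i] = html;
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }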

I'm trying to get the table rows with temperature, wind and time data:

[Image: the data I'm trying to get]

but I'm unable to get it. The document I get doesn't contain this data and seems to be incomplete. I thought this was due to JavaScript-generated HTML, but even with this method:

How do I get the web page contents from a WebView?

I was unable to get it. I'm not sure JavaScript is the issue. Can anybody help me at least identify the nature of the problem?

Many thanks in advance.

2
I had the same problem. Most sites nowadays also have a mobile version, so their HTML DOM differs from the desktop version. I recommend printing the document/body of the page and changing your selector accordingly. – Anton Makov
I managed to print the entire HTML with the WebView method, but the markup I'm looking for still seems to be missing, even though the WebView actually displays the info. – filiking

2 Answers

2
votes

The page you're trying to parse loads the forecast data through an iframe:

<iframe name="frmprevi" id="frmprevi" 
src="https://www.ilmeteo.it/portale/meteo/previsioni1.php?citta=Magenta&amp;c=3749&amp;gm=25" 
width="660" height="600" marginheight="0" marginwidth="0" scrolling="no"
frameborder="0" style="margin:0px;padding:0px"></iframe>

That's why the data isn't accessible to Jsoup: it fetches only the outer document and does not load the contents of iframes. To get the data you want, request the URL from the iframe's src attribute directly: https://www.ilmeteo.it/portale/meteo/previsioni1.php?citta=Magenta&c=3749&gm=25

Now it should be easy, but be aware that the parameter gm=25 in the URL may represent the 25th day of the month, so you may have to change it to get the data for a different day.
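If that guess about gm is right, the iframe URL could be built per day instead of hard-coded. The sketch below is plain Java with no Jsoup dependency; the class name ForecastUrl and the method forDate are hypothetical, and reading c as a city code and gm as the day of the month is an unverified assumption based on the URL above.

```java
import java.time.LocalDate;

public class ForecastUrl {
    // Endpoint taken from the iframe's src attribute.
    static final String BASE = "https://www.ilmeteo.it/portale/meteo/previsioni1.php";

    /**
     * Builds the iframe URL for a given date.
     * Assumptions (not confirmed by the site): "c" is a city code and
     * "gm" is the day of the month. The city name is assumed to be
     * URL-safe already, as "Magenta" is.
     */
    static String forDate(String city, int cityCode, LocalDate date) {
        return BASE + "?citta=" + city + "&c=" + cityCode + "&gm=" + date.getDayOfMonth();
    }

    public static void main(String[] args) {
        // Prints the URL for today's forecast under the assumptions above.
        System.out.println(forDate("Magenta", 3749, LocalDate.now()));
    }
}
```

Before relying on this, it is worth comparing the generated URL with the src the page actually serves on a few different days.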

2
votes

After more digging around, I found that the data is loaded in an iframe.

You can try something like this:

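// Requires: org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Element,
// android.util.Log, java.io.IOException
// Network calls must not run on the Android main thread, so use a background thread.
Thread {
    try {
        val document: Document =
            Jsoup.connect("https://www.ilmeteo.it/meteo/Magenta/previsioni-orarie?refresh_ce")
                .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11 Firefox/19.0")
                .timeout(2000)
                .get()

        // The forecast lives in an iframe, which Jsoup does not load on its own.
        val iframe: Element? = document.getElementById("frmprevi")
        // absUrl resolves the src against the page URL; the result is empty if src is missing.
        val iframeSrc: String = iframe?.absUrl("src").orEmpty()

        if (iframeSrc.isNotEmpty()) {
            val iframeContentDoc = Jsoup.connect(iframeSrc).get()
            val temps = iframeContentDoc.getElementsByClass("boldval")
            for (temp in temps) {
                Log.d("temps", temp.text())
            }
        }
    } catch (e: IOException) {
        Log.e("temps", "Failed to load forecast", e)
    }
}.start()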

It's in Kotlin, but I think you will understand how to translate it to Java and how to get other information from there as well.
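For reference, a rough Java translation of the same idea might look like the sketch below. It assumes Jsoup is on the classpath; the class and method names (IframeForecast, iframeSrc, boldValues, fetchTemps), the shortened user agent and the 10-second timeout are illustrative choices, while the iframe id frmprevi and the class boldval come from the snippets above. On Android, fetchTemps would still have to be called off the main thread.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class IframeForecast {
    static final String PAGE_URL =
            "https://www.ilmeteo.it/meteo/Magenta/previsioni-orarie?refresh_ce";

    /** Returns the absolute src of the iframe with id "frmprevi", or "" if it is absent. */
    static String iframeSrc(Document page) {
        Element iframe = page.getElementById("frmprevi");
        return iframe == null ? "" : iframe.absUrl("src");
    }

    /** Collects the text of every element with class "boldval". */
    static List<String> boldValues(Document frame) {
        List<String> values = new ArrayList<>();
        for (Element e : frame.getElementsByClass("boldval")) {
            values.add(e.text());
        }
        return values;
    }

    /** Fetches the outer page, follows the iframe, and returns the "boldval" texts. */
    static List<String> fetchTemps() throws IOException {
        Document page = Jsoup.connect(PAGE_URL)
                .userAgent("Mozilla/5.0")
                .timeout(10_000)
                .get();
        String src = iframeSrc(page);
        if (src.isEmpty()) {
            return new ArrayList<>(); // no iframe found, nothing to parse
        }
        Document frame = Jsoup.connect(src)
                .userAgent("Mozilla/5.0")
                .timeout(10_000)
                .get();
        return boldValues(frame);
    }
}
```

Splitting the parsing into iframeSrc and boldValues keeps the network calls in one place, so the selectors can be checked against saved HTML without hitting the site.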