
I am scraping details of a particular vessel from MarineTraffic using Node.js. When I run my scraper, it prints []. I am a complete beginner with Node.js and scraping, so any help would be much appreciated. Here is my code:

const axios = require('axios');
const cheerio = require('cheerio');

const getPostDatas = async () => {
    try {
        const { data } = await axios.get(
            'https://www.marinetraffic.com/en/ais/details/ships/shipid:267686/mmsi:246346000/imo:9423841/vessel:CAPEWATER'
        );
        const $ = cheerio.load(data);
        const postDatas = [];

        
        $("vesselDetails_latestPositionSection > div.MuiCollapse-container.MuiCollapse-entered > div > div > div > div > div.MuiGrid-root.MuiGrid-item.MuiGrid-grid-xs-12.MuiGrid-grid-md-true > p:nth-child(1) > b").each((_idx, el) => {
            const postData = $(el).text()
            postDatas.push(postData)
        });
            
        return postDatas;
    } catch (error) {
        throw error;
    }
};
getPostDatas()
.then((postDatas) => console.log(postDatas));

I followed a tutorial and I don't know how to solve this. If I select the element in the browser console, it works, but when I use the same selector in my code, it doesn't seem to work.

That giant selector is probably not being found in the page you got. When I search the HTML that the page contains for just the first part of your selector, vesselDetails_latestPositionSection, I don't find it. - jfriend00
Hmm, I see that vesselDetails_latestPositionSection does exist in the live DOM of the page, but not in the page's HTML source. That means it is created and inserted with JavaScript. You won't be able to get it with Cheerio, because Cheerio only parses the HTML of the page and doesn't run the JavaScript in it. You may need something like Puppeteer, which uses the Chromium engine to actually run the page's JavaScript and so gives you access to the whole DOM. - jfriend00
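To make the Puppeteer suggestion concrete, here is a minimal sketch rather than a tested solution. It assumes Puppeteer is installed (npm install puppeteer) and that vesselDetails_latestPositionSection is an element id, which the thread does not confirm. The function name scrapeLatestPosition and the shortened selector are illustrative choices, and the site may respond differently to a headless browser.

```javascript
// Sketch only: assumes `npm install puppeteer` has been run and that
// vesselDetails_latestPositionSection is an element id (an assumption).
const VESSEL_URL =
    'https://www.marinetraffic.com/en/ais/details/ships/shipid:267686/mmsi:246346000/imo:9423841/vessel:CAPEWATER';
const POSITION_SELECTOR = '#vesselDetails_latestPositionSection p b';

// Hypothetical helper: loads the page in headless Chromium and returns the
// text of every element matching the selector.
async function scrapeLatestPosition(url = VESSEL_URL, selector = POSITION_SELECTOR) {
    // Required here so the file can be loaded even where Puppeteer is absent.
    const puppeteer = require('puppeteer');
    const browser = await puppeteer.launch();
    try {
        const page = await browser.newPage();
        await page.goto(url, { waitUntil: 'networkidle2' });
        // Wait until the page's own JavaScript has inserted the section.
        await page.waitForSelector(selector, { timeout: 30000 });
        // Collect the text of each match, like the original .each() loop.
        return await page.$$eval(selector, (els) =>
            els.map((el) => el.textContent.trim())
        );
    } finally {
        // Always close the browser, even if navigation or waiting failed.
        await browser.close();
    }
}
```

It can be called the same way as the original function: scrapeLatestPosition().then((postDatas) => console.log(postDatas)).catch(console.error). If the selector never appears, waitForSelector rejects after the timeout instead of silently returning [].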
This is the element I am trying to scrape for IMO number 9423841: <p class="MuiTypography-root MuiTypography-body1 MuiTypography-colorTextPrimary MuiTypography-gutterBottom">Position Received: <b>2021-10-09 16:11 UTC<br>2 minutes ago</b></p> - Syed
FYI, use View Page Source in the browser on that page to see the plain HTML that Cheerio sees. Cheerio can only access what appears in View Page Source. - jfriend00
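The same check can be done from Node before writing any selector. This is a rough sketch: isInStaticHtml is a hypothetical helper, and sampleSource is a made-up stand-in for a JavaScript-rendered page, not MarineTraffic's actual markup. With the real page, the data string returned by axios.get would be passed in instead.

```javascript
// Hypothetical helper: reports whether a marker string appears in raw HTML,
// which is all that Cheerio ever gets to parse.
function isInStaticHtml(html, marker) {
    return html.includes(marker);
}

// Made-up stand-in for View Page Source on a JavaScript-rendered page:
// an empty mount point plus a script that fills it in later.
const sampleSource =
    '<html><body><div id="app"></div><script src="bundle.js"></script></body></html>';

// The section is absent from the raw HTML, so a Cheerio selector finds nothing.
console.log(isInStaticHtml(sampleSource, 'vesselDetails_latestPositionSection')); // false
// The empty mount point is present in the raw HTML.
console.log(isInStaticHtml(sampleSource, 'id="app"')); // true
```

If the marker is missing from the raw response, no Cheerio selector will match it, however precise the selector is.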