6
votes

I thought I had a pretty good catch for those rare timeouts that I get from Puppeteer, but somehow this timeout is not caught by any of my handlers. My question is: why?

Here is the code:

var readHtml = (url) => {
    return new Promise( async (resolve,reject)=> {

        var browser = await puppeteer.launch()
        var page    = await browser.newPage()

        await page.waitForSelector('.allDataLoaded')

            .then(() => {
                console.log ("Finished reading: " + url)
                return resolve("COOL");
            })

            .catch((err) => {
                console.log ("Timeout or other error: ", err)
                return resolve("TRYAGAIN");
            });
})}

And here is the error:

(node:23124) UnhandledPromiseRejectionWarning: Error: Navigation Timeout Exceeded: 30000ms exceeded at Promise.then 

(node:23124) UnhandledPromiseRejectionWarning: 
Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 2)

I did some research which suggested it might be because some URLs have not finished loading inside the Puppeteer newPage().

But how come this does not get caught by my .catch?

I need it to "TRYAGAIN" in case it fails for whatever reason. Right now it just stops with the error and does nothing.

2 Answers

11
votes

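You're properly catching the waitForSelector call and its chained promises, but you're not doing the same for the launch and newPage calls: they're not connected to the later catch. If either of them rejects, the async executor function itself rejects, and the Promise constructor ignores the executor's return value, so the rejection goes unhandled.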

Because async functions automatically return Promises already, you might consider avoiding the Promise constructor entirely:

var readHtml = async (url) => {
  var browser, page
  try {
    browser = await puppeteer.launch()
    page    = await browser.newPage()
  } catch (e) {
    // handle initialization error; page is unusable here, so bail out
    console.log("Initialization error: ", e)
    return "TRYAGAIN"
  }

  // there is no Promise constructor any more, so return the values
  // instead of calling resolve()
  return page.waitForSelector('.allDataLoaded')
    .then(() => {
      console.log("Finished reading: " + url)
      return "COOL"
    })
    .catch((err) => {
      console.log("Timeout or other error: ", err)
      return "TRYAGAIN"
    })
}

Or, you might consider putting the catch in the consumer of readHtml:

var readHtml = async (url) => {
  var browser = await puppeteer.launch()
  var page    = await browser.newPage()
  await page.waitForSelector('.allDataLoaded')
  console.log ("Finished reading: " + url)
};
readHtml(someurl)
  .catch((e) => console.log('err: ' + e));
7
votes

The tip I'll give you is that you can catch errors at each step of Puppeteer since each returns a promise.

So instead of a try / catch block you can, if you feel the need to, do the following:

const browser = await puppeteer
  .launch()
  .catch(function (error) {
    /* Handle error here for Puppeteer launch and return
       expected value for browser if things fail */
    console.log(error);
  });

const page = await browser
  .newPage()
  .catch(function (error) {
    /* Handle error here for browser new page and return
       expected value for page if things fail */
    console.log(error);
  });

This, for me, is a much cleaner way of catching any expected exceptions at each step.