Throw error when redirecting to authwall #134

@acanimal

Your cookie credentials can become invalid, in which case LinkedIn redirects to the "authwall", where you need to log in again.

The current code simply returns an empty profile object, which later causes an error like: Cannot read property 'name' of undefined at module.exports (xxx/node_modules/scrapedin/src/profile/cleanProfileData.js:5:23)

At least for me, in those cases it is necessary to know whether the profile failed because of an auth error, so I have slightly modified the profile.js file with the following lines:

module.exports = async (browser, cookies, url, waitTimeToScrapMs = 500, hasToGetContactInfo = false, puppeteerAuthenticate = undefined) => {
  ...
  const page = await openPage({ browser, cookies, url, puppeteerAuthenticate })

  let authwall = false;
  page.on('response', response => {
    const status = response.status()
    if ((status >= 300) && (status <= 399)) {
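      // Not every 3xx response carries a Location header (e.g. 304 Not Modified),
      // so default to '' to keep includes() from throwing
      const location = response.headers()['location'] || '';
      if (location.includes('authwall')) {
        authwall = true;
      }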
    }
  })

  const profilePageIndicatorSelector = '.pv-profile-section'
  await page.waitFor(profilePageIndicatorSelector, { timeout: 5000 })
    .catch(() => {
      //why doesn't throw error instead of continuing scraping?
      //because it can be just a false negative meaning LinkedIn only changed that selector but everything else is fine :)
      logger.warn('profile selector was not found')
    })

  // If redirect to authwall is detected throw error
  if (authwall) {
    const msg = 'Redirected to authwall :( You need new credentials';
    logger.warn(msg);
    throw new Error(msg);
  }

  ...
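
The redirect check above could also be pulled out into a small pure function, which makes it easy to test without a browser. The following is only a sketch: `isAuthwallRedirect` and `AuthwallError` are hypothetical names that do not exist in scrapedin, and it assumes the headers object uses lower-case keys, as Puppeteer's `response.headers()` does. A dedicated error type is one possible way to let callers tell an auth failure apart from other scraping errors.

```javascript
// Hypothetical helper (not part of scrapedin): returns true when a response
// is a 3xx redirect whose Location header points at the authwall.
function isAuthwallRedirect(status, headers) {
  if (status < 300 || status > 399) return false
  // Location may be absent (e.g. 304 Not Modified), so fall back to ''
  const location = (headers && headers.location) || ''
  return location.includes('authwall')
}

// Hypothetical error type so callers can distinguish auth failures
class AuthwallError extends Error {
  constructor(message = 'Redirected to authwall: new credentials are needed') {
    super(message)
    this.name = 'AuthwallError'
  }
}

// Usage sketch with made-up sample responses
const samples = [
  { status: 302, headers: { location: 'https://www.linkedin.com/authwall?x=1' } },
  { status: 304, headers: {} },
  { status: 200, headers: { location: 'https://www.linkedin.com/authwall' } }
]
const results = samples.map(r => isAuthwallRedirect(r.status, r.headers))
console.log(results)
```

Inside the `page.on('response', ...)` handler, the flag could then be set with `isAuthwallRedirect(response.status(), response.headers())`, and the final check could throw `new AuthwallError()` instead of a plain `Error`, so that calling code can branch on `err instanceof AuthwallError`.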

I don't know if this is something you want to integrate in the project. If so, let me know and I will send a PR.

Thanks in advance.
