0
votes

Background

I need to import multiple data values from a URL (always the same data values to import, URL can change - user inserts whatever URL she/he needs).

Problem

  • Multiple =IMPORTXML()s will cause the spreadsheet to get stuck/not import data properly for all cases.
  • I investigated the option to use GAS script that will work similarly to =IMPORTXML() and paste the data to the spreadsheet, but it seems there is no parallel or generic-enough solution for complex XPaths.

Possible Solution

To have the entire html code of a URL as a string in a single cell and then query this string using =REGEXMATCH() to find what I need.

The advantage would be that this can allow me to access the URL only once and then do the checks localy, which will make it much faster.

Expectation

<html>
  <head>
  ... <!-- all children -->
  </head>
  <body>
  ... <!-- all children -->
  </body>
</html>

(in a one/several cells)

What I've tried

I tried using =IMPORTXML(URL,"//html"), but it does not import the actual tags (that are required in this case for context).

Is there another way?

1
Try UrlFetchapp.fetch(url,options) Note: If you have never used this type of command before it can be quite complex so read carefully and some trial and error should be expected. You may have to study the websites api is they have one. Some sites load data after the page is rendered you can find solutions within SO if you search for them - Cooper
Thanks @Cooper, I guess that means that there is no workaround from using GAS. Right now I get what I need with =IMPORTXML() + replacement of the results with their values via GAS, but by trail and error I learned that the latter can't be done in one batch, but rather 1 by one and this is time consuming. Using the UrlFetchApp class will require script.external_request scope, which is also a bit of a problem due to the way my specific case is built. - Yaniv Aflalo
Remember that cell size is limited to 50 000 characters - Krzysztof Dołęgowski
You can display all the URL data using IMPORTDATA("url") In most cases I suggest limiting this using query or array_constrain - Krzysztof Dołęgowski
IMPORTDATA is intended to import CVS alike files, so be prepared to handle content split across columns and rows. - Rubén

1 Answers

2
votes

You can display all the URL data using IMPORTDATA("url") In most cases I suggest limiting this using query or array_constrain as website content may be very long.

Reference: