1
votes

I'm trying to scrape the number of posts to a given hashtag (#castles) and populate a Google Sheet cell using ImportXML.

I tried copying the Xpath from Chrome and paste it to the ImportXML parameter in the cell like this:

=ImportXML("https://www.instagram.com/explore/tags/castels/", "//*[@id="react-root"]/section/main/header/div[2]/div/div[2]/span/span")

I saw there is a problem with the quotation marks so I also tried:

=ImportXML("https://www.instagram.com/explore/tags/castels/", "//*[@id='react-root']/section/main/header/div[2]/div/div[2]/span/span")

Nevertheless, both return an error.

What am I doing wrong?

P.S. I am aware of the Xpath to the meta tag description "//meta[@name='description']/@content" however I would like to scrape the exact number of posts and not an abbreviated number.

1
Would an apps script based solution work or are you looking for it to be accomplished only by using the =IMPORTXML function? I figured out the formula but it doesn't work with the caveat of Result too large =REGEXEXTRACT(ImportXML("https://www.instagram.com/explore/tags/cats/", "//body/script[1]"),"edge_hashtag_to_media[[:punct:]][[:punct:]][[:punct:]][[:punct:]]count[[:punct:]][[:punct:]](\d+)\,[[:punct:]]page_info[[:punct:]]") Edit note: doesn't workSourabh Choraria
I'm curious... how would an apps script function ?Elad Ratson
Here you go - this of course, is just an example implementation - function hashCount() { var url = 'instagram.com/explore/tags/cats'; var response = UrlFetchApp.fetch(url).getContentText(); var regex = /(edge_hashtag_to_media":{"count":)(\d+)(,"page_info":)/gm; var count = regex.exec(response)[2]; Logger.log(count) }Sourabh Choraria
returns an empty value :-(Elad Ratson

1 Answers

1
votes

Try this -

function hashCount() {
  var url = 'instagram.com/explore/tags/cats/';
  var response = UrlFetchApp.fetch(url, {muteHttpExceptions: true}).getContentText();
  var regex = /(edge_hashtag_to_media":{"count":)(\d+)(,"page_info":)/gm;
  var count = regex.exec(response)[2];
  Logger.log(count);
}

Demo -

Instagram Hashtag count

I've added muteHttpExceptions: true which was not added in my comment above. Hope this helps.