I can scrape the post count for an Instagram hashtag with the code below.

from selenium import webdriver

driver = webdriver.Firefox()

ig_link = 'https://www.instagram.com/explore/tags/100x35/'

driver.get(ig_link)

# Scrape Posts Count
posts_count = driver.find_element_by_xpath('//*[@id="react-root"]/section/main/header/div[2]/div[1]/div[2]/span/span').text

print(posts_count)
driver.close()

The problem I have is when a hashtag has a flag emoji in it, for example:

https://www.instagram.com/explore/tags/100x35🇵🇷/

from selenium import webdriver

driver = webdriver.Chrome()

ig_link = 'https://www.instagram.com/explore/tags/100x35🇵🇷/'

driver.get(ig_link)

# Scrape Posts Count
posts_count = driver.find_element_by_xpath('//*[@id="react-root"]/section/main/header/div[2]/div[1]/div[2]/span/span').text

print(posts_count)
driver.close()

I get the following error:

Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="react-root"]/section/main/header/div[2]/div[1]/div[2]/span/span"}

1 Answer

I believe you need to percent-encode the URL so that the flag is represented by its UTF-8 bytes. In this case, you can replace the

🇵🇷

with

%F0%9F%87%B5%F0%9F%87%B7

to get an ASCII-only URL that yields the same results. This tool should be useful if you are doing this manually: link

urllib also has a function for this if you want to do it automatically in Python:

>>> import urllib.parse
>>> query = 'Hellö Wörld@Python'
>>> urllib.parse.quote(query)
'Hell%C3%B6%20W%C3%B6rld%40Python'

More on that here
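
Applied to the hashtag from the question, the same call can build the whole link before it is passed to driver.get(). This is only a minimal sketch: hashtag_url is a hypothetical helper name, not part of urllib or Selenium, and the flag is written as two escaped regional indicator code points so the source stays ASCII.

```python
from urllib.parse import quote


def hashtag_url(tag: str) -> str:
    """Build an Instagram hashtag URL with the tag percent-encoded as UTF-8.

    hashtag_url is a hypothetical helper used for illustration only.
    """
    # safe='' also encodes '/', so the tag can never add an extra path segment.
    return 'https://www.instagram.com/explore/tags/' + quote(tag, safe='') + '/'


# '100x35' followed by the Puerto Rico flag (U+1F1F5 U+1F1F7)
ig_link = hashtag_url('100x35\U0001F1F5\U0001F1F7')
print(ig_link)
# https://www.instagram.com/explore/tags/100x35%F0%9F%87%B5%F0%9F%87%B7/
```

The resulting ig_link can then replace the hard-coded string in the question's script. Tags that are already plain ASCII letters and digits pass through unchanged, so the helper should be safe to use for every hashtag.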