0
votes

Before

Screenshot of Google Sheet before addresses are added

After (Addresses Added)

Screenshot of Google Sheet After addresses are added

I have a spreadsheet with names of venues and I would like to add the addresses of those venues in the next cell using a formula in Google Sheets. I would like to do this using a formula to do a Google Search for the results because I want to scrape many other values as well and quickly change the sheet on my phone as my needs change. Thus I would like to do this without using any outside coding language and not even google apps scripts if I can avoid it.

Google Search for data Google Search for data

I found some information on how to do something similar with an excel formula here:

Using Google Sheets as a basic web scraper

That site shows how to do a google search for an author name of an article with an excel formula with a format like this:

=INDEX(IMPORTXML(A1,"//span[@itemprop='name']"),1)

Can you help me modify this to return the result of a google search into an excel cell?

2

2 Answers

1
votes

The index function could probably be eliminated by someone with more knowledge of XPath, but nonetheless the below seems to work reliably. The following would be pasted into B2 of your sample sheet:

=index(IMPORTXML("http://www.google.com/search?&q="&"Address+of+"&A2&"+in+british+columbia"&"&oq="&"Address+of+"&A2&"+in+british+columbia","//span"), 15, 1)

This is based upon the assumption that all of the venues are located in British Colombia. You could parameterize the Google search section to add a wider array of possible search locations.

0
votes

I had a pretty similar task to achieve recently, and while I wasn't able to apply it in Google sheets - because google is really reluctant to hand over data to scrapers - I did have some success with python. I had to apply a header to trick Google into giving us regular page data so that we can get a uniform class description "sXLaOe" for the queries regarding addresses. Also, I had already made the search queries in excel by appending two different columns together ("Dr." + name + town + "address"), so that google would give us the 'featured snippet' that we could steal.

It's a little slow, so I'm sure someone could make this go a tad quicker, and it requires you to find the class description beforehand, but it's the only solution I could make work for me. Also, sometimes you dont get the suggested address, and you dont want to get ejected out of the for loop, hence the "try" condition.

import requests
from bs4 import BeautifulSoup
import pyexcel as pe


headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36'}
doctorNames = pe.get_array(file_name='../location_here.xlsx') 

for i in doctorNames:
    try:
        url = 'https://google.com/search?q=' + str(i)
        request_result=requests.get(url, headers=headers)
        soup = BeautifulSoup(request_result.text,"lxml")
        mydivs = soup.find("div", {"class": "sXLaOe"}).get_text()
        print(mydivs)
    except:
        print("N/A")

Also, it goes without saying but you'll need to grab yourself the pyexcel and BeautifulSoup modules to make this one run.