Get company name from a Google Finance page with Python

Question

I would like to print the company name from the Google Finance page, using the div with class appbar-snippet-primary. The code I am using returns None or []. I wasn't able to get to the span tag containing the company name using BeautifulSoup.

from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen('https://www.google.com/finance?q=F')
soup = BeautifulSoup(html, "html.parser")
x = soup.find(id='appbar-snippet-primary')
print(x)

Thank you for the explanation. I have updated the code as you suggested and included the stock price, created a loop, then stored the information in a dictionary.

from bs4 import BeautifulSoup
import requests

x = ('F', 'GE', 'GOOGL')
Company = {}

for i in x:
    head = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"}
    html = requests.get('https://www.google.com/finance?q=%s' % i, headers=head).content
    soup = BeautifulSoup(html, "html.parser")
    c = soup.find("div", class_="appbar-snippet-primary").text
    p = soup.find('span', class_='pr').span.text
    Company.update({c: p})
for k, v in Company.items():
    print('{:<30} {:>8}'.format(k, v))

bakkal · Accepted Answer · 2016-07-03T17:02:47

It's a class, not an ID

The element you're interested in looks like this:

<div class="appbar-snippet-primary">
    <span>Ford Motor Company</span>
</div>

So it's a div with class="appbar-snippet-primary", not id="appbar-snippet-primary" as your code implies.
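To make the difference concrete, here is a minimal sketch that runs BeautifulSoup against the markup shown above rather than the live page. The SAMPLE_HTML string is just that snippet pasted in, so this only illustrates the id-versus-class lookup and says nothing about what the real page returns.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the snippet above (not fetched from the live page).
SAMPLE_HTML = """
<div class="appbar-snippet-primary">
    <span>Ford Motor Company</span>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Searching by id finds nothing, because the div has no id attribute.
by_id = soup.find(id="appbar-snippet-primary")

# Searching by class matches the div; class_ avoids the Python keyword `class`.
by_class = soup.find("div", class_="appbar-snippet-primary")
company_name = by_class.span.get_text(strip=True)

print(by_id)
print(company_name)
```

The id lookup yields None, which matches the behaviour described in the question, while the class lookup reaches the span text.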

That value isn't in the raw HTML, it requires JS to execute first

However, there is a deeper problem: that div isn't populated until the JavaScript on the page runs. Downloading the raw HTML and running BeautifulSoup on it won't work, because the JS hasn't executed at that point.

One of the script tags in that raw HTML contains var _companyName = 'Ford Motor Company';, so you can search for _companyName = if you insist on using the raw HTML.
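If you go the raw-HTML route, a regular expression is one way to pull that value out. The following is a rough sketch: SAMPLE_RAW_HTML is a stand-in for the downloaded page (only the _companyName line comes from the real page), and extract_company_name is a hypothetical helper name. It assumes the value is single-quoted and contains no escaped quotes.

```python
import re

# Stand-in for the downloaded page; only the _companyName line is taken
# from the real HTML, the surrounding script tag is assumed.
SAMPLE_RAW_HTML = """
<script>
var _companyName = 'Ford Motor Company';
</script>
"""

# Matches: _companyName = '<anything but a single quote>'
COMPANY_NAME_RE = re.compile(r"_companyName\s*=\s*'([^']*)'")


def extract_company_name(raw_html):
    """Return the value assigned to _companyName, or None if it is absent."""
    match = COMPANY_NAME_RE.search(raw_html)
    return match.group(1) if match else None


print(extract_company_name(SAMPLE_RAW_HTML))
```

In practice you would pass the decoded response body instead of the sample string. This approach is brittle: it breaks as soon as the page renames the variable or changes its quoting.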

Use Selenium

You can use Selenium: it drives an actual browser and runs the JS, so you can then find the element by its class.

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("https://www.google.com/finance?q=F")

# Selenium 4 removed find_element_by_css_selector; use find_element with By
div = driver.find_element(By.CSS_SELECTOR, '.appbar-snippet-primary')
company_name = div.text
print(company_name)

# quit() ends the whole browser session, not just the current window
driver.quit()

I get:

Ford Motor Company
