0
votes

I am trying to scrape data from Yahoo Finance: for each stock, I want to get the historical data. Taking Apple as an example, I need to go to https://finance.yahoo.com/quote/AAPL/history?p=AAPL and choose "MAX" under "Time Period".

I believe the script I have written so far finds the date element, but clicking on it so that I can choose "MAX" is not working.

Here is my whole script:

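import os

from selenium import webdriver
from selenium.common.exceptions import TimeoutException, WebDriverException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# using linux here
project_path = os.getcwd()
driver_path = project_path + "/" + "chromedriver"
yahoo_finance = "https://finance.yahoo.com/quote/"
driver = webdriver.Chrome(driver_path)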


def get_data(symbol='AAPL'):
    stock_history_link = yahoo_finance + symbol + '/history?p=' + symbol
    driver.get(stock_history_link)
    date_picker = '//div[contains(@class, "D(ib)") and contains(@class, "Pos(r)") and contains(@class, "Cur(p)")' \
                  'and contains(@class, "O(n):f")]'
    try:
        print("I am inside")
        date_picker_2 = "//div[@class='Pos(r) D(ib) O(n):f Cur(p)']"
        date_picker_element = driver.find_element_by_xpath(date_picker_2)
        print("date_picker_element: ", date_picker_element)
        date_picker_element.click()
        try:
            print("I will be waiting for the date")
            my_dropdown = WebDriverWait(driver, 100).until(
                EC.presence_of_element_located((By.ID, 'dropdown-menu'))
            )
            print(my_dropdown)
            print("I am not waiting anymore")
        except TimeoutException as e:
            print("wait timed out")
            print(e)
    except WebDriverException:
        print("Something went wrong while trying to pick the max date")

if __name__ == '__main__':
    try:
        get_data()
    except:
        pass
    # finally:
    #     driver.quit()
2
I'm looking at your code now, but I will say that you can use requests to get the same data much more easily. If you check the URL after manually clicking "Max", you can just pass that URL to requests.get(). – goalie1998
@goalie1998 Nice trick, but the element isn't in the page source until you click the text, so that can't be done. That's why I am using Selenium. – sin0x1

2 Answers

1
votes

To click the Max button, first open the date-range dropdown, then wait for the button to become clickable and target it by its data-value attribute.

driver.get("https://finance.yahoo.com/quote/AAPL/history?p=AAPL")
wait = WebDriverWait(driver, 10)
wait.until(EC.element_to_be_clickable((By.XPATH, "//span[@class='C($linkColor) Fz(14px)']"))).click()
wait.until(EC.element_to_be_clickable((By.XPATH, "//button[@data-value='MAX']"))).click()

Element:

<button class="Py(5px) W(45px) Fz(s) C($tertiaryColor) Cur(p) Bd Bdc($seperatorColor) Bgc($lv4BgColor) Bdc($linkColor):h Bdrs(3px)" data-value="MAX"><span>Max</span></button>

Imports:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC
1
votes

You have the wrong XPath for date_picker_2. This one works:

date_picker_2 = '//*[@id="Col1-1-HistoricalDataTable-Proxy"]/section/div[1]/div[1]/div[1]/div/div/div/span'
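If you would rather keep a class-based locator than a long positional path, note that `@class='Pos(r) D(ib) O(n):f Cur(p)'` only matches when the attribute string is exactly that, including order and spacing. A minimal sketch of the usual XPath 1.0 idiom for matching individual class tokens follows. The helper name `xpath_with_classes` is made up for illustration, and whether the element on the live page really carries these classes is an assumption you would need to check in the browser.

```python
def xpath_with_classes(tag, *class_names):
    """Build an XPath for `tag` elements that carry every given class token.

    Hypothetical helper: it only builds the expression string and does not
    check that such an element exists on the page.
    """
    predicates = [
        # Pad the normalized class attribute with spaces so that each token
        # is matched whole, regardless of its position in the attribute.
        f'contains(concat(" ", normalize-space(@class), " "), " {name} ")'
        for name in class_names
    ]
    return f"//{tag}[" + " and ".join(predicates) + "]"


# Class names taken from the question; they may have changed on the live site.
date_picker = xpath_with_classes("div", "Pos(r)", "D(ib)", "O(n):f", "Cur(p)")
print(date_picker)
```

The resulting string can be passed to the same locator call the question already uses.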

Using requests:

import requests
import datetime

end = int(datetime.datetime.strptime(datetime.date.today().isoformat(), "%Y-%m-%d").timestamp())
url = f"https://finance.yahoo.com/quote/AAPL/history?period1=345427200&period2={end}&interval=1d&filter=history&frequency=1d&includeAdjustedClose=true"
response = requests.get(url)

This requests the same page you end up on after clicking "Max".
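The URL above hard-codes AAPL and its start timestamp (345427200 is midnight UTC on 1980-12-12). To reuse it for other symbols or date ranges, something like the sketch below might help. The function name `history_url` is hypothetical, the query parameters are simply copied from the URL above, and treating the dates as midnight UTC is an assumption. Yahoo may change these parameters at any time.

```python
from datetime import date, datetime, time, timezone
from urllib.parse import urlencode


def history_url(symbol, start, end):
    """Build a Yahoo Finance history URL for a date range (hypothetical helper)."""

    def to_epoch(day):
        # Midnight UTC of the given day, in whole seconds since the Unix epoch.
        return int(datetime.combine(day, time.min, tzinfo=timezone.utc).timestamp())

    # Parameter names and values mirror the URL shown above.
    query = urlencode({
        "period1": to_epoch(start),
        "period2": to_epoch(end),
        "interval": "1d",
        "filter": "history",
        "frequency": "1d",
        "includeAdjustedClose": "true",
    })
    return f"https://finance.yahoo.com/quote/{symbol}/history?{query}"


url = history_url("AAPL", date(1980, 12, 12), date.today())
print(url)
```

The returned string can then be passed to `requests.get()` as before.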