0
votes

I am trying to import option price from Yahoo Finance into my Google sheet using ImportXML but the actual values for some of the prices received in Google Sheet is very different than what I can see on Yahoo Finance website. I even tried ImportHTML and the result is the same.

Formula used in Google Sheet:

=TRANSPOSE(IMPORTXML( "https://finance.yahoo.com/quote/KL220121C00045000?p=KL220121C00045000" ,"//tr"))

Here's the result in Google Sheet (all red cells are the values that are different): enter image description here

Actual values on Yahoo Finance page: enter image description here

I am totally clueless why this is happening and how to solve it.

1
I thought that this thread might be useful for your situation. stackoverflow.com/q/64437503 Now, when I put =SAMPLE("https://finance.yahoo.com/quote/KL220121C00045000?p=KL220121C00045000") to a cell using this sample script, I could confirm that the value of the site could be retrieved.Tanaike
thanks for the suggestion. That workaround actually works. So, I am at least unblocked. but it is getting under my skin not knowing how the above is giving a completely different (very different) returned values. hopefully, we'll get some clarity on that too.reza
after trying the workaround for a few times, I do see some returned values still being different than the Yahoo webpage. However, this time most of the values are exactly the same and very few numbers are off (and even for those, the difference is not like 200%).reza
Thank you for the additional information.Tanaike

1 Answers

1
votes

@Tanaike's link to ImportXML not producing correct values answers how to workaround the issue using App Scripts.


To answer the question of "why", I believe Yahoo Finance has implemented some sort of user agent detection, such that requests from Google Spreadsheets, or more specifically requests with the user agent Mozilla/5.0 (compatible; GoogleDocs; apps-spreadsheets; +http://docs.google.com) will be served a different (I believe older) version of the data.

When I visit the link https://finance.yahoo.com/quote/KL220121C00045000?p=KL220121C00045000 in the browser, it currently shows

As of 3:17PM EST. Market open

But when I change my user agent to mimic Google sheets, I get

As of 10:43AM EST. Market open.

Which is the same result as IMPORTXML.

I am guessing they implemented this either to reduce fetching from automated spreadsheets, or to discourage people from scraping their sites using Google Sheets.