0
votes

I am trying to import, into google sheets, the last quarter's research and development expense for a few thousand companies from their financial statements. While I want to import several different elements from financial statements, the last quarter R&D expense is currently pertinent (and potentially the previous 3 quarters).

I have tried several different sites (yahoo finance, bloomberg, etc) but the simplest URL seems to be from stockrow.com because I can simply automate the substitution of the stock ticker in the URL.

To get the xpath, I inspect the element and copy the xpath using the browser (have tried with Chrome and Firefox).

I am using IMPORTXML on googlesheets and, on my last attempt, used the following input: =IMPORTXML("https://stockrow.com/JNJ/financials/income/quarterly","/html/body/div[1]/div/div/section/div/div[2]/div[1]/section[4]/div/div[3]/div/div/div[3]/div/div/div[11]/div/span")

I have attempted all sorts of combinations of sites, browsers, and xpaths related to the element, but no matter what I do, I always get the same error "Imported content is empty."

I read xpath google sheet importxml but can't make sense of what is happening in the change to the xpath or how to solve this particular challenge.

Because I want this to be repeatable across multiple stock tickers in google sheets, I am hoping that the "location" of the R&D expense (and other elements in the financial statement) are consistent across all pages, rather than just a specific solution to this challenge.

Looking forward to receiving guidance. Thanks!!

1
That page loads information dynamically using javascript and Sheets can't handle that. You'll have to use something like Selenium. Also, did you try EDGAR? - Jack Fleeting
I tried EDGAR but the data is pretty deep in the page and I would not even know where to begin to scrape that information programatically - ACeS
J&J are xbrl filers so that may help you. Even for non-xbrl filers, scraping financial statements through EDGAR is easier, in some respects, than with these other sites. You should look into that if this isn't a one-time project for you. - Jack Fleeting
thanks @JackFleeting! this is a one time project, but i will look into how to extract EDGAR data - ACeS

1 Answers

0
votes

you need some other source. Google Sheets does not support the scraping of JavaScript elements. you can test JS dependency simply by disabling JS for a given site and what's left can be scraped. in your case its nothing:

0