1
votes

I'm struggling with an XPath query inside of a Google Sheets IMPORTXML function. I'm trying to get a table from this site. Using only //tr apparently gives me all table rows from the site. Since I want a specific table I need the table id to drill down the results.

The table id (and class) is as follows:



<table class="leaguetable sortable table detailed-table fixed-wide-table" id="page_competition_1_block_competition_tables_7_block_competition_form_table_1_table">


See my current function and query here.



=IMPORTXML("https://us.soccerway.com/national/england/premier-league/20182019/regular-season/r48730/","//table[@id='page_competition_1_block_competition_tables_7_block_competition_form_table_1_table']/tr")


Using this formula in the spreadsheet gives me #N/A and the error message Imported content is empty.

I tested the query using a Chrome extension and it gave me the desired results. But unfortunately, the table does not show up in my spreadsheet.

1
any reason why not use IMPORTHTML ? - player0
Thanks for your answer! Yes, there's a reason! I tried IMPORTHTML in the first place but I didn't get the desired table. I tried table 1, 2, 3, 4 - but it didn't give me the table I want. Then I went for 5 and I got a #N/A. The table is a bit "hidden", when you visit the page you have to click the button "Form" in the Tables section. The new table will load but the URL will not change. So IMPORTXML was my next option. - Daniel

1 Answers

0
votes

only these tables you can scrape:

=IMPORTHTML("https://us.soccerway.com/national/england/premier-league/20182019/regular-season/r48730/", 
 "TABLE", 2)

anything else is out of limits because it's controlled by JavaScript. you can check so if you click on the lock and enter settings and disable JS and reload the page. then you can observe that clicking on Form button does nothing - eg. can't be scraped.