I am trying to use BeautifulSoup to build a scraper that will pull box scores off of www.basketball-reference.com. An example box score page would be this. The box score tables that I want are table tags whose id contains the word 'basic' (this distinguishes them from the advanced stats tables). I figured a function would be the best way to pick out this distinction. Html looks like this.
My code:
import requests
from bs4 import BeautifulSoup

r = requests.get('https://www.basketball-reference.com/boxscores/202003110ATL.html').content
soup = BeautifulSoup(r, 'lxml')

def get_boxscore_basic_table(tag):
    return ('basic' in tag.attrs['id']) and ('sortable' in tag.attrs['class'])

tables = soup.find_all(get_boxscore_basic_table)
This throws "KeyError: 'id'" and I am confused about how to fix it. I've checked the keys by grabbing just the first table using .find():
table = soup.find('table')
print(table.attrs)
And the key 'id' is there. Why does the lookup fail when find_all searches through the whole HTML, and how can I fix this?
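A likely cause is that find_all calls the filter function on every tag in the document, not only on table tags, and many of those tags have no id (or no class), so indexing tag.attrs['id'] raises KeyError. The sketch below shows one way around this, using tag.get() with a default. It runs on a small hypothetical HTML snippet (SAMPLE_HTML and its ids are made up for illustration, not copied from the real page) and uses the built-in 'html.parser' so that it does not depend on lxml. The function name is_basic_boxscore_table is also an illustrative choice.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page; the ids and classes are assumptions.
SAMPLE_HTML = """
<html><body>
  <div class="header">a tag with no id</div>
  <table id="box-ATL-game-basic" class="sortable stats_table"></table>
  <table id="box-ATL-game-advanced" class="sortable stats_table"></table>
  <table class="sortable"></table>
</body></html>
"""

def is_basic_boxscore_table(tag):
    # find_all passes every tag here, so missing attributes must be tolerated.
    # tag.get() returns the given default instead of raising KeyError.
    return (
        tag.name == "table"
        and "basic" in tag.get("id", "")
        and "sortable" in tag.get("class", [])
    )

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
tables = soup.find_all(is_basic_boxscore_table)
basic_ids = [t["id"] for t in tables]
print(basic_ids)
```

On the real page the same filter should work with soup built from the downloaded content, provided the tables are present in the HTML that requests receives. A CSS selector such as soup.select('table.sortable[id*="basic"]') may be a shorter alternative, assuming the installed BeautifulSoup version has selector support.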