I noticed something odd when working with BeautifulSoup and couldn't find any documentation to explain it, so I wanted to ask here.
Say we have tags like these that we have parsed with BS:
<td>Some Table Data</td>
<td></td>
The officially documented way to extract the data is soup.string. However, this returned None for the second <td> tag. So I tried soup.text (because why not?) and it returned an empty string, exactly as I wanted.
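Here is a minimal reproduction of what I am seeing. The wrapping table markup and the variable names are mine, added only for illustration, and I use the built-in "html.parser" so no extra parser is needed. I have included get_text() for comparison, since as far as I can tell .text gives the same result:

```python
from bs4 import BeautifulSoup

# Hypothetical markup wrapping the two cells from above
html = "<table><tr><td>Some Table Data</td><td></td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

cells = soup.find_all("td")

# .string yields None for the cell that has no children
strings = [td.string for td in cells]

# .text and get_text() both yield an empty string for the empty cell
texts = [td.text for td in cells]
got_texts = [td.get_text() for td in cells]

print(strings)
print(texts)
print(got_texts)
```

The second entry of strings is None, while the second entry of both texts and got_texts is an empty string.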
However, I couldn't find any reference to this in the documentation and am worried that something is amiss. Can anyone let me know whether this is acceptable to use, or will it cause problems later?
BTW I am scraping table data from a web page and mean to create CSVs from the data, so I do actually need empty strings rather than None values.
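For context, this is roughly what I am aiming for, as a sketch only. The markup is a stand-in for the real page, and I write to an in-memory buffer instead of a file to keep the example self-contained. It uses get_text(strip=True), which I am assuming is a safe equivalent of .text with surrounding whitespace removed:

```python
import csv
import io

from bs4 import BeautifulSoup

# Stand-in for the scraped page (hypothetical markup)
html = "<table><tr><td>Some Table Data</td><td></td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

buffer = io.StringIO()
writer = csv.writer(buffer)

# One CSV row per table row; empty cells become empty strings
for tr in soup.find_all("tr"):
    writer.writerow([td.get_text(strip=True) for td in tr.find_all("td")])

csv_text = buffer.getvalue()
print(repr(csv_text))
```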