Using Chrome and XPath in Python 3, I am trying to extract the value of an "href" attribute on this web page. The "href" attribute contains the link to the trailer ("bande-annonce" in French) of the movie I am interested in.
First, when I query the page with XPath, the "a" tag that Chrome shows turns out to be a "span" tag. Using this code:
from lxml import etree  # etree.HTMLParser is provided by lxml
htmlparser = etree.HTMLParser()
tree_main = etree.parse(response_main, htmlparser)
I get this result:
[<Element span at 0x111f70c08>]
So the "div" tag contains no "a" tag, just a "span" tag. I have read that the HTML displayed in a browser does not always reflect the "real" HTML sent by the server. So I tried this command to extract the href:
htmlparser = etree.HTMLParser()
tree_main = etree.parse(response_main, htmlparser)
Unfortunately, this returns nothing. And when I check the attributes of the "span" tag with this command:
I get the value of the "class" attribute, but nothing about "href":
['ACrL3ZACrpZGVvL3BsYXllcl9nZW5fY21lZGlhPTE5NTYwMDcyJmNmaWxtPTIzMTg3NC5odG1s meta-title-link']
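The attribute check described above can be reproduced in isolation. The sketch below is only an illustration: it uses the standard library's html.parser instead of lxml so that it runs without extra dependencies, and SAMPLE_HTML is a hypothetical fragment modeled on the output shown above, not the real page markup. It lists every start tag with its attributes, which makes it easy to see whether the raw HTML contains an "a" tag or an "href" at all.

```python
from html.parser import HTMLParser

# Hypothetical fragment modeled on the output above; the real page markup may differ.
SAMPLE_HTML = (
    '<div>'
    '<span class="ACrL3ZACrpZGVvL3BsYXllcl9nZW5fY21lZGlhPTE5NTYwMDcyJmNmaWxtPTIzMTg3NC5odG1s '
    'meta-title-link">Bande-annonce</span>'
    '</div>'
)


class AttributeCollector(HTMLParser):
    """Record (tag, attributes) for every start tag found in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        self.seen.append((tag, dict(attrs)))


collector = AttributeCollector()
collector.feed(SAMPLE_HTML)
collector.close()

# Print each tag with the names of its attributes
for tag, attrs in collector.seen:
    print(tag, sorted(attrs))

# Summaries of what the raw markup actually contains
has_anchor = any(tag == "a" for tag, _ in collector.seen)
span_attrs = [attrs for tag, attrs in collector.seen if tag == "span"]
```

On a fragment like this one, the "span" carries only a "class" attribute, which is consistent with the XPath query for "href" returning nothing.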
I'd like some help understanding what is happening here. Why is the "a" tag a "span" tag? And, most importantly for me, how can I extract the value of the "href" attribute?
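One possible reading of the "class" value, offered as a guess rather than a confirmed explanation: the first token looks like Base64 text with the marker "ACr" inserted into it, which a script in the browser could strip and decode to turn the "span" into a link. The sketch below tests that assumption on the value shown above. The function name decode_obfuscated_class and the marker parameter are hypothetical names chosen for illustration.

```python
import base64

# The "class" value returned by the attribute query above
CLASS_VALUE = (
    "ACrL3ZACrpZGVvL3BsYXllcl9nZW5fY21lZGlhPTE5NTYwMDcyJmNmaWxtPTIzMTg3NC5odG1s "
    "meta-title-link"
)


def decode_obfuscated_class(class_value, marker="ACr"):
    """Decode the first class token, assuming Base64 with `marker` inserted.

    The marker "ACr" is an assumption inferred from this single example.
    """
    token = class_value.split()[0]       # drop ordinary class names such as meta-title-link
    cleaned = token.replace(marker, "")  # remove every occurrence of the marker
    cleaned += "=" * (-len(cleaned) % 4)  # restore Base64 padding if it was stripped
    return base64.b64decode(cleaned).decode("utf-8")


print(decode_obfuscated_class(CLASS_VALUE))
```

For the value above, this prints /video/player_gen_cmedia=19560072&cfilm=231874.html, which looks like a site-relative path to the trailer page. Whether other links on the page use the same marker is not something this single example can confirm.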
Thanks a lot for your help!