python - Extract src attribute from img tag using BeautifulSoup

Question

<div class="someClass">
    <a href="href">
        <img alt="some" src="some"/>
    </a>
</div>

I want to extract the src attribute from an img tag using BeautifulSoup. I am using bs4. I can get the href from the a tag, but a.attrs['src'] does not give me the src. What should I do?

Hi, your post is kinda hard to read -- add some punctuation and line-breaks. It would also be helpful to report the exact error message you receive and what you'd expect / want to happen. — patrick
Why would you expect a.attrs['src'] to work? There's no <a> tag with a src attribute in the snippet you've shown. — jwodder
this is also a completely different question than before & the headline makes no sense now. — patrick
@patrick I used regex to get the src. What's the other question? — iDelusion

Abu Shoeb · Accepted Answer · 2017-11-07T20:15:21

You can use BeautifulSoup to extract the src attribute of an HTML img tag. In my second example, htmlText contains the img tags directly, but the same approach works on a page fetched from a URL with urllib2 (urllib.request in Python 3).

For URLs

from urllib.request import urlopen
from bs4 import BeautifulSoup

page = urlopen('http://www.youtube.com/')
soup = BeautifulSoup(page, 'html.parser')
images = soup.find_all('img')
for image in images:
    # print the image source; .get() returns None if the attribute is missing
    print(image.get('src'))
    # print the alternate text
    print(image.get('alt'))

For text containing img tags

from bs4 import BeautifulSoup

# two well-formed img tags, each with its own src
htmlText = """<img src="https://src1.com/" /> <img src="https://src2.com/" />"""
soup = BeautifulSoup(htmlText, 'html.parser')
images = soup.find_all('img')
for image in images:
    print(image['src'])
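
For the snippet in the question

In the question's markup the src belongs to the img nested inside the a tag, not to the a tag itself, which is why a.attrs['src'] fails while href works. Here is a minimal sketch, assuming bs4 on Python 3 and the built-in html.parser; the variable names (link, img, sources) are only illustrative.

```python
from bs4 import BeautifulSoup

# Markup copied from the question.
html = """
<div class="someClass">
    <a href="href">
        <img alt="some" src="some"/>
    </a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# The <a> tag carries href; src lives on the <img> nested inside it.
link = soup.find("a")
href = link["href"]
img = link.find("img")
# .get() returns None instead of raising KeyError when the attribute is absent.
src = img.get("src") if img is not None else None

print(href, src)

# Equivalent lookup with a CSS selector, collecting every matching src.
sources = [tag["src"] for tag in soup.select("div.someClass a img[src]")]
print(sources)
```

The find() route is convenient when you already hold the a tag; the select() route is probably easier when you want every image under a given container in one pass.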
