How to scrape an element when other elements share the same tag name and class name but hold different data in BeautifulSoup4?

Question

I want to scrape the salary of a job, but many elements unrelated to the salary have the same tag name and class names. How can I scrape it with beautifulsoup4, or must I switch to another web scraping library such as Selenium? I think the XPath would also be the same for these elements. How can I scrape only the salary, without the other elements for the skills and description?

html = '''
<div class="the-same-div">
    <span class="header-span">Salary</span>
    <span class="key-span">
        <span class="css-8888">1000 Dollar</span>
    </span>
</div>
<div class="the-same-div">
    <span class="header-span">Skills</span>
    <span class="key-span">
        <span class="css-8888">Web scraping</span>
    </span>
</div>
<div class="the-same-div">
    <span class="header-span">Description</span>
    <span class="key-span">
        <span class="css-8888">This is a web scraping Job with good salary</span>
    </span>
</div>'''

This is the Python code I have so far to scrape the salary element:

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "lxml")

salary = soup.find_all("span", {"class": "css-8888"})

How can I scrape only the salary of this job? Thank you.

SlLoWre · Accepted Answer · 2021-06-15T13:44:18

I am not sure that Selenium is a good choice for such a task; its main purpose is a little different (browser automation). To get all the salaries, I would do it the following way:

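from bs4 import BeautifulSoup as bs

# Read the saved page ("test.html" is assumed to contain the HTML from the question)
with open("test.html", "r") as html_file:
    soup = bs(html_file.read(), "lxml")

# Every block (salary, skills, description) shares this wrapper div
same_div_list = soup.find_all("div", {"class": "the-same-div"})
jobs_salary_list = []

for div in same_div_list:
    header = div.find("span", {"class": "header-span"})
    # Keep only the blocks whose header reads "Salary"
    if header is not None and header.text.strip() == "Salary":
        jobs_salary_list.append(div.find("span", {"class": "css-8888"}).text)
print(jobs_salary_list)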

Basically, bs4 lets you search locally, that is, inside another element rather than the whole document. So first you get all the "the-same-div" divs, then iterate over them and look at each "header-span" value; if it equals "Salary", you take the text of the "css-8888" span inside that same div.
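The same local-search idea can also be sketched without an explicit filter loop. This is only an illustration, not the answerer's code: the function names find_salary and extract_fields are made up for this example, the HTML is inlined from the question instead of being read from a file, and it assumes a bs4 version that supports the string= argument (4.4.0 or later). The built-in "html.parser" is used so that lxml is not required.

```python
from bs4 import BeautifulSoup

# HTML copied from the question
HTML = '''
<div class="the-same-div">
    <span class="header-span">Salary</span>
    <span class="key-span">
        <span class="css-8888">1000 Dollar</span>
    </span>
</div>
<div class="the-same-div">
    <span class="header-span">Skills</span>
    <span class="key-span">
        <span class="css-8888">Web scraping</span>
    </span>
</div>
<div class="the-same-div">
    <span class="header-span">Description</span>
    <span class="key-span">
        <span class="css-8888">This is a web scraping Job with good salary</span>
    </span>
</div>'''


def find_salary(soup):
    """Hypothetical helper: locate the "Salary" header, then search its own div."""
    # string= matches a tag whose only child is exactly this text
    header = soup.find("span", class_="header-span", string="Salary")
    if header is None:
        return None
    # Climb to the wrapper div so the search stays inside this one block
    block = header.find_parent("div", class_="the-same-div")
    if block is None:
        return None
    value = block.find("span", class_="css-8888")
    return value.get_text(strip=True) if value else None


def extract_fields(soup):
    """Hypothetical helper: map every header text to the value in the same div."""
    fields = {}
    for div in soup.find_all("div", class_="the-same-div"):
        header = div.find("span", class_="header-span")
        value = div.find("span", class_="css-8888")
        if header and value:
            fields[header.get_text(strip=True)] = value.get_text(strip=True)
    return fields


soup = BeautifulSoup(HTML, "html.parser")
salary = find_salary(soup)
fields = extract_fields(soup)
print(salary)
print(fields)
```

Building the dictionary once may be handy if the skills or description are needed later as well, since each value is then looked up by its header text instead of by the shared class name.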
