Update
I am now using this code:
from bs4 import BeautifulSoup
import requests

# Fetch the first page of search results
res = requests.get("https://www.ebay.co.uk/sch/i.html?_from=R40&_nkw=Playstation+1&_sacat=0&_pgn=1")
soup = BeautifulSoup(res.text, 'html.parser')
# Pair each title link with a price by position in the two result lists
for item, price in zip(soup.select('.lvtitle>a'), soup.select('.lvprice.prc >span')):
    print(item.text + " : " + price.text.strip())
It prints the product titles and prices in a clean, readable format, but the order differs from the order in which the items are displayed on eBay.
The first four lines the script prints are:
(1) SONY PLAYSTATION 1 PS1 CONSOLE / Tested Working & Controller / 3 FREE GAMES : £28.75
(2) Playstation 1 With Games Including Crash : £20.00
(3) Original Sony Playstation 1 Bundle : £29.99
(4) Sony Playstation 1 PS1 Console Bundle Joblot AV TV Lead : £26.99
But the first four items shown on eBay are:
(1) SONY PLAYSTATION 1 PS1 CONSOLE / Tested Working & Controller / 3 FREE GAMES £28.75
(2) Sony Playstation 1 PS1 Console Bundle Joblot AV TV Lead £26.99
(3) Sony Playstation 1 PS1 PSONE Console Bundle & TV AV Lead TESTED WORKING £29.99
(4) NEW LISTING Sony Playstation PS1 Console Boxed, 2 Controllers, 2 Memory Cards, Original Demo £44.99
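One experiment that might help here is to compare the two lists directly. The following is a minimal sketch using only the eight titles quoted above; the function name compare_listings is hypothetical, and the titles are copied by hand rather than scraped.

```python
# Titles copied from the script output and from the eBay page, as quoted in this post.
script_titles = [
    "SONY PLAYSTATION 1 PS1 CONSOLE / Tested Working & Controller / 3 FREE GAMES",
    "Playstation 1 With Games Including Crash",
    "Original Sony Playstation 1 Bundle",
    "Sony Playstation 1 PS1 Console Bundle Joblot AV TV Lead",
]
browser_titles = [
    "SONY PLAYSTATION 1 PS1 CONSOLE / Tested Working & Controller / 3 FREE GAMES",
    "Sony Playstation 1 PS1 Console Bundle Joblot AV TV Lead",
    "Sony Playstation 1 PS1 PSONE Console Bundle & TV AV Lead TESTED WORKING",
    "NEW LISTING Sony Playstation PS1 Console Boxed, 2 Controllers, 2 Memory Cards, Original Demo",
]


def compare_listings(script, browser):
    """Return titles unique to each list, plus shared titles whose position differs."""
    only_script = [t for t in script if t not in browser]
    only_browser = [t for t in browser if t not in script]
    # (title, 1-based position in script output, 1-based position in browser)
    moved = [
        (t, script.index(t) + 1, browser.index(t) + 1)
        for t in script
        if t in browser and script.index(t) != browser.index(t)
    ]
    return only_script, only_browser, moved


only_script, only_browser, moved = compare_listings(script_titles, browser_titles)
print("Only in script output:", only_script)
print("Only on eBay page:", only_browser)
print("Shared but moved:", moved)
```

On these four-item samples, two titles appear only in the script output and two only on the eBay page, which suggests the two lists differ in content and not just in order.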
Original Question
I want a web scraper to find the product names and prices for all 50 products on the page - https://ebay.co.uk/sch/i.html?_from=R40&_nkw=Playstation+1&_sacat=0&_pgn=1
I ran this code:
for post in soup.select("h3"):
    print(post)
Here is part of the output (the rest is omitted):
<h3 class="header">Please enable JavaScript </h3>
<h3>Format</h3>
<h3 class="lvtitle"><a class="vip" href="https://www.ebay.co.uk/itm/SONY-PLAYSTATION-1-PS1-CONSOLE-Tested-Working-Controller-3-FREE-GAMES/303195399469?_trkparms=ispr%3D1&hash=item4697daa52d:g:K4YAAOSwJmVZ3Ly2&enc=AQAEAAACMBPxNw%2BVj6nta7CKEs3N0qWwG%2FRu4GnzgljVwFYrAPzHjWoiQBIVRFaiPx%2BTZTxK4PBmFSLjHJych5RmooPO%2Fk9I2FqbhK%2BiSCw84S6G5mJqoWRKrmMjE24xQXLI5Tq6prSXt%2Fl5%2BXX5BIj4WcnTSRw8zPLA8umy3NNPbVTyoK8Ir4SgF685KWrEZByct3cX%2FNqc5BQAFj8A46XUhzSY5c6E7GenyGTc%2FEQDW5amzX8BGDa7T0srwIlbSRcuyfaQ%2B0SLD7yDUsYuTxD215mWHQ3jGZserqtWLuVuoXoidgYghdc%2F0t1zF8W%2BTfcz9BxPYvkonPcOijxgbVEK9QVdgsAWHkf0Xgbg%2Fy2bfe2AEykNv3gKXGeFt4HUHjWXFmokHvVMEi8x8W0NNos1x%2FEs%2FCWDq5oOKte%2F5eQ0UNX9mSQ%2BFdS5KVwemULfk807XdSPQ8Rt7fWuLyo1r7L8GGKuYDzb7F4UyzwI5Cl5x72C8%2FJuRTurvboTtjX8kZWYSf5WWRZlwXi1EL%2B6K2hE%2FzAKMcMZ8MGjisTFsR%2BWOimlOQeDKp4HFR3sJXEestKuiLVqeXmxoqaa9SWAzyZLvH0r5JUN6rnNSm9UExRp8PyErBnwBfHEVo2G%2F9PfiXtWn2R4GkAm%2FPHmoNI5dhtupubDkXxI9br7BwNkH9pWSquGHJuDAVoASmL0moQcpUugV4esefKd18ts8akZJ%2FF9GeAONB4ddDGNMu%2F210tqZBtccy44&checksum=30319539946988b1b8ad12ae4011b4e5140cdaa5677a" title="Click this link to access SONY PLAYSTATION 1 PS1 CONSOLE / Tested Working & Controller / 3 FREE GAMES">SONY PLAYSTATION 1 PS1 CONSOLE / Tested Working & Controller / 3 FREE GAMES</a>
</h3>
<h3 class="lvtitle"><a class="vip" href="https://www.ebay.co.uk/itm/Playstation-1-With-Games-Including-Crash/303320348335?hash=item469f4d36af:g:6y0AAOSwE91do1~a" title="Click this link to access Playstation 1 With Games Including Crash">Playstation 1 With Games Including Crash</a>
</h3>
<h3 class="lvtitle"><a class="vip" href="https://www.ebay.co.uk/itm/SONY-PLAYSTATION-1-PS1-CONSOLE-Tested-Working-Controller-3-FREE-GAMES/303195399469?hash=item4697daa52d:g:K4YAAOSwJmVZ3Ly2" title="Click this link to access SONY PLAYSTATION 1 PS1 CONSOLE / Tested Working & Controller / 3 FREE GAMES">SONY PLAYSTATION 1 PS1 CONSOLE / Tested Working & Controller / 3 FREE GAMES</a>
</h3>
This markup:
title="Click this link to access SONY PLAYSTATION 1 PS1 CONSOLE / Tested Working & Controller / 3 FREE GAMES">SONY PLAYSTATION 1 PS1 CONSOLE / Tested Working & Controller / 3 FREE GAMES</a>
</h3>
appears twice.
Each occurrence has a different href value, although both point to the same item number (303195399469). On eBay this item appears once, at the top of the list, so I need to change the code so that it keeps the first occurrence and discards the second. I don't know where to begin with this problem or what experiments I could run to investigate it.
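One possible approach is to treat the last path segment of each href as the item's identity, since both hrefs in the output above end their path in the same number (303195399469) and differ only in the query string. The sketch below assumes that pattern holds for every listing, which the output shown here does not prove. The helper names item_id and dedupe_by_item_id are hypothetical, and the long query strings are shortened for readability.

```python
from urllib.parse import urlsplit


def item_id(href):
    """Return the last path segment of a listing URL (the number in the sample hrefs)."""
    return urlsplit(href).path.rstrip("/").rsplit("/", 1)[-1]


def dedupe_by_item_id(links):
    """Keep the first (title, href) pair for each item ID, preserving page order."""
    seen = set()
    unique = []
    for title, href in links:
        key = item_id(href)
        if key not in seen:
            seen.add(key)
            unique.append((title, href))
    return unique


# (title, href) pairs in the order they appeared in the output; query strings shortened.
links = [
    ("SONY PLAYSTATION 1 PS1 CONSOLE / Tested Working & Controller / 3 FREE GAMES",
     "https://www.ebay.co.uk/itm/SONY-PLAYSTATION-1-PS1-CONSOLE-Tested-Working-Controller-3-FREE-GAMES/303195399469?_trkparms=ispr%3D1"),
    ("Playstation 1 With Games Including Crash",
     "https://www.ebay.co.uk/itm/Playstation-1-With-Games-Including-Crash/303320348335?hash=item469f4d36af:g:6y0AAOSwE91do1~a"),
    ("SONY PLAYSTATION 1 PS1 CONSOLE / Tested Working & Controller / 3 FREE GAMES",
     "https://www.ebay.co.uk/itm/SONY-PLAYSTATION-1-PS1-CONSOLE-Tested-Working-Controller-3-FREE-GAMES/303195399469?hash=item4697daa52d:g:K4YAAOSwJmVZ3Ly2"),
]

unique = dedupe_by_item_id(links)
for title, href in unique:
    print(item_id(href), title)
```

With Beautiful Soup, the (title, href) pairs could presumably be built from the a.text and a["href"] of each link matched by the '.lvtitle>a' selector used earlier.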
ebay-api? If you were using the API you wouldn't need to scrape. – Barmar
class="lvtitle" when I viewed the source of that URL. I wonder if eBay is returning something different to BS than it does to a browser. – Barmar