
I have a list of links, and each link has a corresponding id in the Id list.

How can I change the code so that, while iterating over the links, the corresponding id is substituted into the string (the id='' argument of find_all in the code below)?

All code is below:

import pandas as pd
from bs4 import BeautifulSoup
import requests

HEADERS = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                         'Chrome/81.0.4044.138 Safari/537.36 OPR/68.0.3618.125', 'accept': '*/*'}
links = ['https://www..ie', 'https://www..ch', 'https://www..com']
Id = ['164240372761e5178f0488d', '164240372661e5178e1b377', '164240365661e517481a1e6']

def get_html(url, params=None):
    r = requests.get(url, headers=HEADERS, params=params)
    return r  # parse() needs the response object to read .text

def get_data_no_products(html):
    data = []
    soup = BeautifulSoup(html, 'html.parser')
    items = soup.find_all('div', id='')  # How do I substitute each id here while iterating?

    for item in items:
        data.append({'pn': item.find('a').get('href')})

    return print(data)

def parse():
    for i in links:
        html = get_html(i)
        get_data_no_products(html.text)
parse()
Ctrl-V? What's the problem? Do you want to find divs matching any of those ids? What's stopping you? - 2e0byo
Yes, I can do Ctrl-V :) but this is only an example. I have ~1k ids, so I need to automate this process. - Aleksandr
I missed the word 'corresponding', so my answer (deleted) was irrelevant. You just need zip. - 2e0byo

1 Answer


Parametrise your code:

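def get_data_no_products(html, id_):
    data = []
    soup = BeautifulSoup(html, 'html.parser')
    items = soup.find_all('div', id=id_)  # id_ is now supplied by the caller
    for item in items:
        data.append({'pn': item.find('a').get('href')})
    return data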

And then use zip():

for link, id_ in zip(links, Id):
    html = get_html(link)  # fetch the page first; the parser needs HTML, not the URL
    get_data_no_products(html.text, id_)

Note that there's a likely bug in your code: you return print(data), which will always be None because print() itself returns None. You likely just want to return data.
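
Putting the parametrised function and the zip() loop together, a minimal sketch might look like this. To keep it runnable without network access it parses inline HTML snippets rather than calling requests.get; the pages list and the hrefs in it are made up purely for illustration, and the extra check for a missing <a> is my own addition.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-ins for the downloaded pages (illustration only).
pages = [
    '<div id="164240372761e5178f0488d"><a href="/item/1">one</a></div>',
    '<div id="164240372661e5178e1b377"><a href="/item/2">two</a></div>',
]
Id = ['164240372761e5178f0488d', '164240372661e5178e1b377']


def get_data_no_products(html, id_):
    """Return the href of the first link inside each <div> whose id equals id_."""
    soup = BeautifulSoup(html, 'html.parser')
    data = []
    for item in soup.find_all('div', id=id_):
        link = item.find('a')
        if link is not None:  # skip divs that contain no <a> at all
            data.append({'pn': link.get('href')})
    return data  # return the list itself, not print(data)


results = []
for html, id_ in zip(pages, Id):  # each page is paired with its own id
    results.extend(get_data_no_products(html, id_))

print(results)
```

In the real script you would replace pages with the text of each downloaded response and keep the rest as is.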

PS

There is another solution to this which you will frequently encounter from people who are new to Python:

for i in range(len(links)):
    link = links[i]
    id_ = Id[i]
    ...

This... works. It might even feel easier or more natural if you are coming from e.g. C. (Then again, I'd likely use pointers...) Style is very much personal, but if you're going to write in a high-level language like Python, you might as well avoid thinking about things like 'the index of the current item' as much as possible. Just my £0.02.