My Goal
I need to collect the title, genre, description, type, and release year of every video game on every page.
total_games = 26,215
After the page with "start=9951", the next-page URL stops using "start=" and switches to "after=WzUuNSwidHQ4NjcxMDM2IiwxMDAwMV0%3D".
I was originally going to loop over pages = np.arange(1, total_games, 50), i.e. step from 1 to 26215 in increments of 50 entries, but then I ran into this problem.
HTML: <a href="/search/title/?title_type=video_game&sort=user_rating,desc&after=WzUuNSwidHQxODAxMDU0IiwxMDA1MV0%3D&ref_=adv_nxt" class="lister-page-next next-page">Next »</a>
How do I extract that portion of the href and add it to the base URL so I can loop through the pages?
Desired outcome:
"https://www.imdb.com/search/title/?title_type=video_game&sort=user_rating,desc&" + "after=WzUuNSwidHQ4NjcxMDM2IiwxMDAwMV0%3D" + "&ref_=adv_nxt"
The middle piece, "after=WzUuNSwidHQ4NjcxMDM2IiwxMDAwMV0%3D", is the part of the href that changes on every page. It is what I want to grab on each page in order to move on to the next one.
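To make the idea concrete, here is a minimal sketch of one possible approach, using only the Python standard library. Instead of cutting the "after=..." token out and gluing the URL back together, it follows the whole href of the "Next" link. It assumes the link still carries the "lister-page-next" class shown in the HTML above, which may not hold on the live site. The names NextLinkParser, find_next_url, after_token, and iter_pages are made up for illustration, and fetch is a hypothetical callable you supply that returns a page's HTML for a URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, parse_qs

BASE_URL = "https://www.imdb.com"

# The "Next" link copied from the question, used here as sample input.
SAMPLE_HTML = (
    '<a href="/search/title/?title_type=video_game&sort=user_rating,desc'
    '&after=WzUuNSwidHQxODAxMDU0IiwxMDA1MV0%3D&ref_=adv_nxt" '
    'class="lister-page-next next-page">Next »</a>'
)


class NextLinkParser(HTMLParser):
    """Remembers the href of the first <a> whose class list has 'lister-page-next'."""

    def __init__(self):
        super().__init__()
        self.next_href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "lister-page-next" in classes and self.next_href is None:
            self.next_href = attrs.get("href")


def find_next_url(html, base_url=BASE_URL):
    """Return the absolute URL of the next page, or None if there is no Next link."""
    parser = NextLinkParser()
    parser.feed(html)
    if parser.next_href is None:
        return None
    return urljoin(base_url, parser.next_href)


def after_token(url):
    """Return the decoded value of the 'after' query parameter, or None if absent."""
    values = parse_qs(urlparse(url).query).get("after")
    return values[0] if values else None


def iter_pages(start_url, fetch):
    """Yield (url, html) for each page, following Next links until none is found.

    `fetch` is a hypothetical callable: it takes a URL and returns that page's HTML.
    """
    url = start_url
    while url is not None:
        html = fetch(url)
        yield url, html
        url = find_next_url(html)


if __name__ == "__main__":
    next_url = find_next_url(SAMPLE_HTML)
    print(next_url)
    print(after_token(next_url))
```

With a real fetch function (for example one built on urllib.request or requests), the loop would look like "for url, html in iter_pages(first_url, fetch): ..." and would stop on its own at the last page, so total_games and the step of 50 are not needed. If only the token is wanted, after_token returns it with "%3D" already decoded to "=".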
Any solutions would be greatly appreciated!