I am using a function (movies_from_url) to read a total of 256 movies from a webpage. Each page contains 50 movies, so I have to read the first 6 pages (5 pages for 250 movies and the 6th page for the remaining 6 movies).
First URL:
http://www.imdb.com/search/title?at=0&sort=user_rating&start=1&title_type=feature&year=2005,2014
Here is my vague idea:
def read_m_by_rating(first_year=2005, last_year=2015, top_number=256):
    current_index = 1  # current index is the start number of a webpage
    final_list = []
    for _ in xrange(6):
        # current_index still has to be substituted into start= here
        url = "http://www.imdb.com/search/title?at=0&sort=user_rating&start=current_index&title_type=feature&year=2005,2014"
        if current_index + 50 > top_number:
            # last page: read only the remaining movies
            lis = movies_from_url(url, top_number - current_index + 1)
        else:
            lis = movies_from_url(url, 50)
        final_list.append(lis)
        current_index += 50
    return final_list
How do I use the for loop here to create the url?
Set start like this:
for o in xrange(20): a_url = "http://url.com/?bla=23&start=" + str(o) + "&blabla=32"
and then use a_url. – ForceBru
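
Applied to the IMDb URL from the question, the suggestion might look roughly like the sketch below. It is written for Python 3 (range instead of xrange) and builds the URLs without fetching anything. The names URL_TEMPLATE, PAGE_SIZE and page_requests are hypothetical and not part of the original post; the page size of 50 and the year range 2005,2014 are taken from the question.

```python
# Hypothetical template: the URL from the question with start= and the years
# turned into placeholders.
URL_TEMPLATE = (
    "http://www.imdb.com/search/title?at=0&sort=user_rating"
    "&start={start}&title_type=feature&year={first_year},{last_year}"
)
PAGE_SIZE = 50  # movies per page, as stated in the question


def page_requests(top_number=256, first_year=2005, last_year=2014):
    """Return a list of (url, count) pairs covering the first top_number movies."""
    requests = []
    # start takes the values 1, 51, 101, ... up to the last page that is needed
    for start in range(1, top_number + 1, PAGE_SIZE):
        # a full page, or only the remainder on the last page
        count = min(PAGE_SIZE, top_number - start + 1)
        url = URL_TEMPLATE.format(
            start=start, first_year=first_year, last_year=last_year
        )
        requests.append((url, count))
    return requests


if __name__ == "__main__":
    for url, count in page_requests():
        print(count, url)
```

Each (url, count) pair could then be passed to movies_from_url(url, count), assuming that function takes the arguments shown in the question. If it returns a list, final_list.extend(...) rather than append would keep the result as one flat list of movies.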