Importing libraries:

import urllib2
from bs4 import BeautifulSoup

New libraries:

import csv
import requests 
import string

Defining variables:

i = 1
str_i = str(i)
seqPrefix = 'seq_'
seq_1 = str('https://anyaddress.com/')
quote_page = seqPrefix + str_i

#Then, make use of the Python urllib2 to get the HTML page of the url declared.

# query the website and return the html to the variable 'page'
page = urllib2.urlopen(quote_page)  


#Finally, parse the page into BeautifulSoup format so we can use BeautifulSoup to work on it.

# parse the html using beautiful soup and store in variable `soup`
soup = BeautifulSoup(page, 'html.parser')

Everything runs fine, except that the urlopen call fails with this error:

page = urllib2.urlopen(quote_page)
File "C:\Python27\lib\urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 423, in open
    protocol = req.get_type()
File "C:\Python27\lib\urllib2.py", line 285, in get_type
    raise ValueError, "unknown url type: %s" % self.__original
ValueError: unknown url type: seq_1

Why does this happen?

Thanks.

2 Answers

2
votes

You can look the variable up by name in the dictionary of local variables returned by vars():

page = urllib2.urlopen(vars()[quote_page])

As written, your code tried to open the string "seq_1" itself as the URL, not the value of the seq_1 variable, which is a valid URL.
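If you would rather not depend on vars(), a plain dictionary gives the same lookup by computed name. The following is only a minimal sketch: the names pages and resolve_url are hypothetical (they do not appear in the question), and no network request is made.

```python
# Hypothetical sketch: look a URL up by a computed key instead of by variable name.
seq_prefix = 'seq_'
i = 1

# An explicit dictionary, assumed here as a stand-in for separate
# seq_1, seq_2, ... variables.
pages = {
    'seq_1': 'https://anyaddress.com/',
}

def resolve_url(index):
    """Return the URL stored under the key 'seq_<index>'."""
    key = seq_prefix + str(index)  # just the string 'seq_1', not a URL
    return pages[key]              # the value stored under that key

quote_page = resolve_url(i)
print(quote_page)
```

The value of quote_page is now the address itself, so it can be passed to urlopen directly.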

1
votes

It looks like you need to concatenate seq_1 and str_i.

Example:

seq_1 = str('https://anyaddress.com/')
quote_page = seq_1 + str_i

Output:

https://anyaddress.com/1
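If the goal is to visit several numbered pages, the same concatenation can be applied in a loop. This is a rough sketch under that assumption: base_url and build_urls are hypothetical names, the address is the placeholder from the question, and nothing is fetched.

```python
# Hypothetical sketch: build one URL per page number by concatenation.
base_url = 'https://anyaddress.com/'  # placeholder address from the question

def build_urls(first, last):
    """Return the URLs for page numbers first..last, inclusive."""
    return [base_url + str(n) for n in range(first, last + 1)]

urls = build_urls(1, 3)
for url in urls:
    print(url)
```

Each entry in urls is a complete address string, so it can be passed to urlopen without any lookup by variable name.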