I have a problem with my code. There are two files, scraper and parser. There are very few imports in them, but I'm still getting an import error.

from parser import parse_price

ImportError: cannot import name parse_price

I suppose it is caused by a circular import, but I can't find one.

I run scraper.py through PyCharm (the file contains a line print get_price(...)).

Can you see the problem?

parser.py:

# -*- coding: utf-8 -*-

"""
Parse prices strings in various formats

e.g: $1,997.00

"""
import decimal
from exceptions import Exception
import re

_CURRENCIES = u'eur;czk;sk;kč'.split(';')
_SIGNS = u'$;€;£'.split(';')

CURRENCY_STRINGS = [x for x in _CURRENCIES]+[x.capitalize() for x in _CURRENCIES]+[x.upper() for x in _CURRENCIES]
CURRENCY_SIGNS = _SIGNS

REMOVE_STRINGS = CURRENCY_STRINGS + CURRENCY_SIGNS


class PriceParseException(Exception):
    pass

def parse_price(text):
    text = text.strip()
    # try:
    non_decimal = re.compile(r'[^\d,.]+')
    text = non_decimal.sub('', text).replace(',','.')
    if not len([x for x in text if x in '123456789']):
        raise Exception
    price = decimal.Decimal(text.strip())
    # except:
    #     raise
    #     raise PriceParseException()

    return price



from tld import get_tld,update_tld_names #TODO: from tld.utils import update_tld_names; update_tld_names() + TABLE/MODEL TLD_NAMES

def parse_site(url):
    try:
        return get_tld(url)
    except:
        return None

scraper.py:

import requests
from django.utils.datetime_safe import datetime
from lxml import etree
from exceptions import Exception

class ScanFailedException(Exception):
    pass

def _load_root(url):
    r = requests.get(url)
    r.encoding = 'utf-8'
    html = r.content
    return etree.fromstring(html, etree.HTMLParser())

def _get_price(url,xpath):
    root = _load_root(url)
    try:
        return root.xpath(xpath+'/text()')[0]
    except:
        raise ScanFailedException()

def get_price(url,xpath):
    response = {}
    root = _load_root(url)
    try:
        price = ''.join([x for x in root.xpath(xpath+'/text()') if any(y in x for y in '123456789')])
    except:
        response['error'] = 'ScanFailedException'
        return response

    try:
        from parser import parse_price #HERE IS THE PROBLEM
        decimal_price = parse_price(price)
        response['price']=str(decimal_price)
    except:
        raise
        response['error'] = 'PriceParseException'
        return response
    return response



print get_price('some_url',
          '//strong[@class="product-detail-price"]')

def scrape_product(product):
    from main_app.models import Scan
    for occ in product.occurences.all():
        try:
            from parser import parse_price
            url = occ.url
            xpath = occ.xpath
            raw_price = _get_price(url,xpath)

            price = parse_price(raw_price)
            Scan.objects.create(occurence=occ, datetime=datetime.now(), price=price)
        except:
            """Create invalid scan"""
            Scan.objects.create(occurence=occ,datetime=datetime.now(),price=None,valid=False)
Returning an object that contains error information is not Pythonic, just let the exception bubble up to somewhere you can handle it properly. Also it doesn't work, as you raise before it has a chance to return. - jonrsharpe
@Milano Which OS are you using? Did you try to change your file name to something else other than parser.py? - ettanany
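
A minimal sketch of what the first comment suggests, written for Python 3 and assuming the parsing logic from the question: parse_price raises PriceParseException and the caller decides how to handle it, instead of returning a dict with an 'error' key. The comma handling is kept as in the question, so a string such as $1,997.00 (thousands separator plus decimal point) is rejected here rather than parsed.

```python
import decimal
import re


class PriceParseException(Exception):
    """Raised when a string does not contain a parseable price."""


_NON_DECIMAL = re.compile(r'[^\d,.]+')


def parse_price(text):
    # Drop currency signs and names, then treat a comma as the decimal point
    # (same rule as in the question).
    cleaned = _NON_DECIMAL.sub('', text.strip()).replace(',', '.')
    # Like the original, reject input without any non-zero digit.
    if not any(ch in '123456789' for ch in cleaned):
        raise PriceParseException(text)
    try:
        return decimal.Decimal(cleaned)
    except decimal.InvalidOperation:
        raise PriceParseException(text)


# The caller handles the failure where it can react to it.
try:
    print(parse_price('199,90 Kč'))
except PriceParseException as exc:
    print('could not parse price:', exc)
```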

1 Answer

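You should not name your file parser.py, because parser is already the name of a module in Python's standard library, so from parser import parse_price can pick up that module instead of your file.

The following session shows what happens: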

>>> import parser
>>> 
>>> dir(parser)
['ASTType', 'ParserError', 'STType', '__copyright__', '__doc__', '__name__', '__package__', '__version__', '_pickler', 'ast2list', 'ast2tuple', 'compileast', 'compilest', 'expr', 'isexpr', 'issuite', 'sequence2ast', 'sequence2st', 'st2list', 'st2tuple', 'suite', 'tuple2ast', 'tuple2st']
>>> 
>>> 'parse_price' in dir(parser)
False

As you can see, Python's parser module has no parse_price, so the import fails:

>>> from parser import parse_price
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name parse_price
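
To check which module an import name would resolve to, something like the following may help. It is a sketch for Python 3 (importlib.util is not available in Python 2, which the session above uses); describe_module and the alternative file name price_parser are hypothetical names chosen for illustration. Note that the standard-library parser module was removed in Python 3.10, so on newer interpreters the name may resolve to your own file or to nothing at all.

```python
import importlib.util
import sys


def describe_module(name):
    """Return where `import name` would resolve, or None if nothing is found."""
    # Modules compiled into the interpreter are found before anything on sys.path.
    if name in sys.builtin_module_names:
        return "built-in"
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # For ordinary modules this is the path of the file that would be loaded.
    return spec.origin


for name in ("parser", "price_parser"):
    print(name, "->", describe_module(name))
```

If the name resolves to anything other than your own file, renaming the file (and updating the import to match) avoids the clash.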