6
votes

We have a simple task running with django-celery on Heroku. Something like:

import csv

from celery.task import task

@task
def simple_task():
    # origin is an open CSV file
    for line in csv.reader(origin):
        process_line(line)

def process_line(line):
    fields = parse_line(line)
    reg = Model1()  # Django model
    reg.field1 = fields[0]
    reg.field2 = fields[1]
    reg.field3 = fields[2]
    reg.save()

Here origin is a CSV file. When the file is big (more than 50,000 lines), the task uses up all available memory and produces R14 errors until the system kills it (at 150% of the 512 MB available). The memory is never released, and we have to restart the task manually.

Running on a Linux machine, or with foreman on the development machine, it completes with no problems (all 170,000 lines). It seems to leak memory ONLY on Heroku. By the way, we run with DEBUG=False.

Is something broken in Heroku's implementation of Celery tasks? Is there anything we might be missing? This has become a show-stopper for deploying on Heroku.

Any help would be highly appreciated.

2
Just a general debugging suggestion: my guess is that this is unrelated to either Django or Celery. To prove that, I would create a minimal Heroku app (without Django, just a plain "main") that does this, and try to run it. If it fails, look at your requirements.txt first and add debug prints later. If it succeeds, start gradually adding the rest of the stuff until you figure it out. Good luck! - Nitzan Shaked
Are you sure it isn't using large amounts of memory locally and you just haven't noticed? - JoshB

2 Answers

0
votes

I agree with JoshB that it seems to take more than 512MB of memory in your case.

  • What if you make process_line a task and queue one task per line, instead of a single task that handles the whole file? That way no single task should overload your memory on Heroku.

  • Another possible solution is Heroku's new service that lets your dynos use 1 GB of RAM. Link: 2x dynos beta
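
To make the first suggestion concrete, here is a rough sketch of splitting the file into small units of work. It groups rows into batches rather than queuing one task per line, which is a variation on the suggestion meant to cut queue overhead. The names iter_batches, enqueue_file and batch_size are made up for illustration; in a Celery setup the enqueue argument would presumably be the .delay method of a task that saves one batch.

```python
import csv
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield lists of at most batch_size rows, reading lazily."""
    rows = iter(rows)
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            return
        yield batch


def enqueue_file(fileobj, enqueue, batch_size=1000):
    """Pass each batch of CSV rows to enqueue; return the number of batches."""
    count = 0
    for batch in iter_batches(csv.reader(fileobj), batch_size):
        # With Celery, enqueue could be some_task.delay (hypothetical task).
        enqueue(batch)
        count += 1
    return count
```

Only one batch is held in memory at a time on the producer side, and each worker invocation handles a bounded number of rows.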

0
votes

Django leaks memory when DEBUG is set to True, because it keeps a copy of every SQL statement it has executed.

You can test locally using a virtual machine with the same specifications as your hosting, or use ulimit to limit the process memory. This way you can check whether your code works locally with only 512 MB of RAM.
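
If you would rather set the limit from Python than from the shell, the standard resource module can do roughly what ulimit -v does. This is only a sketch, assuming Linux: the helper names limit_memory and run_limited are made up here, and RLIMIT_AS caps virtual address space, which is not exactly the same measure Heroku uses for its R14 threshold.

```python
import resource
import subprocess

LIMIT_BYTES = 512 * 1024 * 1024  # 512 MB, the dyno size from the question


def limit_memory(max_bytes=LIMIT_BYTES):
    """Cap the address space of the current process (similar to ulimit -v)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY:
        # The soft limit cannot exceed the existing hard limit.
        max_bytes = min(max_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))


def run_limited(args, max_bytes=LIMIT_BYTES, **kwargs):
    """Run a command in a child process with the memory cap applied."""
    return subprocess.run(
        args, preexec_fn=lambda: limit_memory(max_bytes), **kwargs
    )
```

For example, run_limited([sys.executable, "your_script.py"]) (with a hypothetical script name) starts the script under the cap, so an allocation beyond it should fail with a MemoryError instead of silently growing.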