2
votes

I need to run some asynchronous tasks in a Django app, and I started to look into Google Cloud Tasks. I think I have followed all the instructions - and every possible variation I could think of, without success so far.

The problem is that all created tasks go to the queue, but fail to execute. The console and the logs report only a http code 301 (permanent redirection). For the sake of simplicity, I deployed the same code to two services of an App Engine (standard), and routed the tasks request to only one of them.

It looks like the code itself is working fine. When I go to "https://[proj].appspot.com/api/v1/tasks", the routine executes nicely and there's no redirection according to DevTools/Network. When Cloud Tasks try to call "/api/v1/tasks", it fails every time.

If anyone could take a look at the code below and point out what may be causing this failure, I'd appreciate very much.

Thank you.

#--------------------------------
# [proj]/.../urls.py
#--------------------------------
from [proj].api import tasks

urlpatterns += [
    # tasks api
    path('api/v1/tasks', tasks, name='tasks'),
]
#--------------------------------
#   [proj]/api.py:
#--------------------------------
from django.views.decorators.csrf import csrf_exempt

@csrf_exempt
def tasks(request):
    print('Start api')
    payload = request.body.decode("utf-8")
    print (payload)
    print('End api')
    return HttpResponse('OK')

#--------------------------------
# [proj]/views/manut.py
#--------------------------------
from django.views.generic import View
from django.shortcuts import redirect
from [proj].tasks import TasksCreate

class ManutView(View):
    template_name = '[proj]/manut.html'

    def post(self, request, *args, **kwargs):
        relative_url = '/api/v1/tasks'
        testa_task = TasksCreate()
        resp = testa_task.send_task(
            url=relative_url,
            schedule_time=5,
            payload={'task_type': 1, 'id': 21}
        )
        print(resp)
        return redirect(request.META['HTTP_REFERER'])

#--------------------------------
# [proj]/tasks/tasks.py:
#--------------------------------
from django.conf import settings
from google.cloud import tasks_v2
from google.protobuf import timestamp_pb2
from typing import Dict, Optional, Union
import json
import time

class TasksCreate:

    def send_task(self,
        url: str,
        payload: Optional[Union[str, Dict]] = None,
        schedule_time: Optional[int] = None,    # in seconds
        name: Optional[str] = None,
        ) -> None:

        client = tasks_v2.CloudTasksClient()
        parent = client.queue_path(
            settings.GCP_PROJECT,
            settings.GCP_LOCATION,
            settings.GCP_QUEUE,
        )

        # App Engine task:
        task = {
            'app_engine_http_request': {  # Specify the type of request.
                'http_method': 'POST',
                'relative_uri': url,
                'app_engine_routing': {'service': 'tasks'}
            }
        }

        if name:
            task['name'] = name
        if isinstance(payload, dict):
            payload = json.dumps(payload)
        if payload is not None:
            converted_payload = payload.encode()
            # task['http_request']['body'] = converted_payload
            task['app_engine_http_request']['body'] = converted_payload
        if schedule_time is not None:
            now = time.time() + schedule_time
            seconds = int(now)
            nanos = int((now - seconds) * 10 ** 9)
            # Create Timestamp protobuf.
            timestamp = timestamp_pb2.Timestamp(seconds=seconds, nanos=nanos)
            # Add the timestamp to the tasks.
            task['schedule_time'] = timestamp

        resp = client.create_task(parent, task)

        return resp

# --------------------------------
# [proj]/dispatch.yaml:
# --------------------------------
dispatch:
  - url: "*/api/v1/tasks"
    service: tasks

  - url: "*/api/v1/tasks/"
    service: tasks

  - url: "*appspot.com/*"
    service: default

#--------------------------------
# [proj]/app.yaml & tasks.yaml:
#--------------------------------
runtime: python37

instance_class: F1

automatic_scaling:
  max_instances: 2

service: default

#handlers:
#- url: .*
#  secure: always
#  redirect_http_response_code: 301
#  script: auto

entrypoint: gunicorn -b :$PORT --chdir src server.wsgi

env_variables:
...

UPDATE:

Here are the logs for an execution:

{
 insertId: "1lfs38fa9"  
 jsonPayload: {
  @type: "type.googleapis.com/google.cloud.tasks.logging.v1.TaskActivityLog"   
  attemptResponseLog: {
   attemptDuration: "0.008005s"    
   dispatchCount: "5"    
   maxAttempts: 0    
   responseCount: "5"    
   retryTime: "2020-03-09T21:50:33.557783Z"    
   scheduleTime: "2020-03-09T21:50:23.548409Z"    
   status: "UNAVAILABLE"    
   targetAddress: "POST /api/v1/tasks"    
   targetType: "APP_ENGINE_HTTP"    
  }
  task: "projects/[proj]/locations/us-central1/queues/tectaq/tasks/09687434589619534431"   
 }
 logName: "projects/[proj]/logs/cloudtasks.googleapis.com%2Ftask_operations_log"  
 receiveTimestamp: "2020-03-09T21:50:24.375681687Z"  
 resource: {
  labels: {
   project_id: "[proj]"    
   queue_id: "tectaq"    
   target_type: "APP_ENGINE_HTTP"    
  }
  type: "cloud_tasks_queue"   
 }
 severity: "ERROR"  
 timestamp: "2020-03-09T21:50:23.557842532Z"  
}
5
I'm not sure what's the purpose of the file views/manut.py as it seems to be not included anywhere. Cloud Tasks Queues have logging disabled by default, so try to enable it for the queue which is dispatching your tasks, maybe it will reveal some info there: gcloud beta tasks queues update <tasks-queue-name> --log-sampling-ratio=1.0 - yedpodtrzitko
Thanks for tip on logging, @yedpodtrzitko. I enabled it, but things didn't get much better, because the err "unavailable" makes no sense to me. - JCampos
Can you remove the rule *appspot.com/* from your dispatch.yaml, just to be sure it's not causing any issues please? Everything is routed to default service by default anyway, so this should be unnecessary there... - yedpodtrzitko
Can you add a simplified example to reproduce? - Emil Gi
I removed *appspot.com/* from dispatch.yaml, but nothong changed. I'll try to make a very simple worker, so the problem can be reproduced, thanks. - JCampos

5 Answers

1
votes

At last I could make Cloud Tasks work, but only using http_request type (with absolute url). There was no way I could make the tasks run when they were defined as app_engine_http_request (relative url).

I had already tried the http_request type with POST, but that was before I exempted the api function from have the csrf token previously checked, and that was causing an error Forbidden (Referer checking failed - no Referer.): /api/v1/tasks, which I failed to connect to the csrf omission.

If someone stumble across this issue in the future, and find out a way to make app_engine_http_request work on Cloud Tasks with Django, I'd still like very much to know the solution.

1
votes

The problem is that App Engine task handlers do not follow redirects, so you have to find out why the request is being redirected and make an exception for App Engine requests. In my case I was redirecting http to https and had to make an exception like so: (Node Express)

app.use((req, res, next) => {

   const protocol = req.headers['x-forwarded-proto']
   const userAgent = req.headers['user-agent']

   if (userAgent && userAgent.includes('AppEngine-Google')) {
      console.log('USER AGENT IS GAE, SKIPPING REDIRECT TO HTTPS.')
      return next()
   } else if (protocol === 'http') {
      res.redirect(301, `https://${req.headers.host}${req.url}`)
   } else {
      next()
   }

})
0
votes

The problem is that all created tasks go to the queue, but fail to execute. The console and the logs report only a http code 301 (permanent redirection).

Maybe the request handler for your task endpoint wants a trailing slash.

Try changing this:

class ManutView(View):
    template_name = '[proj]/manut.html'

    def post(self, request, *args, **kwargs):
        relative_url = '/api/v1/tasks'
        ...

to this:

class ManutView(View):
    template_name = '[proj]/manut.html'

    def post(self, request, *args, **kwargs):
        relative_url = '/api/v1/tasks/'
        ...

Also just try hitting the task url yourself and see if you can get a task to run from curl

0
votes

If someone stumble across this issue in the future, and find out a way to make app_engine_http_request work on Cloud Tasks with Django, I'd still like very much to know the solution.

@JCampos I manage to make it work on my Django app (I use in addition DRF but I do no think it causes a big difference).

from google.cloud import tasks_v2
from google.protobuf import timestamp_pb2
import datetime

class CloudTasksMixin:
@property
def _cloud_task_client(self):
        return tasks_v2.CloudTasksClient()

def send_to_cloud_tasks(self, url, http_method='POST', payload=None,in_seconds=None, name=None):
    """ Send task to be executed """
    parent = self._cloud_task_client.queue_path(settings.TASKS['PROJECT_NAME'], settings.TASKS['QUEUE_REGION'], queue=settings.TASKS['QUEUE_NAME'])

    task = {
        'app_engine_http_request': {
            'http_method': http_method,
            'relative_uri': url
        }
    }

    ...

And then I use a view like this one:

class CloudTaskView(views.APIView):
    authentication_classes = []

    def post(self, request, *args, **kwargs):
        # Do your stuff
        return Response()

Finally I implement this url in the urls.py (from DRF) with csrf_exempt(CloudTaskView.as_view())

At first I had 403 error, but thanks to you and your comment with csrf_exempt, it is now working.

0
votes

It seems that Cloud Tasks calls App Engine using a HTTP url (that's ok because probably they are in the same network), but if you are using HTTPs, Django should be redirecting (http -> https) any request that's being received, including your handler endpoint.

To solve this, you should tell Django to not redirect your handler.
You can use settings.SECURE_REDIRECT_EXEMPT for it.

For instance:

SECURE_REDIRECT_EXEMPT = [r"^api/v1/tasks/$"]