I have a Cloud Function that starts a Compute Engine instance. However, when multiple functions are triggered, the previous actions running on the Compute Engine instance are interrupted by the incoming functions/commands. The Compute Engine instance is running a PyTorch implementation... Is there a way to send the incoming functions to a queue, so that the action currently running on Compute Engine completes before the next one is picked up and the machine shuts down? Any conceptual guidance would be greatly appreciated.

EDIT

My function is triggered on changes to a Storage bucket (uploads). In the function, I start a GCE instance and customize its startup behavior with a startup script as follows (some commands and directories are simplified for brevity):

import os
from googleapiclient.discovery import build


def start(event, context):
    file = event
    print(file["id"])

    # The object id has the form "bucket/userId/paymentId/filename/generation",
    # so splitting on '/' yields the path components.
    parts = file["id"].split('/')
    userId = parts[1]
    paymentId = parts[2]
    name = parts[3]

    print(name)

    if name == "uploadcomplete.txt":
        startup_script = """#! /bin/bash
                cd ~ && pwd 1>>/var/log/log.out 2>&1
                PATH=$PATH:/usr/local/cuda
                cd program_directory 1>>/var/log/log.out 2>&1
                source /opt/anaconda3/etc/profile.d/conda.sh 1>>/var/log/log.out 2>&1
                conda activate env
                cd keras-retinanet/ 1>>/var/log/log.out 2>&1
                export PYTHONPATH=`pwd` 1>>/var/log/log.out 2>&1
                cd tracker 1>>/var/log/log.out 2>&1
                python program_name --gcs_input_path gs://input/{userId}/{paymentId} --gcs_output_path gs://output/{userId}/{paymentId} 1>>/var/log/log.out 2>&1
                sudo python3 gcs_to_mongo.py {userId} {paymentId} 1>>/var/log/log.out 2>&1
                sudo shutdown -P now
                """.format(userId=userId, paymentId=paymentId)
                
        service = build('compute', 'v1', cache_discovery=False)
        print('VM Instance starting')
        project = 'XXXX'
        zone = 'us-east1-c'
        instance = 'YYYY'

        # setMetadata requires the current metadata fingerprint, so fetch it first.
        metadata = service.instances().get(project=project, zone=zone, instance=instance)
        metares = metadata.execute()
        print(metares)

        fingerprint = metares["metadata"]["fingerprint"]
        print(fingerprint)
        bodydata = {"fingerprint": fingerprint,
                    "items": [{"key": "startup-script", "value": startup_script}]}
        print(bodydata)

        # Write the per-job startup script, then start the instance.
        meta = service.instances().setMetadata(project=project, zone=zone, instance=instance,
                                               body=bodydata)
        res = meta.execute()
        instanceget = service.instances().get(project=project, zone=zone, instance=instance).execute()
        request = service.instances().start(project=project, zone=zone, instance=instance)
        response = request.execute()
        print('VM Instance started')
        print(instanceget)
        print("New Metadata:", instanceget['metadata'])

The problem occurs when multiple batches are uploaded to Cloud Storage. Each new function will restart the GCE instance with a new startup script and begin work on the new data, leaving the previous data unfinished.

Comments:

- Cloud Tasks indeed meets your requirement: it lets you separate out pieces of work that can be performed independently, outside of your main application flow, and send them off to be processed asynchronously, using handlers that you create. Let me know if you have more questions about adapting Cloud Tasks into your current system structure. – Shawn Di Wu
- How long does your PyTorch job take on the Compute Engine instance? – guillaume blaquiere
- @ShawnDiWu That sounds like exactly what I need! The link talks more about integrating Cloud Tasks with App Engine than with Compute Engine, but if I can integrate it with Compute Engine, then I will go with it. – Khari Kisile
- Is the newly created GCE instance the same configuration, or is there any difference? – Shawn Di Wu

1 Answer

Cloud Tasks indeed meets your requirement: it lets you separate out pieces of work that can be performed independently, outside of your main application flow, and send them off to be processed asynchronously, using handlers that you create. Let me know if you have more questions about adapting Cloud Tasks into your current system structure.

Since your GCE instance is started by a Cloud Function, Cloud Tasks can call that function through an HTTP endpoint with a public URL, which makes your tasks HTTP targets. You can follow the Google Cloud Tasks guide Creating HTTP Target tasks for the explanation and sample code.
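
As a rough sketch of how that could look in your case: split the work into an HTTP-triggered function that contains the instance-starting logic above, and have the Storage-triggered function enqueue a task instead of starting the VM directly. The project id, region, queue name, and function URL below are placeholders, not values from your setup:

import json

from google.cloud import tasks_v2


def enqueue(userId, paymentId):
    client = tasks_v2.CloudTasksClient()
    # Hypothetical project, region, and queue name -- substitute your own.
    parent = client.queue_path('XXXX', 'us-east1', 'gce-jobs')
    task = {
        'http_request': {
            'http_method': tasks_v2.HttpMethod.POST,
            # URL of the HTTP-triggered function that runs the start() logic.
            'url': 'https://us-east1-XXXX.cloudfunctions.net/start',
            'headers': {'Content-Type': 'application/json'},
            'body': json.dumps({'userId': userId,
                                'paymentId': paymentId}).encode(),
        }
    }
    response = client.create_task(request={'parent': parent, 'task': task})
    print('Enqueued task', response.name)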

Also, GCP provides the tutorial Using Cloud Tasks to trigger Cloud Functions, whose system design matches your current requirement exactly.
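
One thing to watch: even if you cap the queue's concurrency at 1 (gcloud tasks queues create gce-jobs --max-concurrent-dispatches=1), Cloud Tasks considers a task done as soon as the HTTP handler returns, and your function returns right after starting the VM, not when the PyTorch job finishes. A common workaround is to have the handler check the instance status and return a non-2xx response while it is busy, so Cloud Tasks retries the task later with backoff. A sketch of such a guard, meant for the top of the HTTP handler:

# Refuse the task while the VM is still working, so Cloud Tasks retries it
# later. This relies on the startup script's `shutdown -P now` leaving the
# instance TERMINATED between jobs; service, project, zone, and instance
# are as defined in the code above.
status = service.instances().get(project=project, zone=zone,
                                 instance=instance).execute()['status']
if status != 'TERMINATED':
    return ('Instance busy, please retry', 503)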