How to schedule a task to call a gRPC method?

I have a .NET server running in Google Kubernetes Engine. It is configured to use gRPC through Google Cloud Endpoints. Now I need to schedule a task that calls one of my gRPC methods once per day.

The first thing I tried was to use Google Cloud Scheduler to call the method directly over HTTP. For that I have:

Set up HTTP to gRPC transcoding on my server so that my gRPC method can be called over HTTP.
Created and enabled an SSL certificate as described here.
Created a service account in the IAM & admin console with the Service Account Token Creator and Service Account User roles.
Created a Cloud Scheduler job with my URL, the Auth header set to OIDC token, and the service account created above.

Deployed a Google Cloud Endpoints configuration with the following parameters (among others):

authentication:
  providers:
  - id: google_service_account
    issuer: MY_SERVICE_ACCOUNT_EMAIL
    jwks_uri: https://www.googleapis.com/robot/v1/metadata/x509/MY_SERVICE_ACCOUNT_EMAIL
  rules:
  - selector: "*"
    requirements:
      - provider_id: google_service_account

After that, when I run the scheduler job, it returns the result "Failed". The logs show an ERROR entry with status UNKNOWN.

The second thing I tried was to use Google Cloud Scheduler to publish a message to a Pub/Sub topic with my server as the subscriber. That was unsuccessful too, because I can't verify ownership of the Google Cloud Endpoints domain. I asked a related question here: How to verify ownership of Google Cloud Endpoints service URL?

Now the question: what is the best way to schedule a task that calls a gRPC method, assuming the following environment:

.NET server running on GKE
gRPC
Automated periodic call of that task (I can call it manually, but that defeats the purpose)

google-cloud-platform google-kubernetes-engine google-cloud-endpoints grpc google-cloud-pubsub

2 Answers

So you were able to make an HTTP call manually, but not automatically through Google Cloud Scheduler, is that correct?

If so, check whether the request reaches the Cloud Endpoints proxy by looking at the Endpoints logs in the Cloud Console; they may give you some hints.
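If the request does reach the proxy, it may also be worth comparing the claims in the OIDC token with the authentication section of the Endpoints configuration. As far as I know, OIDC tokens that Google mints for a service account carry the issuer https://accounts.google.com rather than the service account email, so a provider declared with issuer: MY_SERVICE_ACCOUNT_EMAIL might reject them. Treat that as an assumption to verify against your own token, not a confirmed diagnosis. Below is a minimal Python sketch that decodes the payload of a JWT (without verifying the signature) so that iss and aud can be inspected. The function names and the sample token are made up for illustration.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature.

    For debugging only: never trust claims decoded this way.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload = parts[1]
    # JWTs use unpadded base64url; restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(obj: dict) -> str:
    """Helper used only to build the illustrative token below."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hypothetical token for illustration; the claim values are assumptions,
# not output captured from Cloud Scheduler.
sample_token = ".".join([
    _b64url({"alg": "RS256", "typ": "JWT"}),
    _b64url({"iss": "https://accounts.google.com",
             "aud": "https://example.invalid/my-endpoint"}),
    "signature-placeholder",
])

claims = decode_jwt_claims(sample_token)
print(claims["iss"], claims["aud"])
```

If the iss value of a real token differs from the issuer in the provider entry, that mismatch would be the first thing to look into.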

votes

Distributed scheduler. For more details, refer to the source code: gitlab.com/andreynech/dsched

This application can run on different hosts and lets you schedule the execution of an arbitrary command at a particular time or periodically. There are two ways to communicate with the application: gRPC and REST. The remote interfaces are specified in the dsched.proto file. The corresponding REST API can also be found there in the form of API annotations. We also provide generated Swagger files. To specify task execution timing, we use the notation adopted by cron. Scheduled tasks are stored in a file and loaded automatically during startup.

Building

Install gRPC
Install gRPC gateway

To parse crontab statements and schedule task execution, we use the gopkg.in/robfig/cron.v2 library, so it needs to be installed as well: go get -u gopkg.in/robfig/cron.v2. Documentation can be found here.

Get the dsched package: go get -u gitlab.com/andreynech/dsched

Now you can run the standard go build command in the dscheduler and gateway directories to generate the binaries for the scheduler and the REST/JSON API gateway. It may also be helpful to examine our CI configuration file to see how we set up the build environment.

Running

All the scheduling functionality is implemented by the dscheduler executable, so it can be run on system startup or on demand. As described by dscheduler --help, there are two command-line parameters:

-i string - File name to store task list (default "/var/run/dscheduler.db")
-p string - Endpoint to listen (default ":50051")

If you need to offer a REST/JSON API, run the gateway application located in the gateway directory. It can reside on the same host as dscheduler, but typically it runs on another host that is accessible over HTTP from outside and at the same time can talk to dscheduler running in the internal network. This setup was also the reason for splitting the scheduler and the gateway into two executables. gateway is a mostly generated application and supports several command-line parameters, which are described by running gateway --help. The important parameter is -sched_endpoint string, the endpoint of the Scheduler service (default "localhost:50051"). It specifies the host name and port where dscheduler is listening for requests.

Scheduling tasks (testing)

There are three ways to control the scheduler server:

Using the Go client implemented in the cli/ directory
Using the Python client implemented in the py_cli directory
Using the REST/JSON API gateway and curl

The Go and Python clients have a similar set of command-line parameters.

$ ./cli --help

Usage of cli:

  -a string
        The command to execute at time specified by -c parameter
  -c string
        Statement in crontab format describes when to execute the command
  -e string
        Host:port to connect (default "localhost:50051")
  -l    List scheduled tasks
  -p    Purge all scheduled tasks
  -r int
        Remove the task with specified id from schedule
  -s    Schedule task. -c and -a arguments are required in this case

Both clients use the gRPC protocol to talk to the scheduler server. Here are several example invocations:

$ ./cli -l
List currently scheduled tasks.

$ ./cli -s -c "@every 0h00m10s" -a "df"
Schedule the df command for execution every 10 seconds.

$ ./cli -s -c "0 30 * * * *" -a "ls -l"
Schedule ls -l to run at 30 minutes past every hour (the spec has six fields, with seconds first).

$ ./cli -r 3
Remove the task with ID 3.

$ ./cli -p
Remove all scheduled tasks.
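To make the two schedule notations used in these invocations more concrete, here is a rough Python sketch of how they can be read. It is not the parser from gopkg.in/robfig/cron.v2: the function names are made up for illustration, and the sketch assumes a six-field layout with seconds first and Go-style durations after @every.

```python
import re

# Unit sizes in seconds for Go-style durations such as "0h00m10s".
_UNITS = {"ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3,
          "s": 1.0, "m": 60.0, "h": 3600.0}
_PART = r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)"

# Assumed field order of a six-field spec (seconds come first).
FIELD_NAMES = ("second", "minute", "hour",
               "day_of_month", "month", "day_of_week")


def every_seconds(spec: str) -> float:
    """Return the interval, in seconds, of an '@every <duration>' spec."""
    prefix = "@every "
    if not spec.startswith(prefix):
        raise ValueError("not an @every spec: %r" % spec)
    duration = spec[len(prefix):].strip()
    # The whole duration must consist of number+unit pairs only.
    if not re.fullmatch("(?:%s)+" % _PART, duration):
        raise ValueError("bad duration: %r" % duration)
    return sum(float(n) * _UNITS[u] for n, u in re.findall(_PART, duration))


def split_fields(spec: str) -> dict:
    """Label the fields of a six-field crontab spec."""
    fields = spec.split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError("expected 6 fields, got %d" % len(fields))
    return dict(zip(FIELD_NAMES, fields))


print(every_seconds("@every 0h00m10s"))
print(split_fields("0 30 * * * *"))
```

Under that assumption, the second spec has second 0 and minute 30, with wildcards in every other field.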

It is also possible to use curl to invoke dscheduler functionality through the REST/JSON API gateway. Assuming that the dscheduler and gateway applications are running, here are some invocations to list, add and remove scheduling entries from the same host (localhost):

curl 'http://localhost:8080/v1/scheduler/list'
List currently scheduled tasks.

curl -d '{"id":0, "cron":"@every 0h00m10s", "action":"ls"}' -X POST 'http://localhost:8080/v1/scheduler/add'
Schedule the ls command for execution every 10 seconds.

curl -d '{"id":0, "cron":"0 30 * * * *", "action":"ls -l"}' -X POST 'http://localhost:8080/v1/scheduler/add'
Schedule ls -l to run at 30 minutes past every hour.

curl -d '{"id":2}' -X POST 'http://localhost:8080/v1/scheduler/remove'
Remove the task with ID 2.

curl -X POST 'http://localhost:8080/v1/scheduler/removeall'
Remove all scheduled tasks.

All changes are automatically saved to the task file.
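The same REST calls can be made from a script instead of curl. The following Python sketch only builds the requests, using the URLs and JSON bodies from the curl examples; the function names are hypothetical, and sending Content-Type: application/json is an assumption about what the gateway accepts, since the curl commands do not set that header explicitly.

```python
import json
import urllib.request

# Gateway address taken from the curl examples; adjust for your deployment.
GATEWAY = "http://localhost:8080/v1/scheduler"


def build_add_request(cron: str, action: str,
                      task_id: int = 0) -> urllib.request.Request:
    """Build (but do not send) the POST request that schedules a task."""
    body = json.dumps({"id": task_id, "cron": cron, "action": action}).encode()
    return urllib.request.Request(
        GATEWAY + "/add",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


def build_remove_request(task_id: int) -> urllib.request.Request:
    """Build (but do not send) the POST request that removes one task."""
    body = json.dumps({"id": task_id}).encode()
    return urllib.request.Request(
        GATEWAY + "/remove",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


req = build_add_request("@every 0h00m10s", "ls")
print(req.get_method(), req.full_url, req.data.decode())

# To actually send it (requires dscheduler and gateway to be running):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

Keeping request construction separate from sending makes it easy to check the payload before pointing the script at a live gateway.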

Thoughts on scheduler service discovery

In large deployment scenarios (hundreds of hosts, for example), it can be challenging to find all the IP addresses and ports where the scheduler service is running. It would be fairly easy to add support for Zeroconf (Bonjour/Avahi) to simplify service discovery. As an alternative, it might be possible to implement something similar to the CORBA Naming Service, where running services register themselves and the location of the naming service is well known. We decided to collect feedback before committing to a particular service discovery implementation, so your input is very welcome!