I'm trying to build a simple file processing system (CSV only) on GCP. All it does is this: when a new file is uploaded, the code parses it and stores the contents in a database.

I created a new topic (file-upload) and a new service account with the commands below, following the steps at https://cloud.google.com/run/docs/tutorials/pubsub#integrating-pubsub:

gcloud projects add-iam-policy-binding my-project \
     --member=serviceAccount:service-PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com \
     --role=roles/iam.serviceAccountTokenCreator
   
gcloud iam service-accounts create cloud-run-pubsub-invoker \
   --display-name "Cloud run PubSub Invoker"
   
gcloud run services add-iam-policy-binding myservice \
   --member=serviceAccount:cloud-run-pubsub-invoker@my-project.iam.gserviceaccount.com \
   --role=roles/run.invoker

Then I created the subscription:

gcloud pubsub subscriptions create file-upload-sub --topic file-upload \
   --push-endpoint=https://myservice-myurl.a.run.app/api/test/ \
   --push-auth-service-account=cloud-run-pubsub-invoker@my-project.iam.gserviceaccount.com

I have a bucket, and I attached an OBJECT_FINALIZE notification to it with the following command:

    gsutil notification create -t file-upload -f json -e OBJECT_FINALIZE gs://MyBucket

My endpoint exposes the following routes (note: it's a .NET app in a Docker image deployed on Cloud Run):

    [Route("api/[controller]")]
    [ApiController]
    public class TestController : ControllerBase
    {
        //GET : api/test
        [HttpGet]
        public IActionResult Get()
        {
            return Ok($"Get route pinged{Environment.NewLine}");
        }

        //POST : api/test
        [HttpPost]
        public IActionResult Post([FromBody] string bodyContent)
        {
            return Ok($"Content received : {bodyContent}{Environment.NewLine}");
        }
    }

When I send a request with curl, using this command:

    curl https://myservice-myurl.a.run.app/api/test/ -H "Authorization: Bearer $(gcloud auth print-identity-token)" -H "Content-Type: application/json" -d '"data"'

Everything works (I see the confirmation in the console and in the log viewer). But when I publish a message directly to the topic with gcloud pubsub topics publish file-upload --message '"test"', I only get 400 responses in the log viewer, with no payload explaining where it breaks.

EDIT: Apparently the problem was linked to the way I tested the application. For some reason, PubSub transforms the way messages are sent to cloud run when they're basic strings, so it breaks .NET content validation. I changed my code to accept a file's metadata as a parameter and everything works fine.

EDIT 2: I wrote a proper answer to explain where I went wrong.

Are you able to view Cloud Monitoring metrics for Pub/Sub (cloud.google.com/pubsub/docs/monitoring) and Cloud Run (cloud.google.com/run/docs/monitoring) to see if your message is successfully being published and passed along? - Lauren
What I usually do is create a pull subscription in addition to the push one. That way, I can pull it (from the command line or in the console) and view the message content. Then I copy it and submit it manually to my Cloud Run service. - guillaume blaquiere
Run the command again with "--verbosity=debug" and "--log-http"; this will give more info on what is going on. - Louis C
Finally found the error; I edited the first post to explain it. - Adraekor
What do you mean by "PubSub transforms the way messages are sent to cloud run when they're basic strings?" - Kamal Aboul-Hosn

1 Answer

As suggested in the comments on my original post, I'm writing a proper answer to explain where I went wrong:

Pub/Sub sends a different data structure than my curl test did, so when I tried to read the data I received, Cloud Run responded with a 400 error because .NET's JSON-to-object validation failed.

At first, I thought Pub/Sub would simply send the file metadata to Cloud Run, which is not the case. Pub/Sub wraps the file information in a JSON structure that maps to these classes, with the payload itself base64-encoded in the Data field:

    public class Body
    {
        public Message Message { get; set; }

        public Body()
        {
            
        }
    }

    public class Message
    {
        public string MessageId { get; set; }
        public string PublishTime { get; set; }
        public string Data { get; set; }

        public Message()
        {
            
        }

        public Message(string messageId, string publishTime, string data)
        {
            MessageId = messageId;
            PublishTime = publishTime;
            Data = data;
        }
    }
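For reference, here is a minimal sketch of what such a push request body looks like and how it maps onto classes like the ones above. The class names (PushEnvelope, PushMessage, PushEnvelopeDemo) and all sample values are hypothetical and chosen only for illustration. It uses System.Text.Json with case-insensitive matching because the JSON field names are camelCase.

```csharp
using System;
using System.Text;
using System.Text.Json;

// Hypothetical names mirroring the Body/Message classes above.
public class PushMessage
{
    public string MessageId { get; set; }
    public string PublishTime { get; set; }
    public string Data { get; set; }   // base64-encoded payload
}

public class PushEnvelope
{
    public PushMessage Message { get; set; }
    public string Subscription { get; set; }
}

public static class PushEnvelopeDemo
{
    // Illustrative request body; the IDs and timestamp are made up.
    // "data" is the base64 encoding of {"name":"accounts.csv"}.
    public const string SampleBody = @"{
      ""message"": {
        ""data"": ""eyJuYW1lIjoiYWNjb3VudHMuY3N2In0="",
        ""messageId"": ""1234567890"",
        ""publishTime"": ""2021-01-01T00:00:00.000Z""
      },
      ""subscription"": ""projects/my-project/subscriptions/file-upload-sub""
    }";

    // Deserializes the envelope and returns the decoded payload text.
    public static string DecodeData(string body)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        var envelope = JsonSerializer.Deserialize<PushEnvelope>(body, options);
        var bytes = Convert.FromBase64String(envelope.Message.Data);
        return Encoding.UTF8.GetString(bytes);
    }
}
```

A [FromBody] string parameter expects the whole body to be a single JSON string, which is why a body shaped like this one is rejected with a 400 before the action runs.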

So when I tested my code by sending a basic string with curl, everything worked, and when I published a message directly to the topic, it didn't, simply because of the difference in data structure.

I changed my code to this:
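To make a manual test resemble what the push subscription delivers, the payload can be wrapped in the same kind of envelope before it is posted. The helper below is only a sketch: PushEnvelopeBuilder, the default message ID, and the subscription path are hypothetical, and a real push request may carry additional fields (attributes, for example) that are left out here.

```csharp
using System;
using System.Text;
using System.Text.Json;

public static class PushEnvelopeBuilder
{
    // Wraps a raw payload in a Pub/Sub-style push envelope (simplified).
    public static string Build(string payload, string messageId = "test-message-1")
    {
        var envelope = new
        {
            message = new
            {
                // Pub/Sub delivers the payload base64-encoded.
                data = Convert.ToBase64String(Encoding.UTF8.GetBytes(payload)),
                messageId,
                publishTime = DateTime.UtcNow.ToString("o")
            },
            // Hypothetical subscription path, for illustration only.
            subscription = "projects/my-project/subscriptions/file-upload-sub"
        };
        return JsonSerializer.Serialize(envelope);
    }
}
```

The string returned by Build("\"test\"") could then be used as the request body of the curl test instead of the bare '"data"' string.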

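        // Requires: using System.Text; using Newtonsoft.Json;
        public IActionResult Post([FromBody] Body body)
        {
            // Pub/Sub base64-encodes the published payload in message.data.
            var base64EncodedBytes = Convert.FromBase64String(body.Message.Data);
            var readableContent = Encoding.UTF8.GetString(base64EncodedBytes);

            // For a Cloud Storage notification, the payload is the object's metadata as JSON.
            var fileMetadata = JsonConvert.DeserializeObject<FileMetadata>(readableContent);

            var accounts = new List<Account>();

            try
            {
                accounts = CsvParser.Parse(fileMetadata.name);
            }
            catch (Exception e)
            {
                // Parsing errors are only logged; the request still succeeds.
                Console.WriteLine(e);
            }

            Console.WriteLine($"{accounts.Count} account(s) detected in csv file");

            // A 2xx response acknowledges the message, so Pub/Sub will not redeliver it.
            return Ok();
        }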
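One thing to be aware of with the handler above: Convert.FromBase64String throws a FormatException on malformed input, and body.Message can be null if the request is not a push envelope. A defensive decode step could look like the sketch below. PushData.TryDecode is a hypothetical helper, not part of any Google or .NET library, and how the controller responds when it returns false is a design choice left open here.

```csharp
using System;
using System.Text;

public static class PushData
{
    // Tries to decode a base64 payload as UTF-8 text.
    // Returns false instead of throwing when the input is missing or malformed.
    public static bool TryDecode(string base64Data, out string decoded)
    {
        decoded = null;
        if (string.IsNullOrEmpty(base64Data))
        {
            return false;
        }

        try
        {
            decoded = Encoding.UTF8.GetString(Convert.FromBase64String(base64Data));
            return true;
        }
        catch (FormatException)
        {
            // Not valid base64.
            return false;
        }
    }
}
```

In the controller this could be called as PushData.TryDecode(body?.Message?.Data, out var readableContent). Keep in mind that Pub/Sub retries a push delivery when the endpoint answers with a non-success status, so returning an error for a message that can never be parsed will cause it to be redelivered.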