
I have a Docker image that contains a Python script which reads its input from standard input via sys.stdin. I can run the image using the following command:

cat file.csv | docker run -i -t my_image

This pipes the contents of file.csv into the container, and I get the output as expected.

Now I want to deploy this image to Kubernetes. I can run the image on the server using Docker without any problems. But when I curl it, I get no response back, because I do not have a web server listening on any port. I went ahead and built a deployment using the following command:

kubectl run -i my_deployment --image=gcr.io/${PROJECT_ID}/my_image:v1 --port 8080

This created the deployment and I can see the pods running. Then I exposed it:

kubectl expose deployment my_deployment --type=LoadBalancer --port 80 --target-port 8080

But if I try to access it with curl at the allocated IP,

curl http://allocated_ip

I get a "connection refused" response.

How can I deploy this Docker image as a service on Kubernetes and send the contents of a file as input to the service? Do I need a web server for that?

3 Answers


Kubernetes generally assumes the containers it deploys are long-lived and autonomous. If you're deploying something in a Pod, particularly via a Deployment, it should be able to run on its own without any particular inputs. If it immediately exits, Kubernetes will restart it, and you'll quickly wind up in the dreaded CrashLoopBackOff state.

In short: you need to redesign your container not to use stdin and stdout as its primary interface.

Your instinct to add a network endpoint to the service is probably correct, but Kubernetes won't do that on its own. If you rebuild your application to run, say, a Flask server listening on a port, that's something you can readily deploy to Kubernetes. If the application expects data to come in on stdin and writes its results to stdout, adding the Kubernetes networking metadata won't help anything: in your example, if nothing inside the container is listening on port 8080, a network connection will never go anywhere.
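As a minimal sketch of that redesign, using only the Python standard library (Flask would work similarly): the `process_csv` function below is a hypothetical stand-in for whatever the script currently does with sys.stdin, and the handler serves it over HTTP so `curl --data-binary @file.csv http://host:8080/` replaces the old pipe.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def process_csv(text: str) -> str:
    """Hypothetical stand-in for the logic that previously read sys.stdin."""
    rows = [line.split(",") for line in text.strip().splitlines()]
    return f"received {len(rows)} rows\n"

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the request body (the CSV content) and run the old logic on it.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode()
        result = process_csv(body).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(result)))
        self.end_headers()
        self.wfile.write(result)

    def log_message(self, *args):
        pass  # keep demo output quiet

# Demo: start the server on an ephemeral port and POST some CSV to it,
# the way `curl --data-binary @file.csv http://host:8080/` would.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"
reply = urllib.request.urlopen(url, data=b"a,b\nc,d").read().decode()
print(reply)  # prints: received 2 rows
server.shutdown()
```

In the real container you would bind to `("", 8080)` and call `serve_forever()` directly, so the process stays alive the way Kubernetes expects.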


I am assuming Kubernetes is running on premises. I would do the following.

  • Create an nginx or Apache ingress controller deployment. Using Helm, it is pretty easy with

helm install stable/nginx-ingress

  • Create a deployment exposing port 8080, or whatever port you would expose when running it with Docker. The actual application would have an API to which I could send content via a POST request.

  • Create a service with port 8080 and targetPort 8080. It should be of type ClusterIP.

  • Create an ingress with the hostname and a servicePort of 8080.
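The last two steps could look roughly like the manifests below. The names (`my-app`) and hostname (`my-app.example.com`) are placeholders, and the ingress uses the current `networking.k8s.io/v1` API; adjust to whatever your cluster supports. Apply with `kubectl apply -f`.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: ClusterIP
  selector:
    app: my-app          # must match the deployment's pod labels
  ports:
    - port: 8080
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
spec:
  rules:
    - host: my-app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 8080
```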


Since you are passing the file as input when running the command, this makes me think that once the content is in the container you do not need to update the contents of the CSV.

The best approach to reading that file would be to ADD it in your Dockerfile and then open it with Python's open function.

You would have a line like

ADD file.csv /home/file.csv

And in your python code something like :

file_in = open('/home/file.csv', 'r')

Note that if you want to change the file, you will need to update the Dockerfile, build again, push to the registry, and re-deploy to GKE. If you do not want to follow this process, you can use a ConfigMap instead.
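As a sketch of the ConfigMap route (the names `csv-data` and `csv-volume` are placeholders): create the ConfigMap from the file with `kubectl create configmap csv-data --from-file=file.csv`, then mount it into the pod so the Python code can still open it as a regular file, e.g. `open('/data/file.csv', 'r')`:

```yaml
# Fragment of the deployment's pod spec; mounting at /data rather than
# /home, since a configMap volume replaces the mount directory's contents.
spec:
  containers:
    - name: my-app
      image: gcr.io/${PROJECT_ID}/my_image:v1
      volumeMounts:
        - name: csv-volume
          mountPath: /data       # file appears as /data/file.csv
  volumes:
    - name: csv-volume
      configMap:
        name: csv-data
```

Updating the ConfigMap then changes the file without rebuilding the image.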

Also, if this answers your question, make sure to link it from your identical question on Server Fault.