1
votes

I am integrating Prometheus into my Kubernetes cluster using the Helm chart I downloaded from https://github.com/helm/helm. The cluster is an AKS cluster deployed on Azure. In each of my pods, the container runs a Docker image that includes the master_server.py script, which controls the workflow in my master pod.

I am trying to expose some custom metrics from my master pod via master_server.py, using the official Prometheus Python client (https://github.com/prometheus/client_python). My master_server.py looks something like this:

master_server.py (truncated)

import logging

import tornado.ioloop
import tornado.options
import tornado.web
import tornado.websocket
import tornado.gen
import tornado.concurrent
import prometheus_client as prom

num_req = prom.Counter('number_of_request_receive_by_master',
                       'number of request receive by master')
num_worker = prom.Gauge('number_of_worker_available',
                        'number of worker available')

def main():
    logging.debug('Starting up server')
.
.
.
if __name__ == "__main__":
    main()
    prom.start_http_server(8081)

After some searching, I found that I need to add annotations so that Prometheus scrapes my master pod. So I added the following snippet to my deployment.yaml file:

  template:
    metadata:
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8081'

Still, it didn't work. I cannot see my custom metrics in the Prometheus queries.

The following is the description of my master pod's deployment (kubectl describe output rather than the raw deployment.yaml).

Name:                   kaldi-feature-test-master
Namespace:              kaldi-test
CreationTimestamp:      Fri, 10 Jan 2020 01:53:09 +0800
Labels:                 app.kubernetes.io/instance=kaldi-feature-test
                        app.kubernetes.io/managed-by=Tiller
                        app.kubernetes.io/name=kaldi-feature-test-master
                        helm.sh/chart=kaldi-feature-test-0.1.0
Annotations:            deployment.kubernetes.io/revision: 1
Selector:               app.kubernetes.io/instance=kaldi-feature-test,app.kubernetes.io/name=kaldi-feature-test-master
Replicas:               2 desired | 2 updated | 2 total | 2 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:       app.kubernetes.io/instance=kaldi-feature-test
                app.kubernetes.io/name=kaldi-feature-test-master
  Annotations:  prometheus.io/port: 8081
                prometheus.io/scrape: true
  Containers:
   kaldi-feature-test-master:
    Image:      kalditest.azurecr.io/kalditestscaled:latest
    Port:       8080/TCP
    Host Port:  0/TCP
    Command:
      /home/appuser/opt/tini
      --
      /home/appuser/opt/start_master.sh
    Limits:
      cpu:     2
      memory:  2Gi
    Requests:
      cpu:      2
      memory:   2Gi
    Liveness:   http-get http://:http/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://:http/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      environment-variables-master-secret  Secret  Optional: false
    Environment:                           <none>
    Mounts:                                <none>
  Volumes:                                 <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   kaldi-feature-test-master-79886c5d76 (2/2 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  15m   deployment-controller  Scaled up replica set kaldi-feature-test-master-79886c5d76 to 2

I checked the Prometheus targets and found that connections to my master pods are refused. (Screenshot: master pod connection refused)

What should I do to let Prometheus scrape the custom metrics from my master pod?

Looks like the Python client is not running properly. Try port forwarding and sending requests directly: run kubectl port-forward name_of_your_pod 8081:8081, then open localhost:8081 in your browser. This gives you direct access to check whether it is working, so you can debug the Python client. It should display a web page with some numbers; in your case it will probably say connection refused. - Radu Mazilu
Yeah, I port forwarded, but the connection is still refused at localhost:8081. - Wong Seng Wee
We need more details in order to help you. Could you please provide your configs as YAML files, or any other info/config that we could use? - Wytrzymały Wiktor
Added the deployment.yaml for my master pod. - Wong Seng Wee
As I see it, you only expose port 8080, but you want to use port 8081 to get the metrics. How would that work? - Charles Xu

2 Answers

1
votes

As the Python code and the deployment description you provided show, the Prometheus HTTP server listens on port 8081, but the container only exposes port 8080; port 8081 is not included.

So the solution is to expose port 8081 both in the kaldi-feature-test-master container of the deployment and in the Service that routes requests to your application.
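
Before changing the manifests, it may help to confirm whether anything is actually listening on port 8081, for example from inside the pod or through the kubectl port-forward suggested in the comments. The following is a minimal sketch using only the Python standard library; the function name is_port_open is hypothetical, and the host and port in the usage line are assumptions taken from the question.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed values: the metrics port from the question, reached locally.
    print("8081 reachable:", is_port_open("localhost", 8081))
```

If this reports that the port is not reachable even from inside the pod, the metrics server itself is probably not running, and exposing the port alone would not be enough.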

1
votes

Yes I got it working thanks to Charles' comments!

My application's Tornado web server was already running on port 8080 in the master pod, which may have prevented Prometheus from scraping the metrics from that pod.

In the end, I opened another port, 8081, in my master pod's deployment.yaml like this:

.
.
.
containers:
  - name: master-pod-name
    image: master-pod-image
    ports:
      - name: http
        containerPort: 8080 # this is for my Tornado web server
        protocol: TCP
      - name: prometheus
        containerPort: 8081
.
.
.

Then, in the Python script running in the master pod, I started the Prometheus HTTP server on port 8081 with prom.start_http_server(8081). After that it finally worked.
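
One detail that may be worth checking in the script from the question: prom.start_http_server(8081) is called after main(). If main() blocks on the Tornado IOLoop (the truncated script does not show this, so it is an assumption), the metrics server would never start and the port would refuse connections. Below is a rough sketch of starting the metrics endpoint first; run, app_main and fetch_metrics are hypothetical names used only for illustration, and the metric name is copied from the question.

```python
import urllib.request

import prometheus_client as prom

# Metric copied from the question's script.
num_req = prom.Counter('number_of_request_receive_by_master',
                       'number of request receive by master')


def run(app_main, metrics_port=8081):
    """Start the metrics endpoint first, then hand control to the app.

    app_main stands in for the question's main(), which is assumed to
    block (for example on tornado.ioloop.IOLoop.current().start()).
    """
    # start_http_server serves the metrics from a daemon thread and
    # returns immediately, so it has to be called before anything blocks.
    prom.start_http_server(metrics_port)
    app_main()


def fetch_metrics(port=8081, host="localhost"):
    """Read the exposition text roughly the way a Prometheus scrape would."""
    url = f"http://{host}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=2) as resp:
        return resp.read().decode()
```

In the question's script, this would correspond to calling prom.start_http_server(8081) before main() rather than after it.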