13 votes

I'm developing a service using Spring and deploying it on OpenShift. Currently I'm using the Spring Actuator health endpoint as the liveness and readiness probe for Kubernetes.

However, I'm going to add a call to another service inside the Actuator health endpoint, and it looks to me that in that case I need to implement a separate liveness probe for my service. If I don't, a failure in the second service will cause my liveness probe to fail, and Kubernetes will restart my service without any real need.

Is it OK to implement a simple REST controller for the liveness probe that always returns HTTP status 200? As long as it responds, the service can be considered alive. Or is there a better way to do it?

4 comments
What do you mean by "I will add a call to another service in the Actuator health endpoint"? Each health endpoint should only provide information about itself, not about other services. – user3151902

The case is: if the second service, on which the first one depends, doesn't work, then the first one doesn't work either. – dplesa

This is not how it is intended by Kubernetes. As I said, the health/liveness probe should only check the specific service. I agree with so-random-dude's answer here; always returning 200 might mask real errors in the service. – user3151902

Liveness will only check the HTTP status code, not the actuator's "status": "up" body. – Hannes

4 Answers

25 votes

Liveness Probe

Include only those checks which, if they fail, will be cured by a pod restart. There is nothing wrong with having a new endpoint that always returns HTTP 200 to serve as the liveness probe endpoint, provided you have independent monitoring and alerting in place for the other services that your first service depends on.
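As a self-contained sketch of such an always-200 endpoint (using the JDK's built-in `com.sun.net.httpserver` instead of Spring MVC so it runs with no dependencies; in a Spring Boot app the equivalent would be a trivial `@RestController` with one `@GetMapping`; the `/livez` path is an arbitrary choice):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;

public class LivenessSketch {

    // Start an HTTP server on a free port with a /livez endpoint that always returns 200.
    static HttpServer startServer() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/livez", exchange -> {
            byte[] body = "OK".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }

    // Probe the endpoint the way kubelet's httpGet probe does: only the status code matters.
    static int probe(int port) throws Exception {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create("http://localhost:" + port + "/livez").toURL().openConnection();
        return conn.getResponseCode();
    }

    // Start, probe once, and return the observed status code.
    static int demo() throws Exception {
        HttpServer server = startServer();
        int status = probe(server.getAddress().getPort());
        server.stop(0);
        return status;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // prints: 200
    }
}
```

The probe succeeds as long as the process can still accept connections and serve a request, which is exactly the "would a restart help?" signal a liveness probe is meant to capture.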

Where does a simple HTTP 200 liveness probe help?

Well, let's consider these examples.

  1. If your application is a one-thread-per-HTTP-request application (a servlet-based application, e.g. one running on Tomcat, which is Spring Boot 1.x's default choice), it may become unresponsive under heavy load. A pod restart will help here.

  2. If you don't have memory limits configured when you start your application, under heavy load it may outrun the pod's allocated memory and become unresponsive. A pod restart will help here too.
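For the second case, the usual guard is to give the container an explicit memory limit and let the JVM size its heap to fit inside it. A sketch of the pod spec fragment (the limit values are placeholders, not recommendations):

```yaml
resources:
  requests:
    memory: "512Mi"
  limits:
    memory: "1Gi"
env:
  - name: JAVA_TOOL_OPTIONS
    value: "-XX:MaxRAMPercentage=75.0"   # let the JVM derive its heap size from the container limit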

Readiness Probe

There are 2 aspects to it.

1) Let's consider a scenario. Let's say authentication is enabled on your second service. Your first service (where your health check is) has to be configured properly to authenticate with the second service.

Let's just say that, in a subsequent deployment of your first service, you screwed up the auth-header variable name that you were supposed to read from the ConfigMap or Secret, and you are doing a rolling update.

If you include the second service's HTTP 200 check in the health check of the first service, that will prevent the screwed-up version of the deployment from going live; your old version will keep running because the newer version will never make it past the health check. It doesn't even need to be as complicated as authentication: let's just say the URL of the second service is hard-coded in the first service, and you screwed up that URL in a subsequent release. This additional check in your health check will prevent the buggy version from going live.

2) On the other hand, let's assume that your first service has numerous other functionalities, and the second service being down for a few hours will not affect any significant functionality the first service offers. Then, by all means, you can leave the second service out of the first service's health check.

Either way, you need to set up proper alerting and monitoring for both services. This will help you decide when humans should intervene.

What I would do is (ignoring other irrelevant details):

readinessProbe:
  httpGet:
    path: </Actuator-healthcheck-endpoint>
    port: 8080
  initialDelaySeconds: 120
  timeoutSeconds: 5
livenessProbe:
  httpGet:
    path: </my-custom-endpoint-which-always-returns200>
    port: 8080
  initialDelaySeconds: 130
  timeoutSeconds: 10
  failureThreshold: 10
1 vote

Spring Boot 2.3 has built-in support for Liveness and Readiness Probes:

Spring Boot 2.3 has built-in knowledge of the availability of your application, tracking whether it is alive and whether it is ready to handle traffic.

In a Kubernetes environment, Actuator gathers the liveness and readiness information and exposes it under the health group:

"/actuator/health/liveness"

"/actuator/health/readiness"
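Outside of a Kubernetes environment these probe groups are not exposed by default; they can be switched on explicitly. A sketch of the relevant `application.yaml` (property names from the Spring Boot 2.3 documentation):

```yaml
management:
  endpoint:
    health:
      probes:
        enabled: true       # expose /actuator/health/liveness and /actuator/health/readiness
      show-details: always  # optional: include component details in the response
```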

For more details, please check the blog post and documentation.

Summary from the documentation:

The Liveness state of an application tells whether the internal state is valid. If Liveness is broken, this means that the application itself is in a failed state and cannot recover from it. In this case, the best course of action is to restart the application instance. For example, an application relying on a local cache should fail its Liveness state if the local cache is corrupted and cannot be repaired.

The Readiness state tells whether the application is ready to accept client requests. If the Readiness state is unready, Kubernetes should not route traffic to this instance. If an application is too busy processing a task queue, then it could declare itself as busy until its load is manageable again.

0 votes

Rather than creating a custom controller for readiness (as suggested in other answers), Spring Boot Actuator has built-in support for this.

Readiness

In Spring Boot 2.0+, it's really easy to implement a readiness probe with Spring Boot Actuator:

@Component
@Endpoint(id = "readiness")
public class ReadinessEndpoint {

    @ReadOperation
    public String getReadiness() {
        // do a custom check for readiness
        if (...) {
            return "OK";
        } else {
            throw new RuntimeException("Not ready");
        }
    }
}

This endpoint is then available (by default) at /actuator/readiness. The path is configurable.
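For the custom endpoint to be reachable over HTTP, it also has to be included in the web exposure list. A sketch of the relevant `application.yaml` (property names from the Actuator documentation):

```yaml
management:
  endpoints:
    web:
      exposure:
        include: health,readiness   # expose the built-in health endpoint and the custom readiness endpoint
```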

Liveness

Liveness is already available via the Spring Boot Actuator health endpoint, located (by default) at /actuator/health. This path is configurable.

-2 votes

Although I am replying a bit late, I think my implementation will help people implement Kubernetes readiness/liveness probes for Spring Boot applications.

Description of my Docker image:

  1. Built from Alpine Linux
  2. Spring Boot 2 application, accessible only via SSL
  3. Spring Boot Actuator implements /actuator/health, which returns {"status":"UP"}

I created a small shell script, check-if-healthy.sh, as part of the Docker image to report the health status.

check-if-healthy.sh
===================
curl -sk https://localhost:8888/actuator/health | grep '"status":"UP"' > /dev/null && exit 0 || exit 1
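The grep half of the script can be exercised locally by piping sample actuator responses through it, without hitting the real endpoint (a sketch; the JSON bodies are the typical `{"status":"UP"}` / `{"status":"DOWN"}` shapes the health endpoint returns):

```shell
# Same check as in check-if-healthy.sh, applied to canned responses instead of curl output.
check() {
  echo "$1" | grep '"status":"UP"' > /dev/null && echo healthy || echo unhealthy
}

check '{"status":"UP"}'     # prints: healthy
check '{"status":"DOWN"}'   # prints: unhealthy
```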

Please note that you need to add this script to the Docker image so that it is available in the running container and accessible to Kubernetes, which will execute something like "docker exec <container> /bin/ash /home/config-server/check-if-healthy.sh":

COPY docker/check-if-healthy.sh /home/config-server/check-if-healthy.sh

Then use the "exec" option of the Kubernetes readiness and liveness probes to call the script, like this:

      readinessProbe:
        exec:
          command:
            - /bin/ash
            - /home/config-server/check-if-healthy.sh
        initialDelaySeconds: 5
        timeoutSeconds: 1
        failureThreshold: 50
      livenessProbe:
        exec:
          command:
            - /bin/ash
            - /home/config-server/check-if-healthy.sh
        initialDelaySeconds: 5
        timeoutSeconds: 1
        failureThreshold: 50