2 votes

I have a pod with an app server that connects to an external database. For redundancy, I want to run multiple pods, so I scaled the deployment up to 3 with a RollingUpdate strategy (maxSurge = 1 and maxUnavailable = 1).

Most of the time, the pods fail on first create, because I am using Liquibase and all the pods try to lock the database at the same time.

The easiest solution to me seems to be starting the pods sequentially: start pod 1, wait 60 seconds, start pod 2, and so on.

Is that a valid solution? How can I achieve that in k8s (v1.14)?

Here's the output of kubectl describe deploy:

Name:                   jx-apollon
Namespace:              jx-staging
CreationTimestamp:      Sun, 27 Oct 2019 21:28:07 +0100
Labels:                 chart=apollon-1.0.348
                        draft=draft-app
                        jenkins.io/chart-release=jx
                        jenkins.io/namespace=jx-staging
                        jenkins.io/version=4
Annotations:            deployment.kubernetes.io/revision: 3
                        jenkins.io/chart: env
                        kubectl.kubernetes.io/last-applied-configuration:
                          {"apiVersion":"extensions/v1beta1","kind":"Deployment","metadata":{"annotations":{"jenkins.io/chart":"env"},"labels":{"chart":"apollon-1.0...
Selector:               app=jx-apollon,draft=draft-app
Replicas:               0 desired | 0 updated | 0 total | 0 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  1 max unavailable, 1 max surge
Pod Template:
  Labels:  app=jx-apollon
           draft=draft-app
  Init Containers:
   postgres-listener-short:
    Image:      alpine
    Port:       <none>
    Host Port:  <none>
    Command:
      sh
      -c
      echo 'Waiting in init container for DB to become available.'; echo $DB; for i in $(seq 1 5); do echo 'nc ping' $i && nc -z -w3 $DB 5432 && echo 'DB is available, continuing now to application initialization.' && exit 0 || sleep 3; done; echo 'DB is not yet available.'; exit 1
    Environment:
      DB:    jx-apollon-postgresql-db-alias
    Mounts:  <none>
   postgres-listener-longer:
    Image:      alpine
    Port:       <none>
    Host Port:  <none>
    Command:
      sh
      -c
      echo 'Waiting in init container for DB to become available.'; echo $DB; for i in $(seq 1 100); do echo 'nc ping' $i && nc -z -w3 $DB 5432 && echo 'DB is available, continuing now to application initialization.' && exit 0 || sleep 3; done; echo 'DB is not yet available.'; exit 1
    Environment:
      DB:    jx-apollon-postgresql-db-alias
    Mounts:  <none>
  Containers:
   apollon:
    Image:       <redacted>
    Ports:       8080/TCP, 8443/TCP
    Host Ports:  0/TCP, 0/TCP
    Limits:
      cpu:     2
      memory:  6Gi
    Requests:
      cpu:      100m
      memory:   3584Mi
    Liveness:   http-get http://:8080/ delay=60s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://:8080/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      DB:                 jx-apollon-postgresql-db-alias
      POSTGRES_PASSWORD:  <redacted>
      RULES_CLIENT:       demo
      _JAVA_OPTIONS:      -XX:+UseContainerSupport -XX:MaxRAMPercentage=90.0 -XX:+UseG1GC
    Mounts:               <none>
  Volumes:                <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  <none>
NewReplicaSet:   jx-apollon-95b4c77cb (0/0 replicas created)
Events:          <none>
StatefulSet pods are created sequentially, obeying Ordered Pod Creation. There are many downsides to using StatefulSets for stateless applications, but it's worth mentioning...Eduardo Baitello
Does one of the pods succeed in getting the lock and completing its initialization, and the remainder eventually get initialized after being in CrashLoopBackOff state for a little bit? That's pretty normal and my first inclination would be to do nothing.David Maze
No, the strange thing is that none of the pods gets a successful lock. So all the pods keep CrashLoopBackOff-ing.Martijn Burger
Use an initContainer with a script that waits for the database to be unlocked. Try something like this: SELECT LOCKED FROM DATABASECHANGELOGLOCK;. Also take a look at the Liquibase documentation for more information on the DATABASECHANGELOGLOCK table.Matt
Yes, it is using init containers, but as you can read in the k8s documentation, you can have several init containers. Just add another one running a script that waits until the db is unlocked. An even better approach would be to rewrite your application so it doesn't crash and instead waits for its turn.Matt
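
As Eduardo Baitello's comment suggests, the sequential startup described in the question is what a StatefulSet gives out of the box: with the default OrderedReady pod management policy, pod N+1 is only created once pod N is Running and Ready. A minimal sketch (the headless Service name and labels are illustrative, not taken from the deployment above):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: jx-apollon
spec:
  serviceName: jx-apollon            # StatefulSets require a governing (headless) Service
  replicas: 3
  podManagementPolicy: OrderedReady  # the default: create pods one at a time, in order
  selector:
    matchLabels:
      app: jx-apollon
  template:
    metadata:
      labels:
        app: jx-apollon
    spec:
      containers:
      - name: apollon
        image: <redacted>
        readinessProbe:              # ordering hinges on readiness, so keep the probe
          httpGet:
            path: /
            port: 8080
```

Note that, as the comment also warns, StatefulSets carry downsides for stateless applications, so the init-container approach discussed in the comments is usually preferable here.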

1 Answer

1 vote

I changed the initContainer part to this code, which waits until it reads a false value from the databasechangeloglock table created by Liquibase.

      initContainers:
      - name: postgres-listener
        image: postgres
        env:
        - name: DB
          value: jx-apollon-postgresql-db-alias
        command:
        - sh
        - -c
        - >
          until psql -qtAX -h $DB -d postgres -c
          "select count(locked) from databasechangeloglock where locked = false group by locked"
          | grep -q .;
          do echo waiting for databasechangeloglock of postgres db to be false; sleep 2; done
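
Along the lines of Matt's comment, the init container could also poll the lock flag itself rather than relying on the query returning rows. A sketch, assuming the default DATABASECHANGELOGLOCK schema and that connection credentials (e.g. PGPASSWORD) are supplied the same way as in the snippet above:

```yaml
      initContainers:
      - name: postgres-lock-waiter
        image: postgres
        env:
        - name: DB
          value: jx-apollon-postgresql-db-alias
        command:
        - sh
        - -c
        - |
          # with -t (tuples only), psql prints a boolean false as 'f'
          until [ "$(psql -qtAX -h "$DB" -d postgres \
                     -c 'SELECT locked FROM databasechangeloglock;')" = "f" ]; do
            echo "waiting for the Liquibase databasechangeloglock to be released"
            sleep 2
          done
```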