1 vote

Version of Helm and Kubernetes:

Kubernetes (GKE): Client Version v1.9.7, Server Version v1.10.6-gke.2

Helm: Client v2.10.0, Server v2.10.0

Which chart:

stable/mongodb-replicaset

What happened:

Summary: the 1st pod starts correctly; the 2nd pod is stuck at "Init:2/3".

Details: I want to set up a MongoDB replica set with 3 replicas, using authentication and TLS with X509 for cluster authentication. Here are the contents of the values.yaml file I use for helm install:

replicas: 3
port: 27018

replicaSetName: rs0

podDisruptionBudget: {}
  # maxUnavailable: 1
  # minAvailable: 2

auth:
  enabled: true
  adminUser: admin
  adminPassword: pass1234
  metricsUser: metrics
  metricsPassword: pass1234
  key: abcdefghijklmnopqrstuvwxyz1234567890
  #existingKeySecret:
  #existingAdminSecret:
  #existingMetricsSecret:

# Specs for the Docker image for the init container that establishes the replica set
installImage:
  repository: k8s.gcr.io/mongodb-install
  tag: 0.6
  pullPolicy: IfNotPresent

# Specs for the MongoDB image
image:
  repository: mongo
  #tag: 3.6
  tag: latest
  pullPolicy: IfNotPresent

# Additional environment variables to be set in the container
extraVars: {}
# - name: TCMALLOC_AGGRESSIVE_DECOMMIT
#   value: "true"

# Prometheus Metrics Exporter
metrics:
  enabled: false
  image:
    repository: ssalaues/mongodb-exporter
    tag: 0.6.1
    pullPolicy: IfNotPresent
  port: 9216
  path: "/metrics"
  socketTimeout: 3s
  syncTimeout: 1m
  prometheusServiceDiscovery: true
  resources: {}

# Annotations to be added to MongoDB pods
podAnnotations: {}

securityContext:
  runAsUser: 999
  fsGroup: 999
  runAsNonRoot: true

resources:
  limits:
    # cpu: 100m
    memory: 512Mi
  requests:
    # cpu: 100m
    memory: 256Mi

## Node selector
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector
nodeSelector: {}

affinity: {}

tolerations: []

extraLabels: {}

persistentVolume:
  enabled: true
  ## mongodb-replicaset data Persistent Volume Storage Class
  ## If defined, storageClassName: <storageClass>
  ## If set to "-", storageClassName: "", which disables dynamic provisioning
  ## If undefined (the default) or set to null, no storageClassName spec is
  ##   set, choosing the default provisioner.  (gp2 on AWS, standard on
  ##   GKE, AWS & OpenStack)
  ##
  storageClass: "standard"
  accessModes:
    - ReadWriteOnce
  size: 10Gi
  annotations: {}

# Annotations to be added to the service
serviceAnnotations: {}

tls:
  # Enable or disable MongoDB TLS support
  enabled: true
  # Please generate your own TLS CA by generating it via:
  # $ openssl genrsa -out ca.key 2048
  # $ openssl req -x509 -new -nodes -key ca.key -days 10000 -out ca.crt -subj "/CN=mydomain.com"
  # After that you can base64 encode both files and paste them here:
  # $ cat ca.crt | base64 -w0
  # $ cat ca.key | base64 -w0
  cacert: base64 encoded ca certificate goes here
  cakey: base64 encoded ca key goes here

# Entries for the MongoDB config file
configmap:
  storage:
    dbPath: /data/db
  net:
    port: 27018
    ssl:
      mode: requireSSL
      CAFile: /data/configdb/tls.crt
      PEMKeyFile: /work-dir/mongo.pem
  replication:
    replSetName: rs0
  security:
    authorization: enabled
    clusterAuthMode: x509
    keyFile: /data/configdb/key.txt

# Readiness probe
readinessProbe:
  initialDelaySeconds: 5
  timeoutSeconds: 1
  failureThreshold: 3
  periodSeconds: 10
  successThreshold: 1

# Liveness probe
livenessProbe:
  initialDelaySeconds: 30
  timeoutSeconds: 5
  failureThreshold: 3
  periodSeconds: 10
  successThreshold: 1
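
As a side note on the tls.cacert / tls.cakey values above: instead of pasting the base64 strings into values.yaml, the CA can also be generated and passed on the command line. A rough sketch, reusing the openssl commands from the chart's own comments:

openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -days 10000 -out ca.crt -subj "/CN=mydomain.com"
helm install --name mongo-test stable/mongodb-replicaset -f values.yaml \
  --set tls.cacert="$(cat ca.crt | base64 -w0)" \
  --set tls.cakey="$(cat ca.key | base64 -w0)"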

I've installed the chart using:

helm install --name mongo-test -f values.yaml stable/mongodb-replicaset
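
To watch the pods come up (and see where they get stuck), something like this works:

kubectl get pods -n b2c -w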

The helm install itself completes without any problems - no error messages during the install:

NAME:   mongo-test
LAST DEPLOYED: Wed Aug 29 16:40:43 2018
NAMESPACE: b2c
STATUS: DEPLOYED

RESOURCES:
==> v1/Secret
NAME                                   TYPE               DATA  AGE
mongo-test-mongodb-replicaset-admin    Opaque             2     0s
mongo-test-mongodb-replicaset-ca       kubernetes.io/tls  2     0s
mongo-test-mongodb-replicaset-keyfile  Opaque             1     0s

==> v1/ConfigMap
NAME                                   DATA  AGE
mongo-test-mongodb-replicaset-init     1     0s
mongo-test-mongodb-replicaset-mongodb  1     0s
mongo-test-mongodb-replicaset-tests    1     0s

==> v1/Service
NAME                           TYPE       CLUSTER-IP  EXTERNAL-IP  PORT(S)    AGE
mongo-test-mongodb-replicaset  ClusterIP  None        <none>       27018/TCP  0s

==> v1beta2/StatefulSet
NAME                           DESIRED  CURRENT  AGE
mongo-test-mongodb-replicaset  3        1        0s

==> v1/Pod(related)
NAME                             READY  STATUS   RESTARTS  AGE
mongo-test-mongodb-replicaset-0  0/1    Pending  0         0s

The 1st pod then starts correctly and without any problems, but the 2nd pod gets stuck at "Init:2/3":

NAME                                          READY     STATUS     RESTARTS   AGE
po/mongo-test-mongodb-replicaset-0            1/1       Running    0          5m
po/mongo-test-mongodb-replicaset-1            0/1       Init:2/3   0          5m

When I open a shell in the bootstrap init container of mongo-test-mongodb-replicaset-1, I can find the following inside /work-dir/log.txt.
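
For reference, this is roughly how I get into the still-running init container (pod name and namespace taken from the helm output above):

kubectl exec -it mongo-test-mongodb-replicaset-1 -c bootstrap -n b2c -- sh

The log file then contains: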

mongodb@mongo-test-mongodb-replicaset-1:/work-dir$ more log.txt
[2018-08-29T14:41:51,684293796+00:00] [on-start.sh] Bootstrapping MongoDB replica set member: mongo-test-mongodb-replicaset-1
[2018-08-29T14:41:51,687394595+00:00] [on-start.sh] Reading standard input...
[2018-08-29T14:41:51,688594499+00:00] [on-start.sh] Generating certificate
[2018-08-29T14:41:51,951181683+00:00] [on-start.sh] Peers: mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local
[2018-08-29T14:41:51,952080311+00:00] [on-start.sh] Starting a MongoDB instance...
[2018-08-29T14:41:51,953075555+00:00] [on-start.sh] Waiting for MongoDB to be ready...
2018-08-29T14:41:52.020+0000 I CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
2018-08-29T14:41:52.045+0000 I CONTROL  [initandlisten] MongoDB starting : pid=30 port=27017 dbpath=/data/db 64-bit host=mongo-test-mongodb-replicaset-1
2018-08-29T14:41:52.045+0000 I CONTROL  [initandlisten] db version v4.0.1
2018-08-29T14:41:52.045+0000 I CONTROL  [initandlisten] git version: 54f1582fc6eb01de4d4c42f26fc133e623f065fb
2018-08-29T14:41:52.045+0000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.2g  1 Mar 2016
2018-08-29T14:41:52.045+0000 I CONTROL  [initandlisten] allocator: tcmalloc
2018-08-29T14:41:52.045+0000 I CONTROL  [initandlisten] modules: none
2018-08-29T14:41:52.045+0000 I CONTROL  [initandlisten] build environment:
2018-08-29T14:41:52.045+0000 I CONTROL  [initandlisten]     distmod: ubuntu1604
2018-08-29T14:41:52.045+0000 I CONTROL  [initandlisten]     distarch: x86_64
2018-08-29T14:41:52.045+0000 I CONTROL  [initandlisten]     target_arch: x86_64
2018-08-29T14:41:52.045+0000 I CONTROL  [initandlisten] options: { config: "/data/configdb/mongod.conf", net: { bindIp: "0.0.0.0", port: 27017, ssl: { CAFile: "/data/configdb/tls.crt", PEMKeyFile: "/work-dir/mongo.pem", mode: "requireSSL" } }, replication: { replSet: "rs0" }, security: { authorization: "enabled", clusterAuthMode: "x509", keyFile: "/data/configdb/key.txt" }, storage: { dbPath: "/data/db" } }
2018-08-29T14:41:52.047+0000 I STORAGE  [initandlisten]
2018-08-29T14:41:52.047+0000 I STORAGE  [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
2018-08-29T14:41:52.047+0000 I STORAGE  [initandlisten] **          See http://dochub.mongodb.org/core/prodnotes-filesystem
2018-08-29T14:41:52.048+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=1337M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
[2018-08-29T14:41:52,083645436+00:00] [on-start.sh] Retrying...
2018-08-29T14:41:52.789+0000 I STORAGE  [initandlisten] WiredTiger message [1535553712:789699][30:0x7fd33c091a00], txn-recover: Set global recovery timestamp: 0
2018-08-29T14:41:52.800+0000 I RECOVERY [initandlisten] WiredTiger recoveryTimestamp. Ts: Timestamp(0, 0)
2018-08-29T14:41:52.819+0000 I STORAGE  [initandlisten] createCollection: local.startup_log with generated UUID: 2a15f25a-5f7b-47d3-b1a3-2338677428d0
2018-08-29T14:41:52.832+0000 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/data/db/diagnostic.data'
2018-08-29T14:41:52.833+0000 I STORAGE  [initandlisten] createCollection: local.replset.oplogTruncateAfterPoint with generated UUID: ae87a21f-d2dc-4474-b4de-d70d95b7a2a8
2018-08-29T14:41:52.847+0000 I STORAGE  [initandlisten] createCollection: local.replset.minvalid with generated UUID: ffaf5c08-356c-4ed0-b4e0-1b3d9cdeea92
2018-08-29T14:41:52.866+0000 I REPL     [initandlisten] Did not find local voted for document at startup.
2018-08-29T14:41:52.866+0000 I REPL     [initandlisten] Did not find local Rollback ID document at startup. Creating one.
2018-08-29T14:41:52.866+0000 I STORAGE  [initandlisten] createCollection: local.system.rollback.id with generated UUID: 6e3e4fc7-b821-4df6-9c32-7db2af4a3bc4
2018-08-29T14:41:52.880+0000 I REPL     [initandlisten] Initialized the rollback ID to 1
2018-08-29T14:41:52.880+0000 I REPL     [initandlisten] Did not find local replica set configuration document at startup;  NoMatchingDocument: Did not find replica set configuration document in local.system.replset
2018-08-29T14:41:52.881+0000 I CONTROL  [LogicalSessionCacheRefresh] Sessions collection is not set up; waiting until next sessions refresh interval: Replication has not yet been configured
2018-08-29T14:41:52.881+0000 I CONTROL  [LogicalSessionCacheReap] Sessions collection is not set up; waiting until next sessions reap interval: Replication has not yet been configured
2018-08-29T14:41:52.881+0000 I NETWORK  [initandlisten] waiting for connections on port 27017 ssl
2018-08-29T14:41:54.148+0000 I NETWORK  [listener] connection accepted from 127.0.0.1:50022 #1 (1 connection now open)
2018-08-29T14:41:54.154+0000 I ACCESS   [conn1] note: no users configured in admin.system.users, allowing localhost access
2018-08-29T14:41:54.154+0000 I NETWORK  [conn1] received client metadata from 127.0.0.1:50022 conn1: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-08-29T14:41:54.157+0000 I ACCESS   [conn1] Unauthorized: not authorized on admin to execute command { endSessions: [ { id: UUID("95760764-878a-4dd0-8ce6-470182c48a3a") } ], $db: "admin" }
2018-08-29T14:41:54.158+0000 I NETWORK  [conn1] end connection 127.0.0.1:50022 (0 connections now open)
[2018-08-29T14:41:54,162283712+00:00] [on-start.sh] Initialized.
[2018-08-29T14:41:54,267062979+00:00] [on-start.sh] Found master: mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local
[2018-08-29T14:41:54,268004950+00:00] [on-start.sh] Adding myself (mongo-test-mongodb-replicaset-1.mongo-test-mongodb-replicaset.b2c.svc.cluster.local) to replica set...
2018-08-29T14:41:54.368+0000 I NETWORK  [listener] connection accepted from 10.43.67.35:50550 #2 (1 connection now open)
2018-08-29T14:41:54.371+0000 I ACCESS   [conn2]  authenticate db: $external { authenticate: 1, mechanism: "MONGODB-X509", user: "CN=mongo-test-mongodb-replicaset-0", $db: "$external" }
2018-08-29T14:41:54.371+0000 I ACCESS   [conn2] Failed to authenticate CN=mongo-test-mongodb-replicaset-0@$external from client 10.43.67.35:50550 with mechanism MONGODB-X509: UserNotFound: Could not find user CN=mongo-test-mongodb-replicaset-0@$external
2018-08-29T14:41:54.372+0000 I NETWORK  [conn2] end connection 10.43.67.35:50550 (0 connections now open)
2018-08-29T14:41:54.375+0000 I NETWORK  [listener] connection accepted from 10.43.67.35:50552 #3 (1 connection now open)
2018-08-29T14:41:54.378+0000 I NETWORK  [conn3] received client metadata from 10.43.67.35:50552 conn3: { driver: { name: "NetworkInterfaceTL", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-08-29T14:41:54.379+0000 I ACCESS   [conn3]  authenticate db: $external { authenticate: 1, mechanism: "MONGODB-X509", user: "CN=mongo-test-mongodb-replicaset-0", $db: "$external" }
2018-08-29T14:41:54.379+0000 I ACCESS   [conn3] Failed to authenticate CN=mongo-test-mongodb-replicaset-0@$external from client 10.43.67.35:50552 with mechanism MONGODB-X509: UserNotFound: Could not find user CN=mongo-test-mongodb-replicaset-0@$external
[2018-08-29T14:41:57,388632734+00:00] [on-start.sh] Waiting for replica to reach SECONDARY state...
2018-08-29T14:41:57.441+0000 I NETWORK  [listener] connection accepted from 127.0.0.1:50032 #4 (2 connections now open)
2018-08-29T14:41:57.446+0000 I NETWORK  [conn4] received client metadata from 127.0.0.1:50032 conn4: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-08-29T14:41:57.448+0000 I ACCESS   [conn4] Supported SASL mechanisms requested for unknown user 'admin@admin'
2018-08-29T14:41:57.448+0000 I ACCESS   [conn4] SASL SCRAM-SHA-1 authentication failed for admin on admin from client 127.0.0.1:50032 ; UserNotFound: Could not find user admin@admin
2018-08-29T14:41:57.450+0000 I ACCESS   [conn4] Unauthorized: not authorized on admin to execute command { endSessions: [ { id: UUID("38098891-2b34-46eb-aef2-53a69416671f") } ], $db: "admin" }
2018-08-29T14:41:57.451+0000 I NETWORK  [conn4] end connection 127.0.0.1:50032 (1 connection now open)
2018-08-29T14:41:58.504+0000 I NETWORK  [listener] connection accepted from 127.0.0.1:50034 #5 (2 connections now open)
2018-08-29T14:41:58.508+0000 I NETWORK  [conn5] received client metadata from 127.0.0.1:50034 conn5: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-08-29T14:41:58.509+0000 I ACCESS   [conn5] Supported SASL mechanisms requested for unknown user 'admin@admin'
2018-08-29T14:41:58.510+0000 I ACCESS   [conn5] SASL SCRAM-SHA-1 authentication failed for admin on admin from client 127.0.0.1:50034 ; UserNotFound: Could not find user admin@admin
2018-08-29T14:41:58.511+0000 I ACCESS   [conn5] Unauthorized: not authorized on admin to execute command { endSessions: [ { id: UUID("e1089881-4c47-450c-9721-6b291c6f0e50") } ], $db: "admin" }
2018-08-29T14:41:58.512+0000 I NETWORK  [conn5] end connection 127.0.0.1:50034 (1 connection now open)
2018-08-29T14:41:59.574+0000 I NETWORK  [listener] connection accepted from 127.0.0.1:50036 #6 (2 connections now open)
2018-08-29T14:41:59.582+0000 I NETWORK  [conn6] received client metadata from 127.0.0.1:50036 conn6: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }

....to be continued...


3 Answers

0 votes

...second part of the OP:

Meanwhile the other, successfully started pod mongo-test-mongodb-replicaset-0 has these entries in its log file:

2018-08-29T14:41:22.973+0000 I CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
2018-08-29T14:41:22.980+0000 I CONTROL  [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=mongo-test-mongodb-replicaset-0
2018-08-29T14:41:22.981+0000 I CONTROL  [initandlisten] db version v4.0.1
2018-08-29T14:41:22.981+0000 I CONTROL  [initandlisten] git version: 54f1582fc6eb01de4d4c42f26fc133e623f065fb
2018-08-29T14:41:22.981+0000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.2g  1 Mar 2016
2018-08-29T14:41:22.981+0000 I CONTROL  [initandlisten] allocator: tcmalloc
2018-08-29T14:41:22.981+0000 I CONTROL  [initandlisten] modules: none
2018-08-29T14:41:22.981+0000 I CONTROL  [initandlisten] build environment:
2018-08-29T14:41:22.981+0000 I CONTROL  [initandlisten]     distmod: ubuntu1604
2018-08-29T14:41:22.981+0000 I CONTROL  [initandlisten]     distarch: x86_64
2018-08-29T14:41:22.981+0000 I CONTROL  [initandlisten]     target_arch: x86_64
2018-08-29T14:41:22.981+0000 I CONTROL  [initandlisten] options: { config: "/data/configdb/mongod.conf", net: { bindIp: "0.0.0.0", port: 27017, ssl: { CAFile: "/data/configdb/tls.crt", PEMKeyFile: "/work-dir/mongo.pem", mode: "requireSSL" } }, replication: { replSet: "rs0" }, security: { authorization: "enabled", clusterAuthMode: "x509", keyFile: "/data/configdb/key.txt" }, storage: { dbPath: "/data/db" } }
2018-08-29T14:41:22.982+0000 I STORAGE  [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2018-08-29T14:41:22.982+0000 I STORAGE  [initandlisten]
2018-08-29T14:41:22.982+0000 I STORAGE  [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
2018-08-29T14:41:22.982+0000 I STORAGE  [initandlisten] **          See http://dochub.mongodb.org/core/prodnotes-filesystem
2018-08-29T14:41:22.982+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=1337M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
2018-08-29T14:41:23.949+0000 I STORAGE  [initandlisten] WiredTiger message [1535553683:949243][1:0x7fee8d75da00], txn-recover: Main recovery loop: starting at 1/49280
2018-08-29T14:41:23.949+0000 I STORAGE  [initandlisten] WiredTiger message [1535553683:949915][1:0x7fee8d75da00], txn-recover: Recovering log 1 through 2
2018-08-29T14:41:24.043+0000 I STORAGE  [initandlisten] WiredTiger message [1535553684:43680][1:0x7fee8d75da00], txn-recover: Recovering log 2 through 2
2018-08-29T14:41:24.119+0000 I STORAGE  [initandlisten] WiredTiger message [1535553684:119960][1:0x7fee8d75da00], txn-recover: Set global recovery timestamp: 5b86b08600000004
2018-08-29T14:41:24.153+0000 I RECOVERY [initandlisten] WiredTiger recoveryTimestamp. Ts: Timestamp(1535553670, 4)
2018-08-29T14:41:24.153+0000 I STORAGE  [initandlisten] Triggering the first stable checkpoint. Initial Data: Timestamp(1535553670, 4) PrevStable: Timestamp(0, 0) CurrStable: Timestamp(1535553670, 4)
2018-08-29T14:41:24.165+0000 I STORAGE  [initandlisten] Starting WiredTigerRecordStoreThread local.oplog.rs
2018-08-29T14:41:24.166+0000 I STORAGE  [initandlisten] The size storer reports that the oplog contains 10 records totaling to 2325 bytes
2018-08-29T14:41:24.166+0000 I STORAGE  [initandlisten] Scanning the oplog to determine where to place markers for truncation
2018-08-29T14:41:24.209+0000 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/data/db/diagnostic.data'
2018-08-29T14:41:24.211+0000 I REPL     [initandlisten] Rollback ID is 1
2018-08-29T14:41:24.211+0000 I REPL     [initandlisten] Recovering from stable timestamp: Timestamp(1535553670, 4) (top of oplog: { ts: Timestamp(1535553670, 4), t: 1 }, appliedThrough: { ts: Timestamp(0, 0), t: -1 }, TruncateAfter: Timestamp(0, 0))
2018-08-29T14:41:24.211+0000 I REPL     [initandlisten] Starting recovery oplog application at the stable timestamp: Timestamp(1535553670, 4)
2018-08-29T14:41:24.212+0000 I REPL     [initandlisten] No oplog entries to apply for recovery. Start point is at the top of the oplog.
2018-08-29T14:41:24.212+0000 I REPL     [replexec-0] New replica set config in use: { _id: "rs0", version: 1, protocolVersion: 1, writeConcernMajorityJournalDefault: true, members: [ { _id: 0, host: "mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, catchUpTimeoutMillis: -1, catchUpTakeoverDelayMillis: 30000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('5b86b0825d695f321fd3c7e6') } }
2018-08-29T14:41:24.213+0000 I REPL     [replexec-0] This node is mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017 in the config
2018-08-29T14:41:24.213+0000 I REPL     [replexec-0] transition to STARTUP2 from STARTUP
2018-08-29T14:41:24.213+0000 I NETWORK  [initandlisten] waiting for connections on port 27017 ssl
2018-08-29T14:41:24.213+0000 I REPL     [replexec-0] Starting replication storage threads
2018-08-29T14:41:24.214+0000 I REPL     [replexec-0] transition to RECOVERING from STARTUP2
2018-08-29T14:41:24.214+0000 I REPL     [replexec-0] Starting replication fetcher thread
2018-08-29T14:41:24.214+0000 I REPL     [replexec-0] Starting replication applier thread
2018-08-29T14:41:24.215+0000 I REPL     [replexec-0] Starting replication reporter thread
2018-08-29T14:41:24.214+0000 I NETWORK  [LogicalSessionCacheRefresh] Starting new replica set monitor for rs0/mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017
2018-08-29T14:41:24.215+0000 I NETWORK  [listener] connection accepted from 10.43.67.35:37744 #2 (1 connection now open)
2018-08-29T14:41:24.216+0000 I REPL     [rsSync-0] Starting oplog application
2018-08-29T14:41:24.216+0000 I REPL     [rsSync-0] transition to SECONDARY from RECOVERING
2018-08-29T14:41:24.216+0000 I REPL     [rsSync-0] conducting a dry run election to see if we could be elected. current term: 1
2018-08-29T14:41:24.216+0000 I REPL     [replexec-0] dry election run succeeded, running for election in term 2
2018-08-29T14:41:24.218+0000 I REPL     [replexec-0] election succeeded, assuming primary role in term 2
2018-08-29T14:41:24.219+0000 I REPL     [replexec-0] transition to PRIMARY from SECONDARY
2018-08-29T14:41:24.219+0000 I REPL     [replexec-0] Entering primary catch-up mode.
2018-08-29T14:41:24.219+0000 I REPL     [replexec-0] Exited primary catch-up mode.
2018-08-29T14:41:24.219+0000 I REPL     [replexec-0] Stopping replication producer
2018-08-29T14:41:24.222+0000 I NETWORK  [conn2] received client metadata from 10.43.67.35:37744 conn2: { driver: { name: "MongoDB Internal Client", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-08-29T14:41:24.222+0000 I NETWORK  [LogicalSessionCacheRefresh] Successfully connected to mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017 (1 connections now open to mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017 with a 5 second timeout)
2018-08-29T14:41:24.222+0000 W NETWORK  [LogicalSessionCacheRefresh] Unable to reach primary for set rs0
2018-08-29T14:41:24.723+0000 W NETWORK  [LogicalSessionCacheReap] Unable to reach primary for set rs0
2018-08-29T14:41:25.224+0000 W NETWORK  [LogicalSessionCacheReap] Unable to reach primary for set rs0
2018-08-29T14:41:25.724+0000 W NETWORK  [LogicalSessionCacheReap] Unable to reach primary for set rs0
2018-08-29T14:41:26.217+0000 I REPL     [rsSync-0] transition to primary complete; database writes are now permitted
2018-08-29T14:41:26.225+0000 I NETWORK  [listener] connection accepted from 10.43.67.35:37748 #4 (2 connections now open)
2018-08-29T14:41:26.226+0000 I NETWORK  [listener] connection accepted from 10.43.67.35:37750 #6 (3 connections now open)
2018-08-29T14:41:26.233+0000 I NETWORK  [conn6] received client metadata from 10.43.67.35:37750 conn6: { driver: { name: "MongoDB Internal Client", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-08-29T14:41:26.233+0000 I NETWORK  [LogicalSessionCacheRefresh] Successfully connected to mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017 (1 connections now open to mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017 with a 0 second timeout)
2018-08-29T14:41:26.233+0000 I NETWORK  [conn4] received client metadata from 10.43.67.35:37748 conn4: { driver: { name: "MongoDB Internal Client", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-08-29T14:41:26.234+0000 I NETWORK  [LogicalSessionCacheReap] Successfully connected to mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017 (2 connections now open to mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017 with a 0 second timeout)
2018-08-29T14:41:26.234+0000 I ACCESS   [conn6]  authenticate db: $external { authenticate: 1, mechanism: "MONGODB-X509", user: "CN=mongo-test-mongodb-replicaset-0", $db: "$external" }
2018-08-29T14:41:26.235+0000 I ACCESS   [conn6] Failed to authenticate CN=mongo-test-mongodb-replicaset-0@$external from client 10.43.67.35:37750 with mechanism MONGODB-X509: UserNotFound: Could not find user CN=mongo-test-mongodb-replicaset-0@$external
2018-08-29T14:41:26.235+0000 I NETWORK  [LogicalSessionCacheRefresh] scoped connection to mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017 not being returned to the pool
2018-08-29T14:41:26.235+0000 I CONTROL  [LogicalSessionCacheRefresh] Sessions collection is not set up; waiting until next sessions refresh interval: Could not find user CN=mongo-test-mongodb-replicaset-0@$external
2018-08-29T14:41:26.235+0000 I ACCESS   [conn4]  authenticate db: $external { authenticate: 1, mechanism: "MONGODB-X509", user: "CN=mongo-test-mongodb-replicaset-0", $db: "$external" }
2018-08-29T14:41:26.236+0000 I ACCESS   [conn4] Failed to authenticate CN=mongo-test-mongodb-replicaset-0@$external from client 10.43.67.35:37748 with mechanism MONGODB-X509: UserNotFound: Could not find user CN=mongo-test-mongodb-replicaset-0@$external
2018-08-29T14:41:26.236+0000 I NETWORK  [LogicalSessionCacheReap] scoped connection to mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017 not being returned to the pool
2018-08-29T14:41:26.236+0000 I CONTROL  [LogicalSessionCacheReap] Sessions collection is not set up; waiting until next sessions reap interval: Could not find user CN=mongo-test-mongodb-replicaset-0@$external
2018-08-29T14:41:26.236+0000 I NETWORK  [conn6] end connection 10.43.67.35:37750 (2 connections now open)
2018-08-29T14:41:26.236+0000 I NETWORK  [conn4] end connection 10.43.67.35:37748 (1 connection now open)
2018-08-29T14:41:30.295+0000 I NETWORK  [listener] connection accepted from 127.0.0.1:52056 #7 (2 connections now open)
2018-08-29T14:41:30.300+0000 I NETWORK  [conn7] received client metadata from 127.0.0.1:52056 conn7: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-08-29T14:41:30.304+0000 I NETWORK  [conn7] end connection 127.0.0.1:52056 (1 connection now open)
2018-08-29T14:41:40.300+0000 I NETWORK  [listener] connection accepted from 127.0.0.1:52072 #8 (2 connections now open)
2018-08-29T14:41:40.304+0000 I NETWORK  [conn8] received client metadata from 127.0.0.1:52072 conn8: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-08-29T14:41:40.308+0000 I NETWORK  [conn8] end connection 127.0.0.1:52072 (1 connection now open)
2018-08-29T14:41:50.293+0000 I NETWORK  [listener] connection accepted from 127.0.0.1:52090 #9 (2 connections now open)
2018-08-29T14:41:50.297+0000 I NETWORK  [conn9] received client metadata from 127.0.0.1:52090 conn9: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-08-29T14:41:50.301+0000 I NETWORK  [conn9] end connection 127.0.0.1:52090 (1 connection now open)
2018-08-29T14:41:54.219+0000 I NETWORK  [listener] connection accepted from 10.43.66.21:49738 #10 (2 connections now open)
2018-08-29T14:41:54.224+0000 I NETWORK  [conn10] received client metadata from 10.43.66.21:49738 conn10: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-08-29T14:41:54.256+0000 I ACCESS   [conn10] Successfully authenticated as principal admin on admin
2018-08-29T14:41:54.261+0000 I NETWORK  [conn10] end connection 10.43.66.21:49738 (1 connection now open)
2018-08-29T14:41:54.325+0000 I NETWORK  [listener] connection accepted from 10.43.66.21:49740 #11 (2 connections now open)
2018-08-29T14:41:54.330+0000 I NETWORK  [conn11] received client metadata from 10.43.66.21:49740 conn11: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-08-29T14:41:54.362+0000 I ACCESS   [conn11] Successfully authenticated as principal admin on admin
2018-08-29T14:41:54.366+0000 I REPL     [conn11] replSetReconfig admin command received from client; new config: { _id: "rs0", version: 2, protocolVersion: 1, writeConcernMajorityJournalDefault: true, members: [ { _id: 0, host: "mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1.0, host: "mongo-test-mongodb-replicaset-1.mongo-test-mongodb-replicaset.b2c.svc.cluster.local" } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, catchUpTimeoutMillis: -1, catchUpTakeoverDelayMillis: 30000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('5b86b0825d695f321fd3c7e6') } }
2018-08-29T14:41:54.372+0000 I NETWORK  [conn11] can't authenticate to mongo-test-mongodb-replicaset-1.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017 as internal user, error: Could not find user CN=mongo-test-mongodb-replicaset-0@$external
2018-08-29T14:41:54.372+0000 I REPL     [conn11] replSetReconfig config object with 2 members parses ok
2018-08-29T14:41:54.372+0000 I ASIO     [Replication] Connecting to mongo-test-mongodb-replicaset-1.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017
2018-08-29T14:41:54.380+0000 W REPL     [replexec-1] Got error (Unauthorized: command replSetHeartbeat requires authentication) response on heartbeat request to mongo-test-mongodb-replicaset-1.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017; { ok: 1.0, hbmsg: "" }
2018-08-29T14:41:54.380+0000 E REPL     [conn11] replSetReconfig failed; NodeNotFound: Quorum check failed because not enough voting nodes responded; required 2 but only the following 1 voting nodes responded: mongo-test-mongodb-replicaset-0.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017; the following nodes did not respond affirmatively: mongo-test-mongodb-replicaset-1.mongo-test-mongodb-replicaset.b2c.svc.cluster.local:27017 failed with command replSetHeartbeat requires authentication
2018-08-29T14:41:54.383+0000 I NETWORK  [conn11] end connection 10.43.66.21:49740 (1 connection now open)
2018-08-29T14:41:58.387+0000 I NETWORK  [listener] connection accepted from 127.0.0.1:52106 #14 (2 connections now open)
2018-08-29T14:41:58.391+0000 I NETWORK  [conn14] received client metadata from 127.0.0.1:52106 conn14: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-08-29T14:41:58.395+0000 I NETWORK  [conn14] end connection 127.0.0.1:52106 (1 connection now open)
2018-08-29T14:42:00.407+0000 I NETWORK  [listener] connection accepted from 127.0.0.1:52118 #15 (2 connections now open)
2018-08-29T14:42:00.416+0000 I NETWORK  [conn15] received client metadata from 127.0.0.1:52118 conn15: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.0.1" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }

From what I understand, everything goes well until the second mongod instance tries to join the replica set: when the primary attempts to authenticate against the new member using the x509 member certificates, authentication fails with UserNotFound, the replSetReconfig quorum check fails, and the whole setup process is stuck.
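
To verify this by hand, one can try the same member authentication from inside one of the pods. A sketch using the certificate and CA paths from the chart's mongod.conf shown above (this should fail with the same UserNotFound error as in the logs):

mongo --ssl --sslCAFile /data/configdb/tls.crt --sslPEMKeyFile /work-dir/mongo.pem \
  --host mongo-test-mongodb-replicaset-1.mongo-test-mongodb-replicaset.b2c.svc.cluster.local --port 27017 \
  --authenticationDatabase '$external' --authenticationMechanism MONGODB-X509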

0 votes

Just in case someone with the same problem stumbles in here: the root cause is the way the x509 certificates are generated by the Helm chart. The certificate subject contains too little information to satisfy MongoDB's requirements for internal membership authentication: member certificates must carry at least one attribute besides the CN (Organization O, Organizational Unit OU, or Domain Component DC), and those non-CN attributes must be identical across all members. Since the chart issues certificates with a bare CN, they cannot be used for replica set member authentication, which leads to the observed UserNotFound errors and stalls the whole setup process.
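
To illustrate what a compliant member certificate could look like: the subject needs a shared non-CN attribute (the OU value below is only an example), signed by the same CA the chart uses. A rough openssl sketch:

# Sketch: issue a member cert whose subject has a shared OU in addition to the per-member CN
openssl genrsa -out mongo.key 2048
openssl req -new -key mongo.key -out mongo.csr \
  -subj "/OU=mongodb-replicaset/CN=mongo-test-mongodb-replicaset-1.mongo-test-mongodb-replicaset.b2c.svc.cluster.local"
openssl x509 -req -in mongo.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out mongo.crt -days 3650
cat mongo.key mongo.crt > mongo.pem    # the chart's mongod.conf expects the combined PEM at /work-dir/mongo.pem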

The good news is that there is a workaround, described over on GitHub: https://github.com/helm/charts/issues/7417#issuecomment-422293057

This still has to be fixed in the Helm chart itself.

0 votes

Configure the init container by setting volumePermissions.enabled to true.
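
Assuming the chart version in use actually exposes this value (not every release of stable/mongodb-replicaset does), that would look like:

helm upgrade mongo-test stable/mongodb-replicaset -f values.yaml \
  --set volumePermissions.enabled=true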