2
votes

Argo 2.11.7: we found some strange behaviour. Some of the workflows in a namespace are getting stuck in the Running state. After looking closely, we could see that the problem is with the wait container. What could be the reason the wait container is stuck in the Running state?

Logs of the wait container:

```
time="2020-12-04T18:30:38.735Z" level=info msg="Executor (version: v2.11.7, build_date: ) initialized "
time="2020-12-04T18:30:38.735Z" level=info msg="Waiting on main container"
time="2020-12-04T18:30:39.436Z" level=info msg="main container started with container ID: bf0d2d493aeaaa4761591aaf8c2c06077804f968a796f693227fa1a619ab438b"
time="2020-12-04T18:30:39.436Z" level=info msg="Starting annotations monitor"
time="2020-12-04T18:30:39.533Z" level=info msg="Starting deadline monitor"
time="2020-12-04T18:30:39.533Z" level=info msg="docker wait bf0d2d493aeaaa4761591aaf8c2c06077804f968a796f693227fa1a619ab438b"
time="2020-12-04T18:30:41.933Z" level=info msg="Main container completed"
time="2020-12-04T18:30:41.933Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2020-12-04T18:30:41.933Z" level=info msg="Capturing script exit code"
time="2020-12-04T18:30:41.933Z" level=info msg="Annotations monitor stopped"
time="2020-12-04T18:30:42.534Z" level=info msg="Deadline monitor stopped"
time="2020-12-04T18:30:44.033Z" level=info msg="No output parameters"
time="2020-12-04T18:30:44.033Z" level=info msg="No output artifacts"
time="2020-12-04T18:30:44.033Z" level=info msg="Annotating pod with output"
time="2020-12-04T18:30:44.134Z" level=info msg="Killing sidecars"
time="2020-12-04T18:30:44.233Z" level=info msg="Alloc=6233 TotalAlloc=15078 Sys=71616 NumGC=5 Goroutines=8"
```

wait:
   Container ID:  docker://200b5238fb0f4d015221d0bac7aa2ea79a306d15b2b6ebce247912e38f3eff9d
   Image:         argoproj/argoexec:v2.11.7
   Image ID:      docker-pullable:argoproj/argoexec@sha256:e792274397031569690eb420a6c136d357126640ba535eee553fc4bf82562599
   Port:          <none>
   Host Port:     <none>
   Command:
     argoexec
     wait
   State:          Running
     Started:      
   Ready:          True
   Restart Count:  0


[`enter image description here`][2]


 [2]: https://i.stack.imgur.com/3XoPZ.png

1 Answer

0
votes

It looks like the wait container completed its work according to the log: 'time="2020-12-04T18:30:44.134Z" level=info msg="Killing sidecars"' is the last step logged by the wait container.
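
If you want to run the same check across several stuck pods, here is a minimal sketch in Python. It assumes the wait container's logs have already been saved as text, and that every line follows the `time="..." level=... msg="..."` layout shown in the question. The names `parse_executor_log` and `reached_message` are hypothetical helpers written for this illustration; they are not part of Argo.

```python
import re

# Matches log lines shaped like the ones in the question:
#   time="..." level=info msg="..."
LINE_RE = re.compile(
    r'time="(?P<time>[^"]+)"\s+level=(?P<level>\w+)\s+msg="(?P<msg>.*)"'
)


def parse_executor_log(text):
    """Return (time, level, msg) tuples for every line that matches LINE_RE."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            entries.append(
                (match.group("time"), match.group("level"), match.group("msg"))
            )
    return entries


def reached_message(entries, needle):
    """True if any parsed message contains the given text."""
    return any(needle in msg for _, _, msg in entries)


# Last three lines of the log from the question, used as sample input.
sample = '''time="2020-12-04T18:30:44.033Z" level=info msg="Annotating pod with output"
time="2020-12-04T18:30:44.134Z" level=info msg="Killing sidecars"
time="2020-12-04T18:30:44.233Z" level=info msg="Alloc=6233 TotalAlloc=15078 Sys=71616 NumGC=5 Goroutines=8"'''

entries = parse_executor_log(sample)
print(reached_message(entries, "Killing sidecars"))  # did the executor get this far?
print(entries[-1][2])  # message of the final parsed line
```

A pod whose wait container logged "Killing sidecars" but is still reported as Running matches the symptom in the question, which is worth stating in the issue.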

Can you create a GitHub issue with the logs and the output of kubectl describe pod <>? https://github.com/argoproj/argo/issues