- Bug
- Resolution: Unresolved
- Major
With the autoscaling feature enabled, we noticed that OpenShift ingress replicas sometimes stick in the Terminating state. The pods remain in this state until terminationGracePeriodSeconds expires (1h). It is not clear whether this is a new problem or one that had previously gone unnoticed.
The pattern we see looks like this:
oc get pods
NAME                                     READY   STATUS        RESTARTS   AGE
router-default-695f856d8-dmx7g           1/1     Running       0          158d
router-default-695f856d8-xlkzg           1/1     Running       0          158d
router-kas-694dd4695c-44zzh              1/1     Running       0          17m
router-kas-694dd4695c-4gg7j              1/1     Terminating   0          36m
router-kas-694dd4695c-bx8wh              1/1     Running       0          36m
router-kas-6c56f46db4-72klh              1/1     Terminating   0          36m
router-kas-6c56f46db4-q7kdm              1/1     Terminating   0          36m
router-kas-us-east-1a-6db4684c88-7zs9q   1/1     Running       0          36m
router-kas-us-east-1a-8487dc944c-l6jnj   1/1     Terminating   0          37m
router-kas-us-east-1b-5597f8669c-t8f4s   1/1     Running       0          36m
router-kas-us-east-1b-f7d89fc4b-47wjs    1/1     Terminating   0          37m
router-kas-us-east-1c-77bc49c6fd-7prz8   1/1     Running       0          17m
router-kas-us-east-1c-848d5dfcc7-cn4p8   1/1     Terminating   0          37m
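To track which replicas are stuck without scanning the listing by eye, the output of `oc get pods -o json` could be filtered along these lines. This is only a sketch: the function name `terminating_pods` is hypothetical, and it assumes that a pod shown as Terminating carries a `metadata.deletionTimestamp` in the usual `YYYY-MM-DDTHH:MM:SSZ` form, and that this timestamp marks the end of the grace period rather than the moment deletion was requested.

```python
from datetime import datetime, timezone


def terminating_pods(pod_list: dict, now: datetime) -> list[tuple[str, float]]:
    """Return (pod name, seconds left until the grace deadline) for pods being deleted.

    `pod_list` is the parsed JSON from `oc get pods -o json`.
    Assumption: metadata.deletionTimestamp is the time at which the pod is
    expected to be removed, so a negative remainder means the pod has
    outlived its grace period.
    """
    result = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        stamp = meta.get("deletionTimestamp")
        if stamp is None:
            # No deletion requested: the pod is not in Terminating.
            continue
        deadline = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(
            tzinfo=timezone.utc
        )
        result.append((meta["name"], (deadline - now).total_seconds()))
    return result
```

One way to use it would be to feed it `json.load(sys.stdin)` from `oc get pods -o json` together with `datetime.now(timezone.utc)`, and to record the result over time to see whether pods only disappear once the remainder reaches zero.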
The log of one of the pods stuck in Terminating doesn't show anything interesting.
oc logs -f router-kas-694dd4695c-4gg7j
I1121 09:33:34.179282 1 template.go:437] router "msg"="starting router" "version"="majorFromGit: \nminorFromGit: \ncommitFromGit: 11109e4028b69749d6f842a4da682916e0d91d2f\nversionFromGit: 4.0.0-370-g11109e40\ngitTreeState: clean\nbuildDate: 2022-05-12T09:54:12Z\n"
I1121 09:33:34.180763 1 metrics.go:156] metrics "msg"="router health and metrics port listening on HTTP and HTTPS" "address"="0.0.0.0:1936"
I1121 09:33:34.185939 1 router.go:191] template "msg"="creating a new template router" "writeDir"="/var/lib/haproxy"
I1121 09:33:34.185993 1 router.go:273] template "msg"="router will coalesce reloads within an interval of each other" "interval"="1m0s"
I1121 09:33:34.186280 1 router.go:343] template "msg"="watching for changes" "path"="/etc/pki/tls/private"
I1121 09:33:34.186355 1 router.go:262] router "msg"="router is including routes in all namespaces"
RTS and SREP were unable to rsh into the container, although it is unclear whether this was a permissions issue or something else. We haven't checked the state of the host process.
This issue doesn't appear to be service impacting.
- is related to: MGDSTRM-9181 Ingress disconnects established connections whenever kafka instances are provisioned/deprovisioned (Closed)