Handle graceful shutdown/delete of neutron pods
Epic
Resolution: Unresolved
Major
To Do
Neutron QE 2024Q2
Networking; Neutron
This ticket was split from another Jira ticket [1]. That ticket was made global because it affects various services, but its resolution was specific to the use of dumb-init, which does not apply to neutron pods.
When the neutron-api pod is deleted, it stays in the Terminating state for the full duration set by terminationGracePeriodSeconds (30 seconds by default). Changing this value does not help: the deletion always takes the entire time given by terminationGracePeriodSeconds (or --grace-period), for example:
$ time oc delete pod neutron-5fc78456cf-2dl2m --grace-period=100
pod "neutron-5fc78456cf-2dl2m" deleted
real 1m41.416s
user 0m0.225s
sys 0m0.055s
The pod runs two containers, neutron-httpd and neutron-api. The neutron-api container is not stopping, for a reason not yet identified.
The SIGTERM signal is received by the neutron-api process and forwarded to its child processes (api, rpc, maintenance, etc.). The api processes do not stop, so the pod is not killed until the grace period expires [1]. A possibly related old issue: https://bugs.launchpad.net/neutron/+bug/1815871
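The symptom above (deletion always taking the full grace period) is what one would expect whenever a process does not exit on SIGTERM and is only removed by the SIGKILL that follows. The sketch below illustrates that pattern in isolation. It is not Neutron or kubelet code: `STUBBORN_CHILD`, `POLITE_CHILD` and `stop_with_grace` are hypothetical names, and the stubborn child simply ignores SIGTERM as a stand-in for a worker that fails to stop for whatever reason.

```python
import signal
import subprocess
import sys
import time

# Hypothetical worker that ignores SIGTERM (stand-in for a process that never stops).
STUBBORN_CHILD = (
    "import signal, time;"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN);"
    "print('ready', flush=True);"
    "time.sleep(60)"
)

# Hypothetical worker that keeps the default SIGTERM behaviour and exits at once.
POLITE_CHILD = "import time; print('ready', flush=True); time.sleep(60)"


def stop_with_grace(child_code, grace_period):
    """Start a child, send SIGTERM, wait up to grace_period seconds, then SIGKILL.

    Returns (seconds_elapsed, killed), where killed is True if SIGKILL was needed.
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code], stdout=subprocess.PIPE
    )
    proc.stdout.readline()  # wait until the child has set up its signal handling
    start = time.monotonic()
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=grace_period)
        killed = False
    except subprocess.TimeoutExpired:
        # Grace period exhausted: fall back to SIGKILL.
        proc.kill()
        proc.wait()
        killed = True
    proc.stdout.close()
    return time.monotonic() - start, killed


if __name__ == "__main__":
    for name, code in (("polite", POLITE_CHILD), ("stubborn", STUBBORN_CHILD)):
        elapsed, killed = stop_with_grace(code, grace_period=1.0)
        print(f"{name}: elapsed={elapsed:.2f}s killed={killed}")
```

In this sketch the polite child exits well before the grace period, while the stubborn child consumes all of it regardless of the value chosen, which mirrors the timing seen with `oc delete pod ... --grace-period=100`.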
The issue can be seen during scale-down, config rollovers, etc., but because the default timeout is 30 seconds it may go unnoticed.