Description of problem:
When an OLM catalog source pod is scheduled on a node and that node is later drained, the drain fails with the following error:

There are pending nodes to be drained:
 worker-075
error: cannot delete Pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet (use --force to override): openshift-marketplace/webscale-operators-7jdpv

(original report closed)
https://github.com/operator-framework/operator-lifecycle-manager/issues/1514
(closed report because duplicate)
https://github.com/operator-framework/operator-lifecycle-manager/pull/2814

Based on the report history, this issue has been swept under the rug for the past 2-3 years.
https://github.com/operator-framework/operator-lifecycle-manager/issues/2709

The bug we are encountering, which is documented in those GitHub issues, is that OLM's catalog source pods are not managed by any of the following resources: ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet. The pods are created standalone. This could be addressed in a future release of operator-lifecycle-manager. If it were, nodes would drain without requiring --force, which is the expected and desired behavior.
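To make the failure condition concrete, here is a minimal Python sketch of the check that the drain error message implies: a running pod is treated as unmanaged when none of its ownerReferences is marked 'controller: true'. This is an approximation for illustration only, not the actual drain implementation. The function names and the sample pod dictionaries are hypothetical, and the CatalogSource ownerReference shown is an assumption based on the title of the linked OCPBUGS-7431.

```python
from typing import Any, Dict, Optional


def controller_of(pod: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the ownerReference marked 'controller: true', or None."""
    refs = pod.get("metadata", {}).get("ownerReferences") or []
    for ref in refs:
        if ref.get("controller") is True:
            return ref
    return None


def blocks_drain(pod: Dict[str, Any]) -> bool:
    """Approximate whether a drain without --force would refuse this pod.

    Assumption: pods that have already finished are not a problem, and any
    other pod needs a controlling owner to be deleted without --force.
    """
    phase = pod.get("status", {}).get("phase")
    if phase in ("Succeeded", "Failed"):
        return False
    return controller_of(pod) is None


# Hypothetical pod shaped like the one in this report: it has an owner,
# but that owner is not flagged as the controller.
catalog_pod = {
    "metadata": {
        "namespace": "openshift-marketplace",
        "name": "webscale-operators-7jdpv",
        "ownerReferences": [
            {"kind": "CatalogSource", "name": "webscale-operators", "controller": False}
        ],
    },
    "status": {"phase": "Running"},
}

# Hypothetical pod owned by a ReplicaSet, for comparison.
replicaset_pod = {
    "metadata": {
        "name": "example-5d9c7b",
        "ownerReferences": [
            {"kind": "ReplicaSet", "name": "example", "controller": True}
        ],
    },
    "status": {"phase": "Running"},
}

print(blocks_drain(catalog_pod))
print(blocks_drain(replicaset_pod))
```

The same function could be applied to the parsed output of 'oc get pods -o json' for a node to list the pods that would need --force before a drain is attempted.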
Version-Release number of selected component (if applicable):
How reproducible:
Very
Steps to Reproduce:
1. Pod 'openshift-marketplace/webscale-operators-7jdpv' lives on a worker
2. Worker starts to drain in anticipation of a reboot but gets stuck draining
3. User has to manually force drain the worker
   $ oc adm drain <node> --force --grace-period=0 --ignore-daemonsets --delete-emptydir-data --disable-eviction
Actual results:
Node gets stuck waiting to drain
Expected results:
Node drains successfully and reboots
Additional info:
- duplicates OCPBUGS-7431 openshift-marketplace pods with no 'controller: true' ownerReferences (Closed)