Bug | Resolution: Done | Normal | 4.11.0 | 3 | Sprint 227, Sprint 228, Sprint 229, Sprint 230 | 4 | Rejected | False | Bug Fix
Description of problem:
After enabling FIPS on s390x, the ingress controller repeatedly goes into a degraded state. The ingress controller pods do reach the Running state after a few failures, but the operator keeps recreating pods and the ingress cluster operator status stays Degraded.
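For reference, a minimal sketch of the commands used to confirm the symptom; the resource name default and the namespaces openshift-ingress-operator / openshift-ingress are the standard ones, adjust if your cluster differs:
# Check whether the ingress cluster operator reports Degraded=True
oc get co ingress
# Inspect the Degraded condition of the default IngressController
oc get ingresscontroller default -n openshift-ingress-operator \
  -o jsonpath='{.status.conditions[?(@.type=="Degraded")]}{"\n"}'
# Watch router pods being recreated in the openshift-ingress namespace
oc get pods -n openshift-ingress -w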
Version-Release number of selected component (if applicable):
OCP Version: 4.11.0-rc.2
How reproducible:
Set fips: true in the install-config.yaml file.
Steps to Reproduce:
1. Set fips: true in the install-config.yaml file before the installation (see the sketch below).
2. Install the cluster.
3. Run oc get co and check the ingress cluster operator.
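A minimal sketch of the relevant install-config.yaml excerpt, assuming the standard installer layout; only the fips field matters here, and all other required fields (baseDomain, pullSecret, sshKey, platform, and so on) are omitted:
apiVersion: v1
# ... other required install-config fields omitted ...
fips: true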
Actual results:
oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.11.0-rc.2 True False False 7h29m
baremetal 4.11.0-rc.2 True False False 4d12h
cloud-controller-manager 4.11.0-rc.2 True False False 4d12h
cloud-credential 4.11.0-rc.2 True False False 4d12h
cluster-autoscaler 4.11.0-rc.2 True False False 4d12h
config-operator 4.11.0-rc.2 True False False 4d12h
console 4.11.0-rc.2 True False False 4d11h
csi-snapshot-controller 4.11.0-rc.2 True False False 4d12h
dns 4.11.0-rc.2 True False False 4d12h
etcd 4.11.0-rc.2 True False False 4d11h
image-registry 4.11.0-rc.2 True False False 4d11h
ingress 4.11.0-rc.2 True False True 4d11h The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: PodsScheduled=False (PodsNotScheduled: Some pods are not scheduled: Pod "router-default-84689cdc5f-r87hs" is not yet scheduled: SchedulerError: binding rejected: running Bind plugin "DefaultBinder": Operation cannot be fulfilled on pods/binding "router-default-84689cdc5f-r87hs": pod router-default-84689cdc5f-r87hs is already assigned to node "worker-0.ocp-m1317001.lnxero1.boe" Pod "router-default-84689cdc5f-8z2fh" is not yet scheduled: SchedulerError: binding rejected: running Bind plugin "DefaultBinder": Operation cannot be fulfilled on pods/binding "router-default-84689cdc5f-8z2fh": pod router-default-84689cdc5f-8z2fh is already assigned to node "worker-0.ocp-m1317001.lnxero1.boe" Pod "router-default-84689cdc5f-s7z96" is not yet scheduled: SchedulerError: binding rejected: running Bind plugin "DefaultBinder": Operation cannot be fulfilled on pods/binding "router-default-84689cdc5f-s7z96": pod router-default-84689cdc5f-s7z96 is already assigned to node "worker-0.ocp-m1317001.lnxero1.boe" Pod "router-default-84689cdc5f-hslhn" is not yet scheduled: SchedulerError: binding rejected: running Bind plugin "DefaultBinder": Operation cannot be fulfilled on pods/binding "router-default-84689cdc5f-hslhn": pod router-default-84689cdc5f-hslhn is already assigned to node "worker-0.ocp-m1317001.lnxero1.boe" Pod "router-default-84689cdc5f-nf9vt" is not yet scheduled: SchedulerError: binding rejected: running Bind plugin "DefaultBinder": Operation cannot be fulfilled on pods/binding "router-default-84689cdc5f-nf9vt": pod router-default-84689cdc5f-nf9vt is already assigned to node "worker-0.ocp-m1317001.lnxero1.boe" Pod "router-default-84689cdc5f-mslzf" is not yet scheduled: SchedulerError: binding rejected: running Bind plugin "DefaultBinder": Operation cannot be fulfilled on pods/binding "router-default-84689cdc5f-mslzf": pod router-default-84689cdc5f-mslzf is already assigned to node "worker-0.ocp-m1317001.lnxero1.boe" Pod "router-default-84689cdc5f-mc8th" is not yet scheduled: SchedulerError: binding rejected: running Bind plugin "DefaultBinder": Operation cannot be fulfilled on pods/binding "router-default-84689cdc5f-mc8th": pod router-default-84689cdc5f-mc8th is already assigned to node "worker-0.ocp-m1317001.lnxero1.boe")
insights 4.11.0-rc.2 True False False 4d12h
kube-apiserver 4.11.0-rc.2 True False False 4d11h
kube-controller-manager 4.11.0-rc.2 True False False 4d12h
kube-scheduler 4.11.0-rc.2 True False False 4d12h
kube-storage-version-migrator 4.11.0-rc.2 True False False 4d11h
machine-api 4.11.0-rc.2 True False False 4d12h
machine-approver 4.11.0-rc.2 True False False 4d12h
machine-config 4.11.0-rc.2 True False False 4d12h
marketplace 4.11.0-rc.2 True False False 4d12h
monitoring 4.11.0-rc.2 True False False 4d11h
network 4.11.0-rc.2 True False False 4d12h
node-tuning 4.11.0-rc.2 True False False 4d11h
openshift-apiserver 4.11.0-rc.2 True False False 4d11h
openshift-controller-manager 4.11.0-rc.2 True False False 4d12h
openshift-samples 4.11.0-rc.2 True False False 4d11h
operator-lifecycle-manager 4.11.0-rc.2 True False False 4d12h
operator-lifecycle-manager-catalog 4.11.0-rc.2 True False False 4d12h
operator-lifecycle-manager-packageserver 4.11.0-rc.2 True False False 4d11h
service-ca 4.11.0-rc.2 True False False 4d12h
storage 4.11.0-rc.2 True False False 4d12h
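The Degraded message for the ingress operator in the output above is a single long line; pulling the condition directly makes it easier to read. A sketch, assuming standard cluster operator conditions:
# Print only the Degraded condition message of the ingress cluster operator
oc get co ingress -o jsonpath='{.status.conditions[?(@.type=="Degraded")].message}{"\n"}'
# List each cluster operator together with its Degraded status
oc get co -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Degraded")].status}{"\n"}{end}' | grep -w True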
Expected results:
oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.11.0-rc.2 True False False 9d
baremetal 4.11.0-rc.2 True False False 13d
cloud-controller-manager 4.11.0-rc.2 True False False 13d
cloud-credential 4.11.0-rc.2 True False False 13d
cluster-autoscaler 4.11.0-rc.2 True False False 13d
config-operator 4.11.0-rc.2 True False False 13d
console 4.11.0-rc.2 True False False 13d
csi-snapshot-controller 4.11.0-rc.2 True False False 13d
dns 4.11.0-rc.2 True False False 13d
etcd 4.11.0-rc.2 True False False 13d
image-registry 4.11.0-rc.2 True False False 13d
ingress 4.11.0-rc.2 True False False 13d
insights 4.11.0-rc.2 True False False 13d
kube-apiserver 4.11.0-rc.2 True False False 13d
kube-controller-manager 4.11.0-rc.2 True False False 13d
kube-scheduler 4.11.0-rc.2 True False False 13d
kube-storage-version-migrator 4.11.0-rc.2 True False False 13d
machine-api 4.11.0-rc.2 True False False 13d
machine-approver 4.11.0-rc.2 True False False 13d
machine-config 4.11.0-rc.2 True False False 13d
marketplace 4.11.0-rc.2 True False False 13d
monitoring 4.11.0-rc.2 True False False 13d
network 4.11.0-rc.2 True False False 13d
node-tuning 4.11.0-rc.2 True False False 13d
openshift-apiserver 4.11.0-rc.2 True False False 13d
openshift-controller-manager 4.11.0-rc.2 True False False 13d
openshift-samples 4.11.0-rc.2 True False False 13d
operator-lifecycle-manager 4.11.0-rc.2 True False False 13d
operator-lifecycle-manager-catalog 4.11.0-rc.2 True False False 13d
operator-lifecycle-manager-packageserver 4.11.0-rc.2 True False False 13d
service-ca 4.11.0-rc.2 True False False 13d
storage 4.11.0-rc.2 True False False 13d
Additional info:
Attached the logs of the running ingress controller pods.
Failed ingress controller pods are repeatedly being created in the openshift-ingress namespace.
It looks like two ingress controller pods are in the Running state, but the other failed pods are not cleaned up; manually deleting the failed pods (see the cleanup sketch after the listing below) fixed the issue.
oc get pods -n openshift-ingress | wc -l
451
oc get pods -n openshift-ingress | grep Running
router-default-84689cdc5f-9j44t 1/1 Running 4 (4d12h ago) 4d12h
router-default-84689cdc5f-qn4gh 1/1 Running 3 (4d12h ago) 4d12h
oc get pods -n openshift-ingress | grep -v Running | wc -l
449
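A minimal cleanup sketch for the workaround above, assuming the stuck router pods report phase Failed; check the actual phases first before deleting anything:
# Count router pods per phase to see what needs cleaning up
oc get pods -n openshift-ingress -o jsonpath='{range .items[*]}{.status.phase}{"\n"}{end}' | sort | uniq -c
# Delete only the pods in the Failed phase; the ingress operator
# keeps the healthy Running replicas and recreates anything missing
oc delete pods -n openshift-ingress --field-selector=status.phase=Failed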