Bug
Resolution: Unresolved
4.19.0
Quality / Stability / Reliability
Description of problem:
After expanding the cluster network CIDR on a cluster where the externalIP feature is configured, the kube-apiserver operator becomes degraded with "ConfigObservationDegraded: invalid CIDR address: 10.0.15.93".
Version-Release number of selected component (if applicable):
4.19.0-0.nightly-2025-05-06-051838
How reproducible:
Steps to Reproduce:
1. Installed a cluster with "spec":{ "clusterNetwork": [ {"cidr":"10.128.0.0/20","hostPrefix":23}] }
2. Set up EIP/EFW/NP/Service UDN/UDN LB/externalIP features before upgrade.
3. Expanded cluster network by
$ oc patch Network.config.openshift.io cluster --type='merge' --patch '{ "spec":{ "clusterNetwork": [
{"cidr":"10.128.0.0/19","hostPrefix":23}], "networkType": "OVNKubernetes" }}'
network.config.openshift.io/cluster patched
4. kube-apiserver became degraded, complaining about "invalid CIDR address: 10.0.15.93":
$ oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.19.0-0.nightly-2025-05-06-051838 True False False 2s
baremetal 4.19.0-0.nightly-2025-05-06-051838 True False False 4h49m
cloud-controller-manager 4.19.0-0.nightly-2025-05-06-051838 True False False 4h52m
cloud-credential 4.19.0-0.nightly-2025-05-06-051838 True False False 4h52m
cluster-autoscaler 4.19.0-0.nightly-2025-05-06-051838 True False False 4h49m
config-operator 4.19.0-0.nightly-2025-05-06-051838 True False False 4h50m
console 4.19.0-0.nightly-2025-05-06-051838 True False False 101s
control-plane-machine-set 4.19.0-0.nightly-2025-05-06-051838 True False False 4h47m
csi-snapshot-controller 4.19.0-0.nightly-2025-05-06-051838 True False False 4h49m
dns 4.19.0-0.nightly-2025-05-06-051838 True False False 4h49m
etcd 4.19.0-0.nightly-2025-05-06-051838 True False False 4h48m
image-registry 4.19.0-0.nightly-2025-05-06-051838 True False False 4m47s
ingress 4.19.0-0.nightly-2025-05-06-051838 True False False 4m2s
insights 4.19.0-0.nightly-2025-05-06-051838 True False False 4h49m
kube-apiserver 4.19.0-0.nightly-2025-05-06-051838 True False True 4h44m ConfigObservationDegraded: invalid CIDR address: 10.0.15.93
kube-controller-manager 4.19.0-0.nightly-2025-05-06-051838 True False False 4h45m
kube-scheduler 4.19.0-0.nightly-2025-05-06-051838 True False False 4h47m
kube-storage-version-migrator 4.19.0-0.nightly-2025-05-06-051838 True False False 4h50m
machine-api 4.19.0-0.nightly-2025-05-06-051838 True False False 4h44m
machine-approver 4.19.0-0.nightly-2025-05-06-051838 True False False 4h49m
machine-config 4.19.0-0.nightly-2025-05-06-051838 True False False 4h47m
marketplace 4.19.0-0.nightly-2025-05-06-051838 True False False 4h49m
monitoring 4.19.0-0.nightly-2025-05-06-051838 True False False 4h37m
network 4.19.0-0.nightly-2025-05-06-051838 True False False 4h51m
node-tuning 4.19.0-0.nightly-2025-05-06-051838 True False False 3m14s
olm 4.19.0-0.nightly-2025-05-06-051838 True False False 4h49m
openshift-apiserver 4.19.0-0.nightly-2025-05-06-051838 True False False 4h39m
openshift-controller-manager 4.19.0-0.nightly-2025-05-06-051838 True False False 4h41m
openshift-samples 4.19.0-0.nightly-2025-05-06-051838 True False False 4h39m
operator-lifecycle-manager 4.19.0-0.nightly-2025-05-06-051838 True False False 4h49m
operator-lifecycle-manager-catalog 4.19.0-0.nightly-2025-05-06-051838 True False False 4h49m
operator-lifecycle-manager-packageserver 4.19.0-0.nightly-2025-05-06-051838 True False False 4h39m
service-ca 4.19.0-0.nightly-2025-05-06-051838 True False False 4h50m
storage 4.19.0-0.nightly-2025-05-06-051838 True False False 4h49m
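As a convenience (this command is added here and is not part of the original report), the full degraded message can be pulled straight from the kube-apiserver ClusterOperator:
$ oc get clusteroperator kube-apiserver -o jsonpath='{.status.conditions[?(@.type=="Degraded")].message}{"\n"}'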
The following errors are seen in the kube-apiserver-operator logs:
$ oc -n openshift-kube-apiserver-operator logs kube-apiserver-operator-789b654f94-n5z27 | grep invalid
.......
E0507 11:18:32.779771 1 base_controller.go:279] "Unhandled Error" err="CertRotationController reconciliation failed: KubeAPIServer.operator.openshift.io \"cluster\" is invalid: status.nodeStatuses[2].currentRevision: Invalid value: \"object\": cannot be unset once set"
E0507 11:18:33.980378 1 base_controller.go:279] "Unhandled Error" err="CertRotationController reconciliation failed: KubeAPIServer.operator.openshift.io \"cluster\" is invalid: status.nodeStatuses[2].currentRevision: Invalid value: \"object\": cannot be unset once set"
E0507 11:18:34.179310 1 base_controller.go:279] "Unhandled Error" err="CertRotationController reconciliation failed: KubeAPIServer.operator.openshift.io \"cluster\" is invalid: status.nodeStatuses[2].currentRevision: Invalid value: \"object\": cannot be unset once set"
E0507 11:18:34.380412 1 base_controller.go:279] "Unhandled Error" err="CertRotationController reconciliation failed: KubeAPIServer.operator.openshift.io \"cluster\" is invalid: status.nodeStatuses[2].currentRevision: Invalid value: \"object\": cannot be unset once set"
E0507 11:18:34.579125 1 base_controller.go:279] "Unhandled Error" err="CertRotationController reconciliation failed: KubeAPIServer.operator.openshift.io \"cluster\" is invalid: status.nodeStatuses[2].currentRevision: Invalid value: \"object\": cannot be unset once set"
E0507 11:18:37.074422 1 base_controller.go:279] "Unhandled Error" err="TargetConfigController reconciliation failed: KubeAPIServer.operator.openshift.io \"cluster\" is invalid: status.nodeStatuses[2].currentRevision: Invalid value: \"object\": cannot be unset once set"
I0507 14:34:50.401242 1 event.go:377] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-kube-apiserver-operator", Name:"kube-apiserver-operator", UID:"0d105bbe-f7d2-49ef-8b41-5614120a740c", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'GetExternalIPPolicyFailed' error parsing networks.config.openshift.io/cluster Spec.ExternalIP.Policy.AllowedCIDRs: invalid cidr: invalid CIDR address: 10.0.15.93
E0507 14:34:50.431632 1 base_controller.go:279] "Unhandled Error" err="ConfigObserver reconciliation failed: invalid CIDR address: 10.0.15.93"
I0507 14:34:50.435764 1 event.go:377] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-kube-apiserver-operator", Name:"kube-apiserver-operator", UID:"0d105bbe-f7d2-49ef-8b41-5614120a740c", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'GetExternalIPPolicyFailed' error parsing networks.config.openshift.io/cluster Spec.ExternalIP.Policy.AllowedCIDRs: invalid cidr: invalid CIDR address: 10.0.15.93
I0507 14:34:50.436426 1 status_controller.go:229] clusteroperator/kube-apiserver diff {"status":{"conditions":[
,{"lastTransitionTime":"2025-05-07T11:34:35Z","message":"NodeInstallerProgressing: 3 nodes are at revision 6","reason":"AsExpected","status":"False","type":"Progressing"},{"lastTransitionTime":"2025-05-07T11:14:42Z","message":"StaticPodsAvailable: 3 nodes are active; 3 nodes are at revision 6","reason":"AsExpected","status":"True","type":"Available"},{"lastTransitionTime":"2025-05-07T11:09:08Z","message":"KubeletMinorVersionUpgradeable: Kubelet and API server minor versions are synced.","reason":"AsExpected","status":"True","type":"Upgradeable"},{"lastTransitionTime":"2025-05-07T11:09:39Z","message":"All is well","reason":"AsExpected","status":"False","type":"EvaluationConditionsDetected"}]}}
I0507 14:34:50.499557 1 event.go:377] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-kube-apiserver-operator", Name:"kube-apiserver-operator", UID:"0d105bbe-f7d2-49ef-8b41-5614120a740c", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/kube-apiserver changed: Degraded message changed from "NodeControllerDegraded: All master nodes are ready" to "NodeControllerDegraded: All master nodes are ready\nConfigObservationDegraded: invalid CIDR address: 10.0.15.93"
E0507 14:34:50.505493 1 base_controller.go:279] "Unhandled Error" err="ConfigObserver reconciliation failed: invalid CIDR address: 10.0.15.93"
I0507 14:34:50.508868 1 event.go:377] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-kube-apiserver-operator", Name:"kube-apiserver-operator", UID:"0d105bbe-f7d2-49ef-8b41-5614120a740c", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'GetExternalIPPolicyFailed' error parsing networks.config.openshift.io/cluster Spec.ExternalIP.Policy.AllowedCIDRs: invalid cidr: invalid CIDR address: 10.0.15.93
E0507 14:34:50.514985 1 base_controller.go:279] "Unhandled Error" err="ConfigObserver reconciliation failed: invalid CIDR address: 10.0.15.93"
I0507 14:34:50.522074 1 event.go:377] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-kube-apiserver-operator", Name:"kube-apiserver-operator", UID:"0d105bbe-f7d2-49ef-8b41-5614120a740c", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'GetExternalIPPolicyFailed' error parsing networks.config.openshift.io/cluster Spec.ExternalIP.Policy.AllowedCIDRs: invalid cidr: invalid CIDR address: 10.0.15.93
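For context (this note is added here and is not from the original report): the GetExternalIPPolicyFailed events above indicate that each allowedCIDRs entry is parsed as a CIDR, and a bare address such as 10.0.15.93 carries no prefix length, so the parse fails. An illustrative, valid-looking stanza in the cluster Network config would be:
spec:
  externalIP:
    policy:
      allowedCIDRs:
      - 10.0.15.93/32
(The /32 value is illustrative only; whether that is the intended policy for this cluster needs confirmation.)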
Then more operators were found to be degraded or progressing:
$ oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.19.0-0.nightly-2025-05-06-051838 True False False 3m41s
baremetal 4.19.0-0.nightly-2025-05-06-051838 True False False 5h28m
cloud-controller-manager 4.19.0-0.nightly-2025-05-06-051838 True False False 5h31m
cloud-credential 4.19.0-0.nightly-2025-05-06-051838 True False False 5h31m
cluster-autoscaler 4.19.0-0.nightly-2025-05-06-051838 True False False 5h28m
config-operator 4.19.0-0.nightly-2025-05-06-051838 True False False 5h29m
console 4.19.0-0.nightly-2025-05-06-051838 True False False 28s
control-plane-machine-set 4.19.0-0.nightly-2025-05-06-051838 True False False 5h26m
csi-snapshot-controller 4.19.0-0.nightly-2025-05-06-051838 True False False 5h28m
dns 4.19.0-0.nightly-2025-05-06-051838 True True False 5h28m DNS "default" reports Progressing=True: "Have 5 available node-resolver pods, want 6."
etcd 4.19.0-0.nightly-2025-05-06-051838 True False False 5h27m
image-registry 4.19.0-0.nightly-2025-05-06-051838 True True False 43m Progressing: The registry is ready...
ingress 4.19.0-0.nightly-2025-05-06-051838 True False False 43m
insights 4.19.0-0.nightly-2025-05-06-051838 True False False 5h28m
kube-apiserver 4.19.0-0.nightly-2025-05-06-051838 True False True 5h23m ConfigObservationDegraded: invalid CIDR address: 10.0.15.93
kube-controller-manager 4.19.0-0.nightly-2025-05-06-051838 True False False 5h24m
kube-scheduler 4.19.0-0.nightly-2025-05-06-051838 True False False 5h26m
kube-storage-version-migrator 4.19.0-0.nightly-2025-05-06-051838 True False False 5h29m
machine-api 4.19.0-0.nightly-2025-05-06-051838 True False False 5h23m
machine-approver 4.19.0-0.nightly-2025-05-06-051838 True False False 5h28m
machine-config 4.19.0-0.nightly-2025-05-06-051838 True False False 5h26m
marketplace 4.19.0-0.nightly-2025-05-06-051838 True False False 5h28m
monitoring 4.19.0-0.nightly-2025-05-06-051838 True False False 5h16m
network 4.19.0-0.nightly-2025-05-06-051838 True True False 5h30m DaemonSet "/openshift-multus/multus" is not available (awaiting 1 nodes)...
node-tuning 4.19.0-0.nightly-2025-05-06-051838 True True False 9s Waiting for 1/6 Profiles to be applied
olm 4.19.0-0.nightly-2025-05-06-051838 True False False 5h28m
openshift-apiserver 4.19.0-0.nightly-2025-05-06-051838 True False False 5h18m
openshift-controller-manager 4.19.0-0.nightly-2025-05-06-051838 True False False 5h20m
openshift-samples 4.19.0-0.nightly-2025-05-06-051838 True False False 5h18m
operator-lifecycle-manager 4.19.0-0.nightly-2025-05-06-051838 True False False 5h28m
operator-lifecycle-manager-catalog 4.19.0-0.nightly-2025-05-06-051838 True False False 5h28m
operator-lifecycle-manager-packageserver 4.19.0-0.nightly-2025-05-06-051838 True False False 5h18m
service-ca 4.19.0-0.nightly-2025-05-06-051838 True False False 5h29m
storage 4.19.0-0.nightly-2025-05-06-051838 True True False 5h28m AWSEBSCSIDriverOperatorCRProgressing: AWSEBSDriverNodeServiceControllerProgressing: Waiting for DaemonSet to deploy node pods
Actual results: several operators become degraded or remain progressing.
Expected results: operators should not become degraded after the cluster network expansion.
Additional info:
must-gather: https://drive.google.com/file/d/1SxnhxIAUjC99mVjsa4z7WiP8-ERyPNOS/view?usp=drive_link
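For whoever triages this: the offending value should be visible directly in the cluster Network config (illustrative command, not part of the original report):
$ oc get network.config.openshift.io cluster -o jsonpath='{.spec.externalIP.policy.allowedCIDRs}{"\n"}'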
Please fill in the following template while reporting a bug and provide as much relevant information as possible. Doing so will give us the best chance to find a prompt resolution.
Affected Platforms:
Is it an
- internal CI failure
- customer issue / SD
- internal RedHat testing failure
If it is an internal RedHat testing failure:
- Please share a kubeconfig or creds to a live cluster for the assignee to debug/troubleshoot along with reproducer steps (especially if it's a telco use case like ICNI, secondary bridges or BM+kubevirt).
If it is a CI failure:
- Did it happen in different CI lanes? If so please provide links to multiple failures with the same error instance
- Did it happen in both sdn and ovn jobs? If so please provide links to multiple failures with the same error instance
- Did it happen in other platforms (e.g. aws, azure, gcp, baremetal etc) ? If so please provide links to multiple failures with the same error instance
- When did the failure start happening? Please provide the UTC timestamp of the networking outage window from a sample failure run
- If it's a connectivity issue,
- What is the srcNode, srcIP and srcNamespace and srcPodName?
- What is the dstNode, dstIP and dstNamespace and dstPodName?
- What is the traffic path? (examples: pod2pod? pod2external?, pod2svc? pod2Node? etc)
If it is a customer / SD issue:
- Provide enough information in the bug description that Engineering doesn’t need to read the entire case history.
- Don’t presume that Engineering has access to Salesforce.
- Do presume that Engineering will access attachments through supportshell.
- Describe what each relevant attachment is intended to demonstrate (failed pods, log errors, OVS issues, etc).
- Referring to the attached must-gather, sosreport or other attachment, please provide the following details:
- If the issue is in a customer namespace then provide a namespace inspect.
- If it is a connectivity issue:
- What is the srcNode, srcNamespace, srcPodName and srcPodIP?
- What is the dstNode, dstNamespace, dstPodName and dstPodIP?
- What is the traffic path? (examples: pod2pod? pod2external?, pod2svc? pod2Node? etc)
- Please provide the UTC timestamp of the networking outage window from the must-gather
- Please provide tcpdump pcaps taken during the outage filtered based on the above provided src/dst IPs
- If it is not a connectivity issue:
- Describe the steps taken so far to analyze the logs from networking components (cluster-network-operator, OVNK, SDN, openvswitch, ovs-configure etc) and the actual component where the issue was seen based on the attached must-gather. Please attach snippets of relevant logs around the window when the problem happened, if any.
- When showing the results from commands, include the entire command in the output.
- For OCPBUGS in which the issue has been identified, label with “sbr-triaged”
- For OCPBUGS in which the issue has not been identified and needs Engineering help for root cause, label with “sbr-untriaged”
- Do not set the priority, that is owned by Engineering and will be set when the bug is evaluated
- Note: bugs that do not meet these minimum standards will be closed with label “SDN-Jira-template”
- For guidance on using this template please see
OCPBUGS Template Training for Networking components