Bug
Resolution: Unresolved
4.18
Quality / Stability / Reliability
Important
Description of problem:
The NodeNetworkConfigurationPolicy (NNCP) status field is empty when managing large numbers of VLANs (1000+). The handler pods report "etcdserver: request is too large" errors, which prevent NodeNetworkState updates.
When an NNCP that deletes 50 VLANs is applied to a cluster that already has 1500+ VLANs configured, the NNCP object shows empty STATUS and REASON fields.
$ oc get nncp
NAME                              STATUS   REASON
delete-50-vlans-batch-1000-1049
The STATUS and REASON columns are empty (not "Progressing", not "Available", completely empty).
Version-Release number of selected component (if applicable):
4.18.0-0.nightly-2025-09-25-164655
OS_GIT_VERSION=4.18.0-202509101149.p2.g53c5d9a.assembly.stream.el9-53c5d9a
SOURCE_GIT_TREE_STATE=clean
OS_GIT_COMMIT=53c5d9a
SOURCE_GIT_COMMIT=53c5d9ac6f9f10b45e8a31691b03ce3a4f86bad2
SOURCE_GIT_TAG=v0.17.0-2608-g53c5d9ac6
SOURCE_GIT_URL=https://github.com/openshift/kubernetes-nmstate
How reproducible:
Once
Steps to Reproduce:
- Configure cluster with 1500+ VLANs using multiple NNCPs
- Apply NNCP to delete 50 VLANs (delete-50-vlans-batch-1000-1049)
Name:         delete-50-vlans-batch-1000-1049
Namespace:
Labels:       <none>
Annotations:  nmstate.io/webhook-mutating-timestamp: 1759166377092333775
API Version:  nmstate.io/v1
Kind:         NodeNetworkConfigurationPolicy
Metadata:
  Creation Timestamp:  2025-09-29T17:19:37Z
  Generation:          1
  Resource Version:    1013759
  UID:                 79ea7108-2b05-40a7-aa6b-547c976492ff
Spec:
  Desired State:
    Interfaces:
      Name:   eno1.1000
      State:  absent
      Type:   vlan
      Name:   eno1.1001
      State:  absent
      Type:   vlan
      [... 48 more VLANs from eno1.1002 through eno1.1049 ...]
  Node Selector:
    node-role.kubernetes.io/worker:
Events:  <none>
- Observe that NNCP creation timestamp is 2025-09-29T17:19:37Z
- Check status at 2025-09-29T17:24:42Z (5 minutes later) - empty
- Check status at 2025-09-29T17:34:57Z (15 minutes later) - still empty
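The delete policy used in the steps above can be generated rather than written by hand. The following is a minimal sketch, assuming the policy only needs the name, type, and state of each VLAN as shown in the describe output; the helper name build_delete_vlan_policy is hypothetical and not part of kubernetes-nmstate.

```python
import json


def build_delete_vlan_policy(base_iface, first_id, last_id):
    """Build an NNCP dict that marks VLANs first_id..last_id on base_iface absent.

    Hypothetical helper for illustration; field names follow the NNCP
    shown in this report.
    """
    interfaces = [
        {"name": f"{base_iface}.{vlan_id}", "type": "vlan", "state": "absent"}
        for vlan_id in range(first_id, last_id + 1)
    ]
    return {
        "apiVersion": "nmstate.io/v1",
        "kind": "NodeNetworkConfigurationPolicy",
        "metadata": {
            "name": f"delete-{len(interfaces)}-vlans-batch-{first_id}-{last_id}"
        },
        "spec": {
            "nodeSelector": {"node-role.kubernetes.io/worker": ""},
            "desiredState": {"interfaces": interfaces},
        },
    }


if __name__ == "__main__":
    # Emit JSON; oc apply accepts JSON manifests as well as YAML.
    print(json.dumps(build_delete_vlan_policy("eno1", 1000, 1049), indent=2))
```

The output can be saved to a file and applied with oc apply -f.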
Actual results:
nmstate Warning
[2025-09-29T14:34:10Z WARN nmstate::query_apply::net_state] Interfaces count exceeds the support limit 1000 in desired state
Source: https://github.com/nmstate/nmstate/blob/base/rust/src/lib/query_apply/net_state.rs#L32
Constant: MAX_SUPPORTED_INTERFACES = 1000
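The check behind this warning can be illustrated as follows. This is a Python sketch of the behaviour the log message describes, not the nmstate Rust source; the strict "greater than" comparison and the function name exceeds_interface_limit are assumptions.

```python
import logging

# Value quoted in this report for nmstate's MAX_SUPPORTED_INTERFACES.
MAX_SUPPORTED_INTERFACES = 1000


def exceeds_interface_limit(desired_state):
    """Return True and log a warning if the desired state lists too many interfaces.

    Assumption: the limit is exceeded only when the count is strictly
    greater than MAX_SUPPORTED_INTERFACES.
    """
    count = len(desired_state.get("interfaces", []))
    if count > MAX_SUPPORTED_INTERFACES:
        logging.warning(
            "Interfaces count exceeds the support limit %d in desired state",
            MAX_SUPPORTED_INTERFACES,
        )
        return True
    return False
```

A pre-flight check like this could be run against a policy's desiredState before applying it, to catch oversized policies early.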
etcd Request Size Error
Handler pod nmstate-handler-dg46c on node master-0 logs 7 occurrences of this error:
2025-09-29T17:51:15.078Z {"level":"error","ts":"2025-09-29T17:51:15.078Z","msg":"Reconciler error","controller":"NodeNetworkState","object":{"name":"master-0"},"namespace":"","name":"master-0","reconcileID":"7bb31f59-50de-43e2-b8ef-a4675cbcd345","error":"error at node reconcile creating NodeNetworkState: Error updating nodeNetworkState: etcdserver: request is too large"}
Stack trace:
etcdserver: request is too large
Error updating nodeNetworkState
github.com/nmstate/kubernetes-nmstate/pkg/client.UpdateCurrentState
    /go/src/github.com/openshift/kubernetes-nmstate/pkg/client/client.go:118
github.com/nmstate/kubernetes-nmstate/pkg/client.CreateOrUpdateNodeNetworkState
    /go/src/github.com/openshift/kubernetes-nmstate/pkg/client/client.go:93
github.com/nmstate/kubernetes-nmstate/controllers/handler.(*NodeReconciler).Reconcile
    /go/src/github.com/openshift/kubernetes-nmstate/controllers/handler/node_controller.go:112
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
    /go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:122
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:323
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235
runtime.goexit
    /usr/lib/golang/src/runtime/asm_amd64.s:1695
error at node reconcile creating NodeNetworkState
github.com/nmstate/kubernetes-nmstate/controllers/handler.(*NodeReconciler).Reconcile
    /go/src/github.com/openshift/kubernetes-nmstate/controllers/handler/node_controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
    /go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:122
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:323
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235
runtime.goexit
    /usr/lib/golang/src/runtime/asm_amd64.s:1695
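etcd rejects requests larger than its --max-request-bytes setting, which defaults to 1.5 MiB (1572864 bytes). Whether this cluster runs with that default is an assumption. The sketch below gives a rough lower bound on how the NodeNetworkState payload grows with VLAN count. The per-interface entry is hypothetical and modelled on the VLAN NNCP in this report; the state a handler actually reports carries many more fields per interface, so real objects will be larger than this estimate.

```python
import json

# etcd's default --max-request-bytes (1.5 MiB). Assumption: the cluster
# in this report uses the default.
ETCD_DEFAULT_MAX_REQUEST_BYTES = 1_572_864


def vlan_entry(vlan_id, base_iface="eno1"):
    """Hypothetical minimal per-VLAN entry, modelled on the NNCP in this report."""
    return {
        "name": f"{base_iface}.{vlan_id}",
        "type": "vlan",
        "state": "up",
        "mtu": 1400,
        "mac-address": "02:00:00:00:00:00",  # placeholder value
        "ipv4": {
            "enabled": True,
            "dhcp": True,
            "address": [{"ip": "192.0.2.100", "prefix-length": 24}],
        },
        "vlan": {"id": vlan_id, "base-iface": base_iface},
    }


def estimated_state_bytes(vlan_count, first_id=1):
    """Serialized size of a NodeNetworkState-like body holding vlan_count VLANs."""
    interfaces = [vlan_entry(first_id + i) for i in range(vlan_count)]
    body = {"status": {"currentState": {"interfaces": interfaces}}}
    return len(json.dumps(body).encode("utf-8"))


def fits_in_request(size_bytes, limit=ETCD_DEFAULT_MAX_REQUEST_BYTES):
    """True if a payload of size_bytes is within the assumed request limit."""
    return size_bytes <= limit


if __name__ == "__main__":
    for count in (500, 1000, 1500, 2000):
        size = estimated_state_bytes(count)
        print(count, size, fits_in_request(size))
```

Because the estimate grows linearly with the number of interfaces, measuring the real per-interface size from an existing NodeNetworkState would give a more realistic threshold than this minimal entry does.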
Expected Results:
Adding or deleting 300 VLANs should work.
The NNCP should always show status conditions, for example:
NAME                              STATUS        REASON
delete-50-vlans-batch-1000-1049   Progressing   ConfigurationProgressing
Additional info:
VLAN NNCP
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: create-500-vlans-part-1000
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ''
  desiredState:
    interfaces:
    - name: eno1.1000
      type: vlan
      state: up
      mtu: 1400
      mac-address: 02:00:00:03:e8:00
      ipv4:
        enabled: true
        dhcp: true
        address:
        - ip: 192.0.2.100
          prefix-length: 24
      vlan:
        id: 1000
        base-iface: eno1
    - name: eno1.1001
      type: vlan
      state: up
      mtu: 1400
      mac-address: 02:00:00:03:e9:01
      ipv4:
        enabled: true
        dhcp: true
        address:
        - ip: 192.0.2.100
          prefix-length: 24
      vlan:
        id: 1001
        base-iface: eno1
Error Timeline
2025-09-29T17:19:37Z NNCP created
2025-09-29T17:51:15Z First etcd size error
2025-09-29T17:52:24Z etcd size error
2025-09-29T17:53:10Z etcd size error
2025-09-29T17:53:32Z etcd size error
2025-09-29T17:53:54Z etcd size error
2025-09-29T17:54:17Z etcd size error
2025-09-29T17:54:41Z etcd size error
All 7 errors from pod nmstate-handler-dg46c reconciling master-0 NodeNetworkState.
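A timeline like the one above can be extracted from the handler log. The sketch below assumes each log line has the form "<timestamp> <JSON object>", as in the Reconciler error quoted earlier in this report; the function names are hypothetical, and lines that do not parse as JSON are skipped.

```python
import json

NEEDLE = "etcdserver: request is too large"


def parse_handler_line(line):
    """Split '<timestamp> {json}' and return the decoded JSON object, or None."""
    _, sep, payload = line.strip().partition(" ")
    if not sep:
        return None
    try:
        entry = json.loads(payload)
    except json.JSONDecodeError:
        return None
    return entry if isinstance(entry, dict) else None


def etcd_size_errors(lines, needle=NEEDLE):
    """Return (ts, name) for every error-level entry whose error contains needle."""
    hits = []
    for line in lines:
        entry = parse_handler_line(line)
        if entry is None:
            continue
        if entry.get("level") == "error" and needle in entry.get("error", ""):
            hits.append((entry.get("ts"), entry.get("name")))
    return hits
```

For example, passing the lines of current.log to etcd_size_errors returns one tuple per matching error, which can be counted or grouped by node name.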
Related NNCP That Triggered Initial Interface Limit
Policy create-1500-vlans-part-1000 failed with nmstate warning:
2025-09-29T17:58:26.464Z {"level":"error","ts":"2025-09-29T17:58:26.464Z","logger":"controllers.NodeNetworkConfigurationPolicy","msg":"Rolling back network configuration, manual intervention needed: ","nodenetworkconfigurationpolicy":{"name":"create-1500-vlans-part-1000"},"error":"error reconciling NodeNetworkConfigurationPolicy on node master-0 at desired state apply: \"\",\n failed to execute nmstatectl apply --no-commit --timeout 480: 'exit status 1' '' '[2025-09-29T14:34:10Z INFO nmstatectl] Nmstate version: 2.2.48 [2025-09-29T14:34:10Z INFO nmstate::nm::show] Got unsupported interface type generic: genev_sys_6081, ignoring [2025-09-29T14:34:10Z WARN nmstate::query_apply::net_state] Interfaces count exceeds the support limit 1000 in desired state
Handler logs: namespaces/openshift-nmstate/pods/nmstate-handler-dg46c/nmstate-handler/nmstate-handler/logs/current.log
- is related to: OCPBUGS-60681 k-nmstate deadlocks when environment state is complex (Closed)
- relates to: CNV-4621 vlan-filtering in kubernetes-nmstate (New)