Type: Bug
Resolution: Unresolved
Affects Version: 4.18
Severity: Important
Description of problem:
Simultaneously creating a large number of pods (e.g., 700-800) on an HCP (Hosted Control Plane) cluster leads to multiple pods remaining in the ContainerCreating state for an extended period. The events for these pods show repeated FailedCreatePodSandBox warnings. This behavior is not reproducible on the equivalent management cluster, suggesting a performance or scaling issue specific to the hosted cluster architecture or its networking stack (OVN-Kubernetes) under stress.
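For reference, a minimal sketch of how the FailedCreatePodSandBox warning events can be enumerated, assuming the official kubernetes Python client, a kubeconfig pointing at the hosted cluster, and an illustrative namespace name (not taken from this report):

# Sketch: list FailedCreatePodSandBox warning events in the test namespace.
# Assumes the `kubernetes` Python client and a kubeconfig for the hosted cluster;
# the namespace name below is illustrative.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

events = core.list_namespaced_event(
    namespace="test",
    field_selector="reason=FailedCreatePodSandBox,type=Warning",
)
for ev in events.items:
    print(ev.involved_object.name, ev.reason, (ev.message or "")[:120])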
Version-Release number of selected component (if applicable):
4.18
How reproducible:
Always
Steps to Reproduce:
1. Create an HCP cluster with a node capable of running 800+ pods.
2. Create any deployment.
3. Make sure there are no image pull issues (e.g., QPS throttling or slow pulls from the OpenShift image registry); if possible, prefer an image hosted on quay.io.
4. Scale the deployment to 800-1000 replicas (see the sketch after this list).
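A minimal reproduction sketch using the official kubernetes Python client, under the assumption of a kubeconfig pointing at the hosted cluster and an existing "test" namespace; the deployment name and image reference are illustrative placeholders, not values from the customer environment:

# Reproduction sketch (assumptions: kubeconfig for the hosted cluster, an
# existing "test" namespace; deployment name and image are illustrative).
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()
core = client.CoreV1Api()

ns = "test"
name = "scale-test"

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name=name),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": name}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": name}),
            spec=client.V1PodSpec(containers=[
                # Illustrative image reference; substitute any small image hosted on quay.io.
                client.V1Container(name="app", image="quay.io/example/app:latest"),
            ]),
        ),
    ),
)
apps.create_namespaced_deployment(namespace=ns, body=deployment)

# Scale to 800 replicas in one step to stress pod sandbox creation on the node.
apps.patch_namespaced_deployment_scale(
    name=name, namespace=ns, body={"spec": {"replicas": 800}}
)

# Count pods that are stuck in ContainerCreating.
pods = core.list_namespaced_pod(namespace=ns, label_selector=f"app={name}")
stuck = [
    p.metadata.name
    for p in pods.items
    if any(
        cs.state and cs.state.waiting and cs.state.waiting.reason == "ContainerCreating"
        for cs in (p.status.container_statuses or [])
    )
]
print(f"{len(stuck)} pods stuck in ContainerCreating")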
Actual results:
Pods remain stuck in the ContainerCreating state for a long time, with events such as:
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_test-685b4f4bcc-24khb_test-vedant_44b2ed44-52c4-499e-a6c0-4ffb3b3972e6_0(947dab718187b327075cc5cb1f7859a65832f357f286f1294e872646562d9124): error adding pod test-vedant_test-685b4f4bcc-24khb to CNI network "multus-cni-network": plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400: 'ContainerID:"947dab718187b327075cc5cb1f7859a65832f357f286f1294e872646562d9124" Netns:"/var/run/netns/c2d80893-9998-41f9-96b6-0a7d4231c158" IfName:"eth0" Args:"IgnoreUnknown=1;K8S_POD_NAMESPACE=test-vedant;K8S_POD_NAME=test-685b4f4bcc-24khb;K8S_POD_INFRA_CONTAINER_ID=947dab718187b327075cc5cb1f7859a65832f357f286f1294e872646562d9124;K8S_POD_UID=44b2ed44-52c4-499e-a6c0-4ffb3b3972e6" Path:"" ERRORED: error configuring pod [test-vedant/test-685b4f4bcc-24khb] networking: [test-vedant/test-685b4f4bcc-24khb/44b2ed44-52c4-499e-a6c0-4ffb3b3972e6:ovn-kubernetes]: error adding container to network "ovn-kubernetes": CNI request failed with status 400: '[test-vedant/test-685b4f4bcc-24khb 947dab718187b327075cc5cb1f7859a65832f357f286f1294e872646562d9124 network default NAD default] [test-vedant/test-685b4f4bcc-24khb 947dab718187b327075cc5cb1f7859a65832f357f286f1294e872646562d9124 network default NAD default] failed to configure pod interface: failed to run 'ovs-vsctl --timeout=30 --if-exists clear port 947dab718187b32 qos': exit status 1 "ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (Protocol error)\n" ' ': StdinData: {"binDir":"/var/lib/cni/bin","clusterNetwork":"/host/run/multus/cni/net.d/10-ovn-kubernetes.conf","cniVersion":"0.3.1","daemonSocketDir":"/run/multus/socket","globalNamespaces":"default,openshift-multus,openshift-sriov-network-operator,openshift-cnv","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","namespaceIsolation":true,"type":"multus-shim"}
Expected results:
Pods should start running promptly; pod sandbox creation should succeed without repeated FailedCreatePodSandBox errors.
Additional info:
The customer's HCP cluster is on the agent platform; I also tested on an HCP KubeVirt platform and observed similar behaviour.
Affected Platforms:
Is it a:
- customer issue / SD
If it is a customer / SD issue:
- Cluster details and logs are being added to the drive.