Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: 4.18.z
Affects Version/s: 4.18.z
Component/s: Node / CRI-O
Labels:

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None
Story Points:
None
Severity:
Important
Regression:
None
Deployment Environment:
Production

Target Backport Versions:
None
Target Version:

4.18.z
Release Blocker:
None
Sprint:
Node Green Sprint 280, OCP Node Core Sprint 282, OCP Node Core Sprint 283, OCP Node Core Sprint 284
sprint_count:
4

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
In Progress
Release Note Type:
Release Note Not Required
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

   In this partner's MicroShift deployment, two postgres pods fail to start again after the node reboots. The kubelet logs the following errors:

Oct 20 09:25:58 spgttmicro-os.schuler.de microshift[3534]: kubelet E1020 09:25:58.438702    3534 log.go:32] "StopPodSandbox from runtime service failed" err=<
Oct 20 09:25:58 spgttmicro-os.schuler.de microshift[3534]:         rpc error: code = Unknown desc = failed to destroy network for pod sandbox k8s_dps-infra-postgres-repo-host-0_dps_8b4cd4e8-32db-48a3-891b-26ae090370f0_0(d21853b5e0c87d279d08f2484bd5d7b8c1a13788d625cc2f2301fcc83532c2e3): error removing pod dps-infra-postgres-repo-host-0 from CNI network "ovn-kubernetes": plugin type="ovn-k8s-cni-overlay" name="ovn-kubernetes" failed (delete): CNI request failed with status 400: '[<namespace>/dps-infra-postgres-repo-host-0 d21853b5e0c87d279d08f2484bd5d7b8c1a13788d625cc2f2301fcc83532c2e3 network default NAD default] [<namespace>/dps-infra-postgres-repo-host-0 d21853b5e0c87d279d08f2484bd5d7b8c1a13788d625cc2f2301fcc83532c2e3 network default NAD default] failed to get container namespace for pod <namespace>/dps-infra-postgres-repo-host-0 NAD default: failed to Statfs "": no such file or directory
Oct 20 09:25:58 spgttmicro-os.schuler.de microshift[3534]:         ': stat netns path "": stat : no such file or directory
Oct 20 09:25:58 spgttmicro-os.schuler.de microshift[3534]:  > podSandboxID="d21853b5e0c87d279d08f2484bd5d7b8c1a13788d625cc2f2301fcc83532c2e3"
Oct 20 09:25:58 spgttmicro-os.schuler.de microshift[3534]: kubelet E1020 09:25:58.438773    3534 kuberuntime_manager.go:1479] "Failed to stop sandbox" podSandboxID={"Type":"cri-o","ID":"d21853b5e0c87d279d08f2484bd5d7b8c1a13788d625cc2f2301fcc83532c2e3"}
Oct 20 09:25:58 spgttmicro-os.schuler.de microshift[3534]: kubelet E1020 09:25:58.438817    3534 kuberuntime_manager.go:1079] "killPodWithSyncResult failed" err="failed to \"KillPodSandbox\" for \"8b4cd4e8-32db-48a3-891b-26ae090370f0\" with KillPodSandboxError: \"rpc error: code = Unknown desc = failed to destroy network for pod sandbox k8s_dps-infra-postgres-repo-host-0-dps_8b4cd4e8-32db-48a3-891b-26ae090370f0_0(d21853b5e0c87d279d08f2484bd5d7b8c1a13788d625cc2f2301fcc83532c2e3): error removing pod dps-infra-postgres-repo-host-0 from CNI network \\\"ovn-kubernetes\\\": plugin type=\\\"ovn-k8s-cni-overlay\\\" name=\\\"ovn-kubernetes\\\" failed (delete): CNI request failed with status 400: '[<namespace>/dps-infra-postgres-repo-host-0 d21853b5e0c87d279d08f2484bd5d7b8c1a13788d625cc2f2301fcc83532c2e3 network default NAD default] [<namespace>/dps-infra-postgres-repo-host-0 d21853b5e0c87d279d08f2484bd5d7b8c1a13788d625cc2f2301fcc83532c2e3 network default NAD default] failed to get container namespace for pod <namespace>/dps-infra-postgres-repo-host-0 NAD default: failed to Statfs \\\"\\\": no such file or directory\\n': stat netns path \\\"\\\": stat : no such file or directory\""
Oct 20 09:25:58 spgttmicro-os.schuler.de microshift[3534]: kubelet E1020 09:25:58.438855    3534 pod_workers.go:1301] "Error syncing pod, skipping" err="failed to \"KillPodSandbox\" for \"8b4cd4e8-32db-48a3-891b-26ae090370f0\" with KillPodSandboxError: \"rpc error: code = Unknown desc = failed to destroy network for pod sandbox k8s_dps-infra-postgres-repo-host-0_8b4cd4e8-32db-48a3-891b-26ae090370f0_0(d21853b5e0c87d279d08f2484bd5d7b8c1a13788d625cc2f2301fcc83532c2e3): error removing pod dps-infra-postgres-repo-host-0 from CNI network \\\"ovn-kubernetes\\\": plugin type=\\\"ovn-k8s-cni-overlay\\\" name=\\\"ovn-kubernetes\\\" failed (delete): CNI request failed with status 400: '[<namespace>/dps-infra-postgres-repo-host-0 d21853b5e0c87d279d08f2484bd5d7b8c1a13788d625cc2f2301fcc83532c2e3 network default NAD default] [<namespace>/dps-infra-postgres-repo-host-0 d21853b5e0c87d279d08f2484bd5d7b8c1a13788d625cc2f2301fcc83532c2e3 network default NAD default] failed to get container namespace for pod <namespace>/dps-infra-postgres-repo-host-0 NAD default: failed to Statfs \\\"\\\": no such file or directory\\n': stat netns path \\\"\\\": stat : no such file or directory\"" pod="<namespace>/dps-infra-postgres-repo-host-0" podUID="8b4cd4e8-32db-48a3-891b-26ae090370f0"
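For triage, the affected sandboxes can be pulled out of a saved journal dump. The following is a minimal sketch, not part of any MicroShift or CRI-O tooling: the function name `affected_sandboxes` is hypothetical, and the pattern assumes the messages keep the exact wording shown in the excerpt above.

```python
import re

# Matches "... for pod sandbox <name>(<64-hex sandbox id>)" as it appears
# in the kubelet errors quoted above. The wording is taken from this report;
# other CRI-O versions may phrase it differently (assumption).
SANDBOX_RE = re.compile(
    r"failed to destroy network for pod sandbox "
    r"(?P<name>[^\s(]+)\((?P<id>[0-9a-f]{64})\)"
)


def affected_sandboxes(lines):
    """Map sandbox ID -> sandbox name for lines that report the Statfs failure."""
    found = {}
    for line in lines:
        # Only consider lines that mention the failed Statfs on the netns path.
        if "failed to Statfs" not in line:
            continue
        match = SANDBOX_RE.search(line)
        if match:
            found[match.group("id")] = match.group("name")
    return found
```

The function accepts any iterable of strings, so an open file handle over a saved journal export can be passed in directly.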

Version-Release number of selected component (if applicable):

    cri-o: 1.31.11-2.rhaos4.18.git65ec77a.el9
    microshift: 4.18

How reproducible:

    Always

Steps to Reproduce:

    1. With all pods up and running, reboot the node.
    2. Wait for the node to come back up and for the pods to be recreated. The impacted pods never come back up.

Actual results:

    The impacted pods never come back up on their own; each must be force-deleted so that a new pod can be created.

Expected results:

    Pods should be able to come back up without intervention

Additional info:

    This looks similar to OCPBUGS-58229; however, that bug was reported against cri-o 1.33, whereas this deployment runs cri-o 1.31.11.
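The errors above show the teardown path calling stat on an empty network namespace path. The sketch below illustrates that failure mode and one possible guard. It is an illustration only, not CRI-O's code or the linked fix: `probe`, `stop_sandbox_network`, and the `cni_delete` callback are hypothetical names, and skipping the CNI delete when the path is empty is an assumption about a reasonable behaviour.

```python
import os


def probe(netns_path):
    """Return True if the path can be stat'ed, False if it does not exist."""
    try:
        os.stat(netns_path)
        return True
    except FileNotFoundError:
        # An empty string also lands here, matching the
        # "no such file or directory" error in the report.
        return False


def stop_sandbox_network(netns_path, cni_delete):
    """Hypothetical teardown: skip the CNI delete when there is no netns left."""
    if not netns_path or not probe(netns_path):
        # Nothing to clean up (for example after a reboot), so do not
        # fail the sandbox stop on a missing namespace.
        return "skipped"
    cni_delete(netns_path)
    return "deleted"
```

With a guard of this kind, stopping a sandbox whose namespace is already gone would succeed rather than fail repeatedly, which is the behaviour the report expects after a reboot.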

depends on

OCPBUGS-58229 MicroShift: Pod in offline scenario does not start after reboot after bumping CRIO to 1.33.1

Closed

is related to

OCPBUGS-58229 MicroShift: Pod in offline scenario does not start after reboot after bumping CRIO to 1.33.1

Closed

links to

cri-o/cri-o#9617: [release-1.32] OCPBUGS-63432: server: Fix network cleanup failures when NetNS path is empty

Assignee: Peter Hunt

Reporter: Marius Paulica Nicolae

Need Info From: John George, Neelesh Agrawal, Shefali Bansal

Contributors: None

QA Contact: John George

Doc Contact: None

Votes: 0

Watchers: 9

Created: 2025/10/22 4:21 PM

Updated: 2026/02/09 6:11 AM
