OpenShift Bugs / OCPBUGS-61789

RTE pods stuck in CrashLoopBackOff due to SELinux context changes


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Affects Version/s: 4.17.z, 4.16.z
    • Component/s: Containers
    • Impact: Quality / Stability / Reliability
    • Severity: Critical

      Description of problem:

      On 4.17.z and 4.16.z OCP versions, the kubelet pod-resources socket is now expected to carry the SELinux context kubelet_var_lib_t instead of container_var_lib_t, which causes the RTE pods deployed when installing the NROP operator to get stuck in CrashLoopBackOff.
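      To confirm the label, the socket can be inspected directly on a worker node (a diagnostic sketch; /var/lib/kubelet/pod-resources/kubelet.sock is the standard kubelet pod-resources socket path, mounted into the RTE pod as /host-podresources):

      # Show the SELinux context of the kubelet pod-resources socket
      oc debug node/<worker-node> -- chroot /host ls -Z /var/lib/kubelet/pod-resources/kubelet.sock

      Per the description above, affected 4.16.z/4.17.z nodes show the kubelet_var_lib_t type, while the RTE pod's policy still only allows container_var_lib_t, so the connect() from the pod is denied.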
          

      Version-Release number of selected component (if applicable):

       

      How reproducible:

      Always
          

      Steps to Reproduce:

    1. Deploy the NROP operator.
    2. Apply the NROP CR.
    3. Watch the RTE pods come up: they get stuck in CrashLoopBackOff (see the watch command below).
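      A minimal way to observe this (assuming openshift-numaresources as the NROP operand namespace, the default):

      # Watch the RTE worker pods crash-loop after the NROP CR is applied
      oc get pods -n openshift-numaresources -w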
          

      Actual results:

      numaresources-controller-manager-6c74699cf7-7hkxq   1/1     Running            0              33m
      numaresourcesoperator-worker-f5dr6                  1/2     CrashLoopBackOff   9 (29s ago)    21m
      numaresourcesoperator-worker-rc2gr                  1/2     Error              9 (5m6s ago)   21m
      secondary-scheduler-65557fc7cd-cx7gl                1/1     Running            0              20m
          

      Expected results:

      The RTE pods (the numaresourcesoperator-worker pods) should be Running with 2/2 containers ready, one pod per worker node (two in this example).
          

      Additional info:

      [root@ocp-edge41 ~]# oc get clusterversion
      NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.17.0-0.nightly-2025-09-15-035144   True        False         116m    Cluster version is 4.17.0-0.nightly-2025-09-15-035144
      [root@ocp-edge41 ~]# oc get no,mcp
      NAME                                                            STATUS   ROLES                  AGE    VERSION
      node/ocp4171598887-ctlplane-0.libvirt.lab.eng.tlv2.redhat.com   Ready    control-plane,master   145m   v1.30.14
      node/ocp4171598887-ctlplane-1.libvirt.lab.eng.tlv2.redhat.com   Ready    control-plane,master   146m   v1.30.14
      node/ocp4171598887-ctlplane-2.libvirt.lab.eng.tlv2.redhat.com   Ready    control-plane,master   146m   v1.30.14
      node/ocp4171598887-worker-0.libvirt.lab.eng.tlv2.redhat.com     Ready    worker                 128m   v1.30.14
      node/ocp4171598887-worker-1.libvirt.lab.eng.tlv2.redhat.com     Ready    worker                 128m   v1.30.14
      
      NAME                                                         CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
      machineconfigpool.machineconfiguration.openshift.io/master   rendered-master-4d364f16d856959b60b95cf92eaf905c   True      False      False      3              3                   3                     0                      143m
      machineconfigpool.machineconfiguration.openshift.io/worker   rendered-worker-f5a9a81979766c79c554a745f2cfb72a   True      False      False      2              2                   2                     0                      143m
      [root@ocp-edge41 ~]# oc get pods
      NAME                                                READY   STATUS             RESTARTS       AGE
      numaresources-controller-manager-6c74699cf7-7hkxq   1/1     Running            0              80m
      numaresourcesoperator-worker-f5dr6                  1/2     CrashLoopBackOff   18 (63s ago)   68m
      numaresourcesoperator-worker-rc2gr                  1/2     CrashLoopBackOff   18 (36s ago)   68m
      secondary-scheduler-65557fc7cd-cx7gl                1/1     Running            0              67m
      [root@ocp-edge41 ~]# oc logs pod/numaresourcesoperator-worker-f5dr6
      Defaulted container "resource-topology-exporter" out of: resource-topology-exporter, shared-pool-container
      I0916 14:19:12.878899       1 main.go:66] starting resource-topology-exporter 0.0.1-dev1 63405e44f go1.22.12 (Red Hat 1.22.12-3.el9_5) X:strictfipsruntime
      I0916 14:19:12.879138       1 main.go:307] using Topology Manager scope "container" from "conf" (conf=container) policy "single-numa-node" from "conf" (conf=single-numa-node)
      I0916 14:19:12.879684       1 client.go:43] creating a podresources client for endpoint "unix:///host-podresources/kubelet.sock"
      I0916 14:19:12.879696       1 client.go:104] endpoint "unix:///host-podresources/kubelet.sock" -> protocol="unix" path="/host-podresources/kubelet.sock"
      I0916 14:19:12.879978       1 client.go:48] created a podresources client for endpoint "unix:///host-podresources/kubelet.sock"
      I0916 14:19:12.879989       1 setup.go:90] metrics endpoint disabled
      I0916 14:19:12.879993       1 podexclude.go:99] > POD excludes:
      I0916 14:19:12.879999       1 resourcetopologyexporter.go:127] using given Topology Manager policy "single-numa-node" scope "container"
      I0916 14:19:12.880035       1 notification.go:123] added interval every 10s
      I0916 14:19:12.880055       1 resourcemonitor.go:153] resource monitor for "ocp4171598887-worker-1.libvirt.lab.eng.tlv2.redhat.com" starting
      I0916 14:19:12.896159       1 resourcemonitor.go:175] tracking node resources
      F0916 14:19:12.896586       1 main.go:118] failed to execute: failed to initialize ResourceMonitor: error while updating node allocatable: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial unix /host-podresources/kubelet.sock: connect: permission denied"
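      The failing call is the RTE's startup query of node allocatable resources over the pod-resources socket. A minimal Go sketch of that call, assuming the upstream k8s.io/kubelet podresources v1 API (the endpoint path is taken from the log above; error handling is illustrative):

      package main

      import (
          "context"
          "log"
          "time"

          "google.golang.org/grpc"
          "google.golang.org/grpc/credentials/insecure"
          podresourcesapi "k8s.io/kubelet/pkg/apis/podresources/v1"
      )

      func main() {
          // Dial the kubelet pod-resources socket. With the wrong SELinux label
          // on the socket, connect(2) fails with EACCES ("permission denied"),
          // which gRPC surfaces as the Unavailable error seen in the log.
          conn, err := grpc.Dial("unix:///host-podresources/kubelet.sock",
              grpc.WithTransportCredentials(insecure.NewCredentials()))
          if err != nil {
              log.Fatalf("dial: %v", err)
          }
          defer conn.Close()

          ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
          defer cancel()

          // The RTE updates node allocatable at startup ("error while updating
          // node allocatable" above); the denial surfaces on this first RPC.
          client := podresourcesapi.NewPodResourcesListerClient(conn)
          if _, err := client.GetAllocatableResources(ctx, &podresourcesapi.AllocatableResourcesRequest{}); err != nil {
              log.Fatalf("GetAllocatableResources: %v", err)
          }
      }

      The matching AVC denial should be visible on the node, e.g. (assuming auditd is recording AVC events):

      oc debug node/<worker-node> -- chroot /host ausearch -m avc -ts recent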
          

      Assignee: Jindrich Novy (rhn-support-jnovy)
      Reporter: Roy Shemtov (rh-ee-rshemtov)