OCPBUGS-54177 (project: OpenShift Bugs)

[OLMv1] fail to start the OperatorController and Catalogd Pods since hostPath type check failed: /etc/docker is not a directory [release-4.19]

    • Critical
    • None
    • Rejected
    • False
    • Release Note Not Required
    • In Progress

      Description of problem:

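      OLMv1 fails to start. The `olm` cluster operator is not available, and the operator-controller and catalogd pods stay in Pending, as shown below: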

          jiazha-mac:~ jiazha$ omg get clusterversion
      NAME     VERSION  AVAILABLE  PROGRESSING  SINCE  STATUS
      version           False      True         1h1m   Unable to apply 4.19.0-0.nightly-multi-2025-03-03-215205: the cluster operator olm is not available
      
      status:
        conditions:
        - lastTransitionTime: "2025-03-04T03:38:05Z"
          message: |-
            CatalogdClusterCatalogOpenshiftCertifiedOperatorsDegraded: Internal error occurred: failed calling webhook "inject-metadata-name.olm.operatorframework.io": failed to call webhook: Post "https://catalogd-service.openshift-catalogd.svc:9443/mutate-olm-operatorframework-io-v1-clustercatalog?timeout=10s": no endpoints available for service "catalogd-service"
            CatalogdClusterCatalogOpenshiftCommunityOperatorsDegraded: Internal error occurred: failed calling webhook "inject-metadata-name.olm.operatorframework.io": failed to call webhook: Post "https://catalogd-service.openshift-catalogd.svc:9443/mutate-olm-operatorframework-io-v1-clustercatalog?timeout=10s": no endpoints available for service "catalogd-service"
            CatalogdClusterCatalogOpenshiftRedhatMarketplaceDegraded: Internal error occurred: failed calling webhook "inject-metadata-name.olm.operatorframework.io": failed to call webhook: Post "https://catalogd-service.openshift-catalogd.svc:9443/mutate-olm-operatorframework-io-v1-clustercatalog?timeout=10s": no endpoints available for service "catalogd-service"
            CatalogdClusterCatalogOpenshiftRedhatOperatorsDegraded: Internal error occurred: failed calling webhook "inject-metadata-name.olm.operatorframework.io": failed to call webhook: Post "https://catalogd-service.openshift-catalogd.svc:9443/mutate-olm-operatorframework-io-v1-clustercatalog?timeout=10s": no endpoints available for service "catalogd-service"
            CatalogdDeploymentCatalogdControllerManagerDegraded: Deployment was progressing too long
            OperatorcontrollerDeploymentOperatorControllerControllerManagerDegraded: Deployment was progressing too long
          reason: CatalogdClusterCatalogOpenshiftCertifiedOperators_SyncError::CatalogdClusterCatalogOpenshiftCommunityOperators_SyncError::CatalogdClusterCatalogOpenshiftRedhatMarketplace_SyncError::CatalogdClusterCatalogOpenshiftRedhatOperators_SyncError::CatalogdDeploymentCatalogdControllerManager_SyncError::OperatorcontrollerDeploymentOperatorControllerControllerManager_SyncError
          status: "True"
          type: Degraded
      
      jiazha-mac:~ jiazha$ omg get pods -n openshift-operator-controller 
      NAME                                                     READY  STATUS   RESTARTS  AGE
      operator-controller-controller-manager-7d9c45f686-d584q  0/1    Pending  0         56m
      jiazha-mac:~ jiazha$ omg get pods -n openshift-catalogd 
      NAME                                         READY  STATUS   RESTARTS  AGE
      catalogd-controller-manager-7b44d8664-578m6  0/1    Pending  0         56m
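The Degraded condition above packs several components into one message. As a triage aid, here is a minimal Python sketch that groups components by their failure text. It assumes each line has the `Component: detail` layout seen in the message; the function name `group_by_detail` is hypothetical, and the sample uses only the two short deployment lines from the condition.

```python
from collections import defaultdict


def group_by_detail(message: str) -> dict[str, list[str]]:
    """Group 'Component: detail' lines by identical detail text."""
    groups: dict[str, list[str]] = defaultdict(list)
    for raw in message.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first ': ' only; the detail itself may contain colons.
        component, sep, detail = line.partition(": ")
        if not sep:
            continue  # not in the assumed 'Component: detail' layout
        groups[detail].append(component)
    return dict(groups)


# Two lines copied from the Degraded condition message in this report.
sample = """\
CatalogdDeploymentCatalogdControllerManagerDegraded: Deployment was progressing too long
OperatorcontrollerDeploymentOperatorControllerControllerManagerDegraded: Deployment was progressing too long
"""

groups = group_by_detail(sample)
for detail, components in groups.items():
    print(f"{len(components)} component(s): {detail}")
```

Run against the full message, the four ClusterCatalog webhook errors would fall into one group and the two deployment timeouts into another, which points at the deployments as the place to look first.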

      The kubelet log shows why the pods cannot start: mounting the `etc-docker` hostPath volume fails with `hostPath type check failed: /etc/docker is not a directory`.

      Mar 04 03:35:48.795943 ci-op-wyxv6h86-84a8e-vdtzv-master-0 kubenswrapper[2605]: E0304 03:35:48.795905    2605 nestedpendingoperations.go:348] Operation for "{volumeName:kubernetes.io/host-path/be838abb-e275-4ae4-8f4d-f2d1ed3e775e-etc-docker podName:be838abb-e275-4ae4-8f4d-f2d1ed3e775e nodeName:}" failed. No retries permitted until 2025-03-04 03:35:49.295886827 +0000 UTC m=+196.808189333 (durationBeforeRetry 500ms). Error: MountVolume.SetUp failed for volume "etc-docker" (UniqueName: "kubernetes.io/host-path/be838abb-e275-4ae4-8f4d-f2d1ed3e775e-etc-docker") pod "catalogd-controller-manager-7b44d8664-578m6" (UID: "be838abb-e275-4ae4-8f4d-f2d1ed3e775e") : hostPath type check failed: /etc/docker is not a directory
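The error text alone does not say whether `/etc/docker` is missing on the node or exists as something other than a directory. The following Python sketch is a simplified model of a directory-type path check, not the kubelet's actual code. It illustrates how such a check can produce the same message in both cases. The function names are hypothetical, and a temporary directory stands in for the node filesystem.

```python
import os
import tempfile


def check_hostpath_directory(path: str) -> None:
    """Simplified model of a hostPath check that requires a directory.

    Raises ValueError when `path` is missing or is not a directory.
    """
    # os.path.isdir is False for a missing path and for a regular file,
    # so this sketch cannot tell the two cases apart in its message.
    if not os.path.isdir(path):
        raise ValueError(f"hostPath type check failed: {path} is not a directory")


def describe(path: str) -> str:
    """Return 'ok' or the error message for `path`."""
    try:
        check_hostpath_directory(path)
        return "ok"
    except ValueError as err:
        return str(err)


with tempfile.TemporaryDirectory() as root:
    candidate = os.path.join(root, "docker")

    result_missing = describe(candidate)      # path does not exist yet

    with open(candidate, "w") as handle:      # path exists as a regular file
        handle.write("")
    result_file = describe(candidate)

    os.remove(candidate)
    os.mkdir(candidate)                       # path exists as a directory
    result_dir = describe(candidate)

print(result_missing)
print(result_file)
print(result_dir)
```

If the real check behaves like this model, checking on the affected node whether `/etc/docker` exists at all would be the first thing to verify.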

      Version-Release number of selected component (if applicable):

          4.19.0-0.nightly-multi-2025-03-03-215205

      How reproducible:

      Always

      Steps to Reproduce:

      See Prow jobs:

      https://qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gs/qe-private-deck/logs/periodic-ci-openshift-openshift-tests-private-release-4.19-multi-nightly-gcp-ipi-basecap-none-additionalcaps-arm-f7/1896759586044514304 

      https://qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gs/qe-private-deck/logs/periodic-ci-openshift-openshift-tests-private-release-4.19-multi-nightly-gcp-ipi-basecap-none-additionalcaps-arm-f7/1895542567416631296


      Actual results:

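      The OLMv1 pods (catalogd-controller-manager and operator-controller-controller-manager) stay in Pending because the `etc-docker` hostPath volume cannot be mounted: hostPath type check failed: /etc/docker is not a directory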
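To see which volumes could trigger this failure, one option is to list the hostPath volumes declared in a pod spec. The sketch below assumes the spec has already been loaded into a Python dict (for example from the pod's JSON). The function name `hostpath_volumes` and the example spec are hypothetical: only the volume name `etc-docker` and the path `/etc/docker` come from the kubelet log, while the `Directory` type and the `tmp` volume are assumptions for illustration.

```python
def hostpath_volumes(pod_spec: dict) -> list[tuple[str, str, str]]:
    """Return (volume name, host path, type) for every hostPath volume."""
    found = []
    for volume in pod_spec.get("volumes", []):
        host_path = volume.get("hostPath")
        if host_path is None:
            continue  # not a hostPath volume (emptyDir, secret, ...)
        # An absent 'type' is reported here as an empty string.
        found.append((volume["name"], host_path["path"], host_path.get("type", "")))
    return found


# Hypothetical fragment; the real catalogd pod spec is not shown in this report.
example_spec = {
    "volumes": [
        {"name": "etc-docker", "hostPath": {"path": "/etc/docker", "type": "Directory"}},
        {"name": "tmp", "emptyDir": {}},
    ]
}

for name, path, kind in hostpath_volumes(example_spec):
    print(f"{name}: {path} (type={kind or 'unset'})")
```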

          

      Expected results:

      The OLMv1 pods start and become Ready, and the `olm` cluster operator becomes available.

          

      Additional info:

          

              tshort@redhat.com Todd Short
              rhn-support-jiazha Jian Zhang
              Jian Zhang Jian Zhang