
vsphere-problem-detector - checkDataStoreWithURL fails in both newly installed and freshly upgraded 4.14 clusters


    • Bug
    • Resolution: Done-Errata
    • Major
    • 4.16.0
    • 4.14.z
    • Storage
    • Important
    • No
    • False
    • Previously, in User Provisioned Infrastructure (UPI) clusters or clusters that were upgraded from older versions, `failureDomains` might be missing from Infrastructure objects, which caused certain checks to fail. With this release, a fallback `failureDomains` is synthesized from `cloudConfig` if none are available in `infrastructures.config.openshift.io`. (link:https://issues.redhat.com/browse/OCPBUGS-35446[*OCPBUGS-35446*])
    • Bug Fix
    • In Progress
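
The fallback described in the release note can be pictured roughly as follows. This is a minimal sketch, not the code from the fix: the `FailureDomain` and `CloudConfig` types, their fields, the function name `effectiveFailureDomains`, and the example values are all hypothetical simplifications chosen for illustration. It only shows the idea of using the failure domains from the Infrastructure object when they exist and deriving a single one from the cloud config otherwise.

```go
package main

import "fmt"

// FailureDomain is a hypothetical, simplified stand-in for a vSphere
// failure domain entry in the Infrastructure object.
type FailureDomain struct {
	Name       string
	Server     string
	Datacenter string
	Datastore  string
}

// CloudConfig is a hypothetical, simplified view of the legacy cloud
// provider configuration (the `cloudConfig` named in the release note).
type CloudConfig struct {
	Server           string
	Datacenter       string
	DefaultDatastore string
}

// effectiveFailureDomains returns the failure domains found in the
// Infrastructure object when there are any. Otherwise it synthesizes a
// single failure domain from the cloud config.
func effectiveFailureDomains(fromInfra []FailureDomain, cfg CloudConfig) []FailureDomain {
	if len(fromInfra) > 0 {
		return fromInfra
	}
	return []FailureDomain{{
		Name:       "synthesized-from-cloud-config", // illustrative name only
		Server:     cfg.Server,
		Datacenter: cfg.Datacenter,
		Datastore:  cfg.DefaultDatastore,
	}}
}

func main() {
	// Example values are made up.
	cfg := CloudConfig{Server: "vcenter.example.com", Datacenter: "dc1", DefaultDatastore: "ds1"}

	// A UPI or upgraded cluster: no failure domains in the Infrastructure object.
	fds := effectiveFailureDomains(nil, cfg)
	fmt.Println(len(fds), fds[0].Datacenter)
}
```

With a fallback of this shape, checks that iterate over failure domains always have at least one entry to work with, even on clusters whose Infrastructure object predates the field.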

      This is a clone of issue OCPBUGS-35215. The following is the description of the original issue:

      Description of problem:

      • We're seeing [0] in two customers' environments. One of the two confirmed that the issue reproduces both in a freshly installed 4.14.26 cluster and in an upgraded cluster.
      • Looking at [1] and the changes in the vsphere-problem-detector since 4.13, I see that we introduced some additional vSphere permission checks in the checkDataStoreWithURL() function [2][3]. It was initially suspected that the failure was due to [4], but that fix was backported to 4.14.26, where the customer confirms the issue persists.

      [0]

      $ omc -n openshift-cluster-storage-operator logs vsphere-problem-detector-operator-78cbc7fdbb-2g9mx | grep -i -e datastore.go -e E0508
      2024-05-08T07:44:05.842165300Z I0508 07:44:05.839356       1 datastore.go:329] checking datastore ds:///vmfs/volumes/vsan:526390016b19d2b5-21ae3fd76fa61150/ for permissions
      2024-05-08T07:44:05.842165300Z I0508 07:44:05.839504       1 datastore.go:125] CheckStorageClasses: thin-csi: storage policy openshift-storage-policy-tc01-rpdd7: unable to find datastore with URL ds:///vmfs/volumes/vsan:526390016b19d2b5-21ae3fd76fa61150/
      2024-05-08T07:44:05.842165300Z I0508 07:44:05.839522       1 datastore.go:142] CheckStorageClasses checked 7 storage classes, 1 problems found
      2024-05-08T07:44:05.848251057Z E0508 07:44:05.848212       1 operator.go:204] failed to run checks: StorageClass thin-csi: storage policy openshift-storage-policy-tc01-rpdd7: unable to find datastore with URL ds:///vmfs/volumes/vsan:526390016b19d2b5-21ae3fd76fa61150/
      [...]
      

      [1] https://github.com/openshift/vsphere-problem-detector/compare/release-4.13...release-4.14
      [2] https://github.com/openshift/vsphere-problem-detector/blame/release-4.14/pkg/check/datastore.go#L328-L344
      [3] https://github.com/openshift/vsphere-problem-detector/pull/119
      [4] https://issues.redhat.com/browse/OCPBUGS-28879
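
One way an error like the one in [0] can arise is when the lookup by datastore URL only searches the datacenters the detector currently knows about. The following is a minimal sketch under that assumption, not the code in datastore.go: the `Datastore` type, the `findDatastoreByURL` function, the in-memory `inventory` map, and the names `dc1` and `vsanDatastore` are all hypothetical. Only the datastore URL and the error wording are taken from the log above.

```go
package main

import "fmt"

// Datastore is a hypothetical, simplified datastore record.
type Datastore struct {
	Name string
	URL  string
}

// findDatastoreByURL searches only the listed datacenters and returns the
// datastore whose URL matches dsURL exactly. The inventory map (datacenter
// name to datastores) is an in-memory stand-in for the vCenter inventory.
func findDatastoreByURL(inventory map[string][]Datastore, datacenters []string, dsURL string) (Datastore, error) {
	for _, dc := range datacenters {
		for _, ds := range inventory[dc] {
			if ds.URL == dsURL {
				return ds, nil
			}
		}
	}
	return Datastore{}, fmt.Errorf("unable to find datastore with URL %s", dsURL)
}

func main() {
	const dsURL = "ds:///vmfs/volumes/vsan:526390016b19d2b5-21ae3fd76fa61150/"
	inventory := map[string][]Datastore{
		"dc1": {{Name: "vsanDatastore", URL: dsURL}},
	}

	// No datacenters known (for example, no failure domains): the lookup
	// fails even though the datastore exists in the inventory.
	_, err := findDatastoreByURL(inventory, nil, dsURL)
	fmt.Println(err)

	// With the datacenter known, the same datastore is found.
	ds, err := findDatastoreByURL(inventory, []string{"dc1"}, dsURL)
	fmt.Println(ds.Name, err)
}
```

In this sketch the datastore is present in both calls; only the list of datacenters to search differs, which is why an empty list produces a "not found" error for a datastore that exists.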

              Hemant Kumar (hekumar@redhat.com)
              OpenShift Prow Bot (openshift-crt-jira-prow)
              Wei Duan