Data Foundation Bugs / DFBUGS-224

[2314998] [ODF on ROSA HCP] MDSCacheUsageHigh not found with active node drained


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • odf-4.19
    • odf-4.17
    • ocs-operator

      Description of problem (please be as detailed as possible and provide log
      snippets):
      While running test_mds_cache_alert_with_active_node_drain, we ran metadata I/O against CephFS using the following steps:
      1. Create a PVC on CephFS with access mode RWX
      2. Create a DC pod with a Fedora image
      3. Copy helper_scripts/meta_data_io.py to the Fedora DC pod
      4. Run meta_data_io.py on the Fedora pod
      The script is available at https://github.com/red-hat-storage/ocs-ci/blob/e4bcbb284280862d03b7f6b5ab2b40e2727482f3/ocs_ci/templates/workloads/helper_scripts/meta_data_io.py

      This script triggers high cache usage when the standby-replay MDS is scaled down, but not when the active node is drained, which suggests the problem is related to disruption of the active MDS node.
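
For readers unfamiliar with this kind of workload, here is a minimal sketch of a metadata-heavy I/O loop. It is not the contents of meta_data_io.py; the function name `run_metadata_io` and its parameters are hypothetical. It only illustrates the general idea of issuing many create, stat, rename, and unlink operations on small files, which is the sort of activity that exercises a CephFS MDS cache.

```python
import os
import tempfile


def run_metadata_io(root: str, n_dirs: int = 4, files_per_dir: int = 25) -> int:
    """Issue many small metadata operations under `root`.

    Hypothetical illustration only; not the ocs-ci helper script.
    Returns the number of metadata operations issued.
    """
    ops = 0
    for d in range(n_dirs):
        dir_path = os.path.join(root, f"dir_{d}")
        os.makedirs(dir_path, exist_ok=True)  # mkdir
        ops += 1
        for f in range(files_per_dir):
            path = os.path.join(dir_path, f"file_{f}")
            with open(path, "w") as fh:  # create
                fh.write("x")
            os.stat(path)  # stat
            renamed = path + ".renamed"
            os.rename(path, renamed)  # rename
            os.remove(renamed)  # unlink
            ops += 4
        os.rmdir(dir_path)  # rmdir
        ops += 1
    return ops


if __name__ == "__main__":
    # On a real cluster, `root` would be the CephFS mount path inside the pod.
    with tempfile.TemporaryDirectory() as tmp:
        print(run_metadata_io(tmp, n_dirs=2, files_per_dir=3))
```

In the test scenario the target directory would be the RWX CephFS volume mounted in the Fedora pod; the temporary directory here is only a stand-in so the sketch runs anywhere.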

      Version of all relevant components (if applicable):
      OC version:
      Client Version: 4.16.11
      Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
      Server Version: 4.16.12
      Kubernetes Version: v1.29.8+f10c92d

      OCS version:
      ocs-operator.v4.16.2-rhodf OpenShift Container Storage 4.16.2-rhodf ocs-operator.v4.16.1-rhodf Succeeded

      ODF operator full version: 4.16.2-4

      Cluster version:
      NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
      version 4.16.12 True False 12h Error while reconciling 4.16.12: the cluster operator insights is not available

      Does this issue impact your ability to continue to work with the product
      (please explain in detail what is the user impact)?
      potentially

      Is there any workaround available to the best of your knowledge?
      no

      Rate from 1 - 5 the complexity of the scenario you performed that caused this
      bug (1 - very simple, 5 - very complex)?
      3

      Is this issue reproducible?
      1/1

      Can this issue reproduce from the UI?
      no

      If this is a regression, please provide more details to justify this:
      new deployment. Tech preview

      Steps to Reproduce:
      1. Deploy a ROSA HCP cluster with ODF and run test_mds_cache_alert_with_active_node_drain

      Actual results:
      The MDSCacheUsageHigh alert was not found

      Expected results:
      MDSCacheUsageHigh is fired when its conditions are met
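
As a rough illustration of the expected-versus-actual check, the sketch below looks for a firing alert by name in a response shaped like the Prometheus HTTP API `/api/v1/alerts` payload. How the ocs-ci test actually queries alerts is not shown in this report, so the helper name `alert_is_firing`, the sample payload, and the `SomeOtherAlert` entry are assumptions made for illustration.

```python
import json
from typing import Any, Dict


def alert_is_firing(payload: Dict[str, Any], name: str) -> bool:
    """Return True if an alert called `name` is in the 'firing' state.

    Assumes the payload follows the Prometheus /api/v1/alerts response shape:
    {"status": ..., "data": {"alerts": [{"labels": {...}, "state": ...}]}}
    """
    alerts = payload.get("data", {}).get("alerts", [])
    return any(
        a.get("labels", {}).get("alertname") == name and a.get("state") == "firing"
        for a in alerts
    )


# Hypothetical sample response, not captured from the cluster in this report.
SAMPLE_RESPONSE = json.loads("""
{
  "status": "success",
  "data": {
    "alerts": [
      {"labels": {"alertname": "MDSCacheUsageHigh"}, "state": "firing"},
      {"labels": {"alertname": "SomeOtherAlert"}, "state": "pending"}
    ]
  }
}
""")

if __name__ == "__main__":
    print(alert_is_firing(SAMPLE_RESPONSE, "MDSCacheUsageHigh"))
```

In the failing run described here, the equivalent check would have returned False for MDSCacheUsageHigh after the active node was drained.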

      Additional info:
      A cluster to capture the necessary data will be created upon request to QE

              kmajumder@redhat.com Kaustav Majumder
              rh-ee-dosypenk Daniel Osypenko