Data Foundation Bugs / DFBUGS-426

[2265563] Increasing MDS memory is erasing CPU values when pods are in CLBO state.

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • odf-4.18
    • odf-4.15
    • ocs-operator
    • None
    • False
    • None
    • False
    • ?
    • ?
    • .Increasing MDS memory is erasing CPU values when pods are in CLBO state

      When the metadata server (MDS) memory is increased while the MDS pods are in a CrashLoopBackOff (CLBO) state, the CPU request or limit for the MDS pods is removed. As a result, the CPU request or limit that was set for the MDS changes.

      Workaround: Run the `oc patch` command to set the CPU requests and limits again.
      For example:
      ----
      $ oc patch -n openshift-storage storagecluster ocs-storagecluster \
          --type merge \
          --patch '{"spec": {"resources": {"mds": {"limits": {"cpu": "3"},
          "requests": {"cpu": "3"}}}}}'
      ----
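The workaround can be wrapped in a small helper so that the same CPU value is applied to both limits and requests and then read back. This is only a sketch: the function names `build_mds_cpu_patch` and `apply_mds_cpu_patch` are hypothetical, and the read-back step assumes the values live under `.spec.resources.mds` of the StorageCluster, as the patch above implies.

```shell
# Hypothetical helper: build the merge patch from the workaround above
# for one CPU value, used for both limits and requests.
build_mds_cpu_patch() {
  cpu="$1"
  printf '{"spec": {"resources": {"mds": {"limits": {"cpu": "%s"}, "requests": {"cpu": "%s"}}}}}' "$cpu" "$cpu"
}

# Hypothetical helper: apply the patch, then print the MDS resources
# stored in the StorageCluster (requires a logged-in oc session).
apply_mds_cpu_patch() {
  oc patch -n openshift-storage storagecluster ocs-storagecluster \
      --type merge \
      --patch "$(build_mds_cpu_patch "$1")"
  oc get -n openshift-storage storagecluster ocs-storagecluster \
      -o jsonpath='{.spec.resources.mds}{"\n"}'
}
```

For example, `apply_mds_cpu_patch 3` would reapply the values used in the workaround and show what the StorageCluster holds afterwards.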
    • Known Issue
    • None

      Description of problem (please be as detailed as possible and provide log
      snippets):

      No MDSCPUHighUsage alert is triggered after upgrading the cluster from 4.14 to 4.15.

      Version of all relevant components (if applicable):

      odf: 4.15.0-147
      ocp: 4.15.0-rc.8

      Does this issue impact your ability to continue to work with the product
      (please explain in detail what is the user impact)?

      Yes

      Is there any workaround available to the best of your knowledge?

      Rate from 1 - 5 the complexity of the scenario you performed that caused this
      bug (1 - very simple, 5 - very complex)?
      1

      Is this issue reproducible?
      Yes

      Can this issue be reproduced from the UI?

      If this is a regression, please provide more details to justify this:

      Steps to Reproduce:
      1. Deploy a cluster with 4.14, run file creation IO, and upgrade to 4.15.
      2. Run file creation IO to drive MDS CPU utilisation to 67% and keep the same load running for at least 6 hours (the duration can be shortened in the Prometheus rules YAML to test more quickly).
      3. Verify whether the alert is generated once the condition is met.
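Before waiting for the load to run its course, it can help to confirm that the alert rule is still defined after the upgrade. The following is a minimal sketch, not a confirmed procedure: `show_alert_rule` is a hypothetical helper, and it assumes the rule is delivered as a PrometheusRule object in the `openshift-storage` namespace.

```shell
# Hypothetical helper: read PrometheusRule YAML on stdin and print the
# matching "alert:" line plus the 12 lines that follow it.
# Returns non-zero if no alert with that name is defined.
show_alert_rule() {
  grep -A 12 "alert: $1"
}

# On a cluster (requires a logged-in oc session):
#   oc get prometheusrules -n openshift-storage -o yaml | show_alert_rule MDSCPUHighUsage
```

If the command prints nothing and returns non-zero, the rule itself is missing, which is a different failure from a rule that exists but never fires.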

      Actual results:

      No alert seen for MDSCPUHighUsage after upgrade

      Expected results:
      The alert should be triggered when the CPU utilisation condition is met after upgrading to 4.15.

      Additional info:

              sapillai Santosh Pillai
              rhn-support-nagreddy Nagendra Reddy
              Votes: 0
              Watchers: 9