Type: Bug
Priority: Major
Resolution: Done-Errata
Version: odf-4.15
Fixed in build: 4.18.0-107
Doc Type: Removed Functionality
Description of problem (please be as detailed as possible and provide log
snippets):
The customer is experiencing the MDSCacheUsageHigh alert firing. They have applied the fix for this [1], but the alert is still firing. Furthermore, the memory consumption of the MDS pods is currently only at ~25% of their requests/limits.
[root@bastionocpcrystal ~]# oc rsh -n openshift-storage $(oc get pods -n openshift-storage -o name -l app=rook-ceph-operator)
sh-5.1$ export CEPH_ARGS='-c /var/lib/rook/openshift-storage/openshift-storage.config'
sh-5.1$  ceph config dump | grep mds_cache_memory_limit
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_cache_memory_limit                 8589934592
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_cache_memory_limit                 8589934592
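For reference, the actual cache usage can be compared against that 8 GiB limit from the same shell (same CEPH_ARGS as above); this is a sketch, with the filesystem and daemon names taken from the dump above:
# Per-MDS memory/cache summary for the filesystem:
ceph fs status ocs-storagecluster-cephfilesystem
# How much of mds_cache_memory_limit each daemon is actually using:
ceph tell mds.ocs-storagecluster-cephfilesystem-a cache status
ceph tell mds.ocs-storagecluster-cephfilesystem-b cache status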
For node msplatform-x9ggd-storage-tnhvb:
  Namespace                               Name                                                               CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                               ----                                                               ------------  ----------  ---------------  -------------  ---
  openshift-storage                       rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-75b6d77bggzfx    2 (12%)       2 (12%)     16Gi (25%)       16Gi (25%)     51m
For node msplatform-x9ggd-storage-vh2hj:
  Namespace                               Name                                                               CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                               ----                                                               ------------  ----------  ---------------  -------------  ---
  openshift-storage                       rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-7d8c86bdzcrk2    2 (12%)       2 (12%)     16Gi (25%)       16Gi (25%)     52m
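The ~25% figures above are the pod requests/limits relative to node capacity; live pod memory usage can be cross-checked from the metrics API. A minimal sketch, assuming the usual app=rook-ceph-mds label Rook puts on MDS pods:
# Live memory usage of the MDS pods, to compare against the 16Gi limit and the 8 GiB cache limit:
oc adm top pod -n openshift-storage -l app=rook-ceph-mds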
Version of all relevant components (if applicable):
ODF 4.15.6
Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
No, but it is annoying, given that it appears to be a faulty alert that keeps firing.
Is there any workaround available to the best of your knowledge?
Not to my knowledge
Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
3
Is this issue reproducible?
Unknown
Can this issue be reproduced from the UI?
No
If this is a regression, please provide more details to justify this:
Unknown
Steps to Reproduce:
1.
2.
3.
Actual results:
MDSCacheUsageHigh alert firing
Expected results:
MDSCacheUsageHigh does not fire
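To see exactly which expression the firing alert evaluates, the deployed rule can be dumped from the cluster. This is a sketch assuming the standard ODF setup, where the Ceph alert rules ship as a PrometheusRule object in the openshift-storage namespace:
# Show the MDSCacheUsageHigh rule, including its expr and threshold:
oc get prometheusrules -n openshift-storage -o yaml | grep -B2 -A10 MDSCacheUsageHigh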
Additional info:
- It should also be noted that the "mds_cache_memory_limit" value for both MDS daemons did not increase to half of the MDS pods' memory limit (16Gi), as it should have. I had to set "mds_cache_memory_limit" to "8589934592" manually using the rook-ceph-tools pod (a sketch of the commands follows at the end of this section). This still did not resolve the misfiring alert.
 
- Ceph is HEALTH_OK:
 
  cluster:
    id:     05c475dc-e78e-4f2b-94c1-d97e7c6859fa
    health: HEALTH_OK
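For context, the manual override mentioned under Additional info was applied along these lines; this is a sketch of the approach from the rook-ceph-tools pod, not a verbatim transcript:
# From the toolbox pod (must already be deployed):
oc rsh -n openshift-storage deploy/rook-ceph-tools
# Inside the toolbox shell, raise the cache limit for both MDS daemons to 8 GiB:
ceph config set mds.ocs-storagecluster-cephfilesystem-a mds_cache_memory_limit 8589934592
ceph config set mds.ocs-storagecluster-cephfilesystem-b mds_cache_memory_limit 8589934592
# Verify; both entries should show 8589934592, as in the dump in the description:
ceph config dump | grep mds_cache_memory_limit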
- is cloned by
  - DFBUGS-1033 [Backport to 4.17][2313424][GSS] MDSCacheUsageHigh alert firing (Closed)
  - DFBUGS-1034 [Backport to 4.15][2313424][GSS] MDSCacheUsageHigh alert firing (Closed)
- external trackers / links to
  - RHBA-2024:138027: Red Hat OpenShift Data Foundation 4.18 security, enhancement & bug fix update