-
Bug
-
Resolution: Unresolved
-
Critical
-
odf-4.12
-
None
Description of problem (please be as detailed as possible and provide log
snippets):
Storage nodes run out of CPU/memory capacity for the Ceph OSDs when the `infra` node role is applied to them. With that role in place, infra-related components are scheduled onto these nodes and take up additional CPU/memory resources.
Version of all relevant components (if applicable):
4.12+
Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
The customer applies the `infra` node role to the storage worker nodes, following best practices, so that these nodes do not count towards their OCP subscription/entitlement per [1].
Is there any workaround available to the best of your knowledge?
None
Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
2
Is this issue reproducible?
Always
Can this issue reproduce from the UI?
Unsure
If this is a regression, please provide more details to justify this:
Steps to Reproduce:
1. Configure ODF with a storage system and projects that use the Ceph LUNs.
2. Set up node labels so that the storage nodes carry the `infra` role.
3. Configure the default router pods to run only on `infra` nodes, which migrates the router pod workloads to the storage nodes.
4. Increase the traffic going to the router pods. They then consume more CPU/memory, taking it away from the pods that run the ODF storage components (the Ceph OSDs, for example) and causing ODF instability.
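For reference, steps 2 and 3 can be sketched as the `oc` commands below, generated from Python so the patch payload stays valid JSON. This is a minimal sketch and not the exact commands the customer ran: the node names `storage-0` to `storage-2` are hypothetical, and it assumes the standard `node-role.kubernetes.io/infra` label and the default IngressController in `openshift-ingress-operator`. The script only prints the commands; it does not execute them.

```python
import json
import shlex

# Standard infra role label; an empty value is the usual convention.
INFRA_LABEL = "node-role.kubernetes.io/infra"


def label_cmd(node):
    """Step 2: command that applies the infra role label to one node."""
    return ["oc", "label", "node", node, f"{INFRA_LABEL}="]


def router_patch_cmd():
    """Step 3: command that pins the default router pods to infra nodes."""
    patch = {
        "spec": {
            "nodePlacement": {
                "nodeSelector": {"matchLabels": {INFRA_LABEL: ""}}
            }
        }
    }
    return [
        "oc", "patch", "ingresscontroller/default",
        "-n", "openshift-ingress-operator",
        "--type=merge", "-p", json.dumps(patch),
    ]


def build_commands(storage_nodes):
    """Return the label commands followed by the router patch command."""
    return [label_cmd(n) for n in storage_nodes] + [router_patch_cmd()]


if __name__ == "__main__":
    # Hypothetical storage node names, for illustration only.
    for cmd in build_commands(["storage-0", "storage-1", "storage-2"]):
        print(shlex.join(cmd))
```

Once both steps are applied, the router pods are eligible for the same nodes as the Ceph OSDs, which is the precondition for the contention in step 4.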
Actual results:
Ceph becomes unstable, dropping OSDs since it does not have enough CPU/memory to
Expected results:
The documentation should include instructions on how to prevent additional `infra` workloads from running on the storage nodes when this node role is applied to them.
Additional info:
A quick workaround would be to remove the `infra` node role from these storage nodes so that no `infra`-related workload runs on them. Would this be a supported and documented option, as an alternative to having the customer set up taints/tolerations across all `infra`-related components?
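To illustrate the taints/tolerations alternative, the sketch below models how a `NoSchedule` taint on a storage node would keep the router pods off it while still admitting the ODF pods. It is a simplified model of Kubernetes toleration matching, not the scheduler's implementation. The taint `node.ocs.openshift.io/storage=true:NoSchedule` is assumed here as the dedicated-storage taint, and the pod toleration lists are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key with Exists matches every taint key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect matches every effect


def tolerates(tol, taint):
    """Return True if a single toleration matches a single taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    # "Equal" needs a key and an identical value.
    return bool(tol.key) and tol.value == taint.value


def can_schedule(tolerations, taints):
    """A pod fits only if every hard taint on the node is tolerated."""
    hard = [t for t in taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in hard)


# Assumed taint on the storage nodes (see the lead-in above).
STORAGE_TAINT = Taint("node.ocs.openshift.io/storage", "true", "NoSchedule")
# Hypothetical toleration carried by the ODF pods such as the Ceph OSDs.
STORAGE_TOLERATION = Toleration(
    key="node.ocs.openshift.io/storage",
    operator="Equal",
    value="true",
    effect="NoSchedule",
)

if __name__ == "__main__":
    print("router pod fits:", can_schedule([], [STORAGE_TAINT]))
    print("OSD pod fits:", can_schedule([STORAGE_TOLERATION], [STORAGE_TAINT]))
```

In this model the node can keep the `infra` role for entitlement purposes, because the taint rather than the label decides which pods land on it. Whether that combination is supported is the open question in this report.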