Type: Feature
Resolution: Unresolved
Priority: Normal
Proposed title of this feature request
Ability to apply resource limits on LokiStack deployment sizes
What is the nature and description of the request?
In specific environments (mostly multi-tenant), it is required to apply resource limits to all possible workloads. Given that OpenShift Container Platform 4 - Cluster Logging with LokiStack provides tailored deployment sizes, it is expected that those sizes also apply resource limits, or at least offer an option to apply them if desired.
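As a purely illustrative sketch of what such an option could look like: the per-component resources field shown below under spec.template is hypothetical and does not exist in the current LokiStack API, and the object names, storage secret and storage class are placeholders. It only shows the kind of knob this RFE asks for, layered on top of the existing deployment sizes.

apiVersion: loki.grafana.com/v1
kind: LokiStack
metadata:
  name: logging-loki
  namespace: openshift-logging
spec:
  size: 1x.small                      # existing tailored deployment size
  storage:
    secret:
      name: logging-loki-s3           # placeholder secret name
      type: s3
  storageClassName: gp3-csi           # placeholder storage class
  template:
    ingester:
      # Hypothetical field for this RFE: cap the component so its usage
      # cannot grow far beyond what the deployment size already requests.
      resources:
        limits:
          memory: 30Gi

Whether such limits would be exposed per component, derived automatically from the chosen size, or toggled by a single flag is secondary; the sketch only illustrates the intent.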
Why does the customer need this? (List the business requirements)
While the reasoning for not applying resource limits by default is understood, there is still a desire to have an option to apply limits to OpenShift Container Platform 4 - Cluster Logging resources. How this is done, or whether it is bound to autoscaling capabilities or similar, does not matter. The key point is that resources can be limited, to comply with recommended practices and with the workload constraints of specific environments (multi-tenant clusters or clusters where resources are limited).
List any affected packages or components.
OpenShift Container Platform 4 - Cluster Logging (LokiStack)
In addition to the previous points, I would like to describe two use cases for this RFE:
1) Log consumption is not constant: log volume can spike, and the cluster can run into performance issues shipping logs to the object storage. When that happens, the resource usage of the ingester pods grows, which can exhaust the node's memory and consequently cause several problems on the node. There are support cases with this issue:
message: 'The node was low on resource: memory. Container loki-ingester was using 122378992Ki, which exceeds its request of 30Gi. '
The only known workaround is restarting the pod or cleaning the WAL (write-ahead log) of the ingesters (see the illustrative resources snippet below).
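For context, the eviction message above suggests the ingester container runs with a memory request but no memory limit, so actual usage (roughly 117Gi here) can grow far beyond the 30Gi request until the node itself runs out of memory. A rough sketch of the relevant part of the container spec, reconstructed only from that message:

resources:
  requests:
    memory: 30Gi
  # No limits stanza today; this RFE asks for a supported way to add one,
  # e.g. limits.memory bounding the container at or near its request.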
2) Another use case is a compact 3-node cluster, where each node acts as both master and worker. Loki pods are therefore scheduled on the master nodes, and if the previous issue occurs, the cluster can end up with failing master nodes.
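On such a compact cluster there is no separate worker or infra pool to move the pods to, so resource limits are effectively the only safeguard. On larger clusters, the existing per-component nodeSelector and tolerations fields of the LokiStack template spec can at least keep the ingesters off the control plane. A sketch, assuming dedicated infra nodes labelled and tainted with node-role.kubernetes.io/infra (only the relevant part of the spec is shown):

apiVersion: loki.grafana.com/v1
kind: LokiStack
metadata:
  name: logging-loki
  namespace: openshift-logging
spec:
  template:
    ingester:
      # Schedule ingesters onto infra nodes instead of control-plane nodes.
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        operator: Exists
        effect: NoSchedule

This does not replace resource limits; it only reduces the blast radius where separate nodes exist, which a 3-node cluster does not have.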