Type: Bug
Resolution: Won't Do
Priority: Normal
Fix Version/s: None
Affects Version/s: None
Component/s: Log Collection
Labels:
- devel_ack+
- service-delivery-prio-asks

Blocked:
False
Blocked Reason:
None
Ready:
False
Docs QE Status:
NEW
QE Status:
NEW
Risk Impact Level:
High

Sprint:
SDA - Sprint 219
Risk Probability:
High (80-100%) - [We are confident it will become an issue]

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

When the rosa CLI is used to install the cluster-logging-operator as an add-on to the cluster, the resulting ClusterLogging instance does not have resource limits for CPU and memory set.

$ rosa install addon --cluster bmeng-sts-m1 cluster-logging-operator -i
? Are you sure you want to install add-on 'cluster-logging-operator' on cluster 'bmeng-sts-m1'? Yes
? Use AWS CloudWatch: Yes
? Collect Applications logs: Yes
? Collect Infrastructure logs: Yes
? Collect Audit logs (optional): No
? CloudWatch region (optional): 
I: Add-on 'cluster-logging-operator' is now installing. To check the status run 'rosa list addons -c bmeng-sts-m1'

$ oc get clusterlogging -n openshift-logging instance -o json | jq .spec
{
  "collection": {
    "logs": {
      "fluentd": {
        "resources": {
          "requests": {
            "memory": "736Mi"
          }
        }
      },
      "type": "fluentd"
    }
  },
  "forwarder": {
    "fluentd": {}
  },
  "managementState": "Managed"
}

As a result, some of the customer's master nodes are overwhelmed by the high memory usage of the logging components.

For example:

$ oc adm top po -n openshift-logging 
NAME                                        CPU(cores)   MEMORY(bytes)   
cluster-logging-operator-57b6d6747f-l5tr9   0m           53Mi            
collector-9tf85                             30m          929Mi           
collector-kfbkh                             195m         51424Mi         
collector-lg57f                             119m         49155Mi         
collector-lxr8w                             37m          886Mi           
collector-twxhw                             54m          957Mi           
collector-vjqn8                             595m         32316Mi
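
To make outliers like the ones above easier to spot, the output can be filtered by memory usage. The following is a minimal sketch, not part of the original report: `flag_heavy_pods` is a hypothetical helper name, the 1024 threshold is an arbitrary illustrative value, and the script assumes the MEMORY column is reported in Mi as in the output above.

```shell
# Hypothetical helper: print pods whose memory usage exceeds a threshold (in Mi).
# Reads `oc adm top po` output on stdin; assumes the third column looks like "929Mi".
flag_heavy_pods() {
  threshold_mi="${1:-1024}"   # default threshold is an assumption, not a recommended value
  awk -v limit="$threshold_mi" '
    NR > 1 && $3 ~ /Mi$/ {      # skip the header row, keep rows reported in Mi
      mem = $3
      sub(/Mi$/, "", mem)       # strip the unit so the value compares numerically
      if (mem + 0 > limit + 0) print $1, $3
    }'
}

# Usage against a live cluster:
#   oc adm top po -n openshift-logging | flag_heavy_pods 1024
```

Run against the output above with a threshold of 1024, this would list collector-kfbkh, collector-lg57f, and collector-vjqn8.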

Notes 2022-05-16:

This issue has been moved from the SDA project to the LOG project so that the logging team can work on it, since the fix needs to come from them according to the discussion in this Jira.

Acceptance criteria:

CLO sets default resource limits for the collector pods, along with an option to adjust these limits as a parameter of the logging add-on for managed OpenShift.
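
As an illustration of what setting limits could look like, here is a minimal sketch that builds a merge patch for the fluentd collector. It is not the fix requested from the logging team: `collector_limits_patch` is a hypothetical helper, the limit values in the usage line are placeholders, and the field path simply mirrors the `spec.collection.logs.fluentd.resources` path shown in the description. Whether a manual patch persists on a cluster where the add-on manages the ClusterLogging instance has not been verified here.

```shell
# Hypothetical helper: build a merge patch that adds limits to the fluentd collector.
# The values passed in are illustrative only; nothing in this issue prescribes them.
collector_limits_patch() {
  memory_limit="$1"
  cpu_limit="$2"
  printf '{"spec":{"collection":{"logs":{"fluentd":{"resources":{"limits":{"memory":"%s","cpu":"%s"}}}}}}}\n' \
    "$memory_limit" "$cpu_limit"
}

# Usage against a live cluster (assumption: the add-on does not revert the change):
#   oc patch clusterlogging instance -n openshift-logging --type merge \
#     -p "$(collector_limits_patch 1Gi 500m)"
```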

is caused by

LOG-2635 CloudWatch forwarding rejecting large log events, fills tmpfs

Closed

Assignee: Unassigned

Reporter: Bo Meng

Votes: 0

Watchers: 14

Created: 2022/05/01 3:37 AM

Updated: 2022/07/27 9:23 PM

Resolved: 2022/07/27 9:23 PM
