[OCPBUGS-29919] Logs of runtimecfg node-ip detection too verbose - Red Hat Issue Tracker

Type: Bug
Resolution: Done-Errata
Priority: Normal
Fix Version/s: 4.16.0
Affects Version/s: 4.14
Component/s: Networking / On-Prem Host Networking
Labels:
- pre-merge

Severity: Moderate
Regression: No
Blocked: False
Target Version: 4.16.0
Target Backport Versions: 4.14.z, 4.15.z

==== This Jira covers only the baremetal-runtimecfg component with respect to node IP detection ====

Description of problem:

Pods running in the `openshift-vsphere-infra` namespace are too verbose: they print at INFO level messages that should be DEBUG.

This excess verbosity affects CRI-O, the node, and the logging system.

For instance, in a cluster with 71 nodes, this namespace produced 450,000,000 log entries in one month. That means 1 TB of logs written to disk on the nodes by CRI-O, read by the Red Hat log collector, and stored in the Log Store.

Besides the performance impact, this has a financial impact because of the storage needed.

Examples of log messages that are better suited to DEBUG than INFO:
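To put the reported volume in perspective, the figures above can be turned into rates. This is a back-of-the-envelope sketch only: it assumes a 30-day month and a decimal terabyte (10^12 bytes), and it takes the 1 TB figure at face value. The function name `volume_summary` is made up for this illustration.

```python
# Figures quoted in this report.
ENTRIES_PER_MONTH = 450_000_000
NODES = 71

# Assumptions (not stated in the report).
TOTAL_BYTES = 10**12                   # 1 TB, assumed decimal
SECONDS_PER_MONTH = 30 * 24 * 3600     # assumed 30-day month


def volume_summary(entries=ENTRIES_PER_MONTH, nodes=NODES,
                   total_bytes=TOTAL_BYTES, seconds=SECONDS_PER_MONTH):
    """Derive average rates from the monthly totals."""
    days = seconds / 86400
    return {
        # Average entries per second across the whole cluster.
        "entries_per_second_cluster": entries / seconds,
        # Average entries per day attributed to each node.
        "entries_per_day_per_node": entries / nodes / days,
        # Average bytes accounted for per entry.
        "avg_bytes_per_entry": total_bytes / entries,
    }


if __name__ == "__main__":
    for name, value in volume_summary().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions the namespace averages well over a hundred entries per second cluster-wide, around the clock.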
```
/// For keepalived pods, 4 messages are printed per node every 10 seconds. With 71 nodes in this example, that is 284 log entries per cycle (all within the same second), or 1704 log entries per minute per keepalived pod
$ oc logs keepalived-master.example-0 -c  keepalived-monitor |grep master.example-0|grep 2024-02-15T08:20:21 |wc -l

$ oc logs keepalived-master-example-0 -c  keepalived-monitor |grep worker-example-0|grep 2024-02-15T08:20:21 
2024-02-15T08:20:21.671390814Z time="2024-02-15T08:20:21Z" level=info msg="Searching for Node IP of worker-example-0. Using 'x.x.x.x/24' as machine network. Filtering out VIPs '[x.x.x.x x.x.x.x]'."
2024-02-15T08:20:21.671390814Z time="2024-02-15T08:20:21Z" level=info msg="For node worker-example-0 selected peer address x.x.x.x using NodeInternalIP"
2024-02-15T08:20:21.733399279Z time="2024-02-15T08:20:21Z" level=info msg="Searching for Node IP of worker-example-0. Using 'x.x.x.x' as machine network. Filtering out VIPs '[x.x.x.x x.x.x.x]'."
2024-02-15T08:20:21.733421398Z time="2024-02-15T08:20:21Z" level=info msg="For node worker-example-0 selected peer address x.x.x.x using NodeInternalIP"

/// For haproxy, 2 log entries are printed every 6 seconds for each master, which means 6 messages in the same second and 60 messages per minute per pod
$ oc logs haproxy-master-0-example -c haproxy-monitor
...
2024-02-15T08:20:00.517159455Z time="2024-02-15T08:20:00Z" level=info msg="Searching for Node IP of master-example-0. Using 'x.x.x.x/24' as machine network. Filtering out VIPs '[x.x.x.x]'."
2024-02-15T08:20:00.517159455Z time="2024-02-15T08:20:00Z" level=info msg="For node master-example-0 selected peer address x.x.x.x using NodeInternalIP"
```

Version-Release number of selected component (if applicable):

OpenShift 4.14
vSphere IPI installation

How reproducible:

Always

Steps to Reproduce:

    1. Install an OpenShift 4.14 vSphere IPI environment
    2. Review the logs of the haproxy pods and keepalived pods running in the namespace `openshift-vsphere-infra`
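When reviewing the logs in step 2, a small helper can make the volume easier to see than `grep | wc -l`. The sketch below counts entries per second and per level. It assumes the `time="…" level=… msg="…"` layout seen in the sample lines of this report; the names `LINE_RE`, `tally`, and `SAMPLE` are made up for this illustration.

```python
import re
from collections import Counter

# Matches the time/level/msg fields as they appear in the sample lines
# of this report. Other components may log in a different layout.
LINE_RE = re.compile(
    r'time="(?P<time>[^"]+)"\s+level=(?P<level>\w+)\s+msg="(?P<msg>.*)"'
)

# Two lines copied from the description above.
SAMPLE = '''\
2024-02-15T08:20:21.671390814Z time="2024-02-15T08:20:21Z" level=info msg="Searching for Node IP of worker-example-0. Using 'x.x.x.x/24' as machine network. Filtering out VIPs '[x.x.x.x x.x.x.x]'."
2024-02-15T08:20:21.671390814Z time="2024-02-15T08:20:21Z" level=info msg="For node worker-example-0 selected peer address x.x.x.x using NodeInternalIP"
'''


def tally(lines):
    """Count log entries per (timestamp-to-the-second, level)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # lines in another layout are skipped
            counts[(match.group("time"), match.group("level"))] += 1
    return counts


if __name__ == "__main__":
    for (second, level), n in sorted(tally(SAMPLE.splitlines()).items()):
        print(second, level, n)
```

To run it against a live pod, the output of `oc logs <pod> -c <container>` can be saved to a file and its lines passed to `tally` in place of `SAMPLE`.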

Actual results:

The haproxy-* and keepalived-* pods are too verbose, printing at INFO level messages that should be DEBUG.

Some of these messages are shown under "Description of problem" above.

Expected results:

Only relevant messages are printed at INFO level, reducing the verbosity of the pods running in the namespace `openshift-vsphere-infra`.
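The expected behavior can be illustrated with a generic level-based logger. This is not the baremetal-runtimecfg code and not the actual fix; it is a minimal sketch using Python's standard `logging` module, with a hypothetical logger name and hypothetical helper functions, showing how per-poll detail emitted at DEBUG disappears when the logger runs at INFO.

```python
import io
import logging


def make_logger(level, stream):
    """Build a logger writing to `stream` at the given level."""
    logger = logging.getLogger("node-ip-sketch")  # hypothetical name
    logger.handlers.clear()       # avoid duplicate handlers on re-creation
    logger.propagate = False      # keep output confined to `stream`
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter('level=%(levelname)s msg="%(message)s"')
    )
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger


def detect_node_ip(logger, node, address):
    """Stand-in for a periodic detection step (hypothetical)."""
    # Per-poll detail goes to DEBUG, so it is hidden at INFO.
    logger.debug("Searching for Node IP of %s", node)
    logger.debug("For node %s selected peer address %s", node, address)
    return address


if __name__ == "__main__":
    for level in (logging.INFO, logging.DEBUG):
        buf = io.StringIO()
        detect_node_ip(make_logger(level, buf), "worker-example-0", "x.x.x.x")
        lines = buf.getvalue().splitlines()
        print(logging.getLevelName(level), "->", len(lines), "lines")
```

With this arrangement the detailed messages remain available for troubleshooting by raising verbosity, without being written on every poll in normal operation.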

Additional info:

Issue Links:

blocks
- OCPBUGS-32024 [4.15] Logs of runtimecfg node-ip detection too verbose (Closed)

causes
- OCPBUGS-32348 No ability to debug node-ip detection logic (Closed)

is cloned by
- OCPBUGS-32024 [4.15] Logs of runtimecfg node-ip detection too verbose (Closed)
- OCPBUGS-32027 Logs of keepalived too verbose (Closed)
- OCPBUGS-32028 Logs of haproxy too verbose (Closed)

links to
- [KCS] keepalived-monitor and haproxy in openshift-vsphere-infra namespace so much verbose in RHOCP 4
- openshift/baremetal-runtimecfg#301: OCPBUGS-29919: Decrease log level when detecting node IP
- RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update

Assignee: Mat Kowalski
Reporter: Oscar Casal Sanchez
QA Contact: Zhanqi Zhao
Votes: 7
Watchers: 17
Created: 2024/02/26 11:05 AM
Updated: 2024/07/03 9:27 PM
Resolved: 2024/06/27 11:43 AM
