OpenShift Bugs: OCPBUGS-29919

Logs of runtimecfg node-ip detection too verbose

    • Moderate
    • No
    • False

      ==== This Jira covers only the baremetal-runtimecfg component with respect to node IP detection ====

      Description of problem:

      Pods running in the `openshift-vsphere-infra` namespace are too verbose: they print at INFO level messages that should be at DEBUG level.
      
      This excess verbosity affects CRI-O, the node, and the logging system.
      
      For instance, in a cluster with 71 nodes, this namespace produced 450,000,000 log entries in one month. That amounts to 1 TB of logs written to disk on the nodes by CRI-O, read by the Red Hat log collector, and stored in the Log Store.
      
      In addition to the performance impact, this has a financial impact because of the storage needed.
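
As a rough sanity check on these figures, the sketch below breaks the monthly totals down into per-node and per-entry numbers. It assumes a 30-day month and a decimal terabyte (10^12 bytes), neither of which is stated in this report, and the variable names are illustrative only.

```shell
# Figures reported in this bug.
entries_per_month=450000000      # log entries from the namespace in one month
nodes=71                         # nodes in the cluster

# Assumptions (not stated in the report): decimal TB and a 30-day month.
bytes_per_month=1000000000000    # 1 TB = 10^12 bytes
days=30

# Integer arithmetic, so results are rounded down.
per_node_per_day=$(( entries_per_month / nodes / days ))
per_cluster_per_second=$(( entries_per_month / (days * 86400) ))
avg_bytes_per_entry=$(( bytes_per_month / entries_per_month ))

echo "entries per node per day:       $per_node_per_day"
echo "entries per second (cluster):   $per_cluster_per_second"
echo "average bytes stored per entry: $avg_bytes_per_entry"
```

The per-entry size is only an average implied by the two reported totals; it says nothing about where in the pipeline the bytes are added.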
      
      Examples of log messages that fit DEBUG better than INFO:
      ```
      /// For keepalived pods, 4 messages are printed per node every 10 seconds. In this example there are 71 nodes, so this means 284 log entries per 10-second cycle, or 1704 log entries per minute per keepalived pod
      $ oc logs keepalived-master-example-0 -c keepalived-monitor | grep master-example-0 | grep 2024-02-15T08:20:21 | wc -l
      
      $ oc logs keepalived-master-example-0 -c  keepalived-monitor |grep worker-example-0|grep 2024-02-15T08:20:21 
      2024-02-15T08:20:21.671390814Z time="2024-02-15T08:20:21Z" level=info msg="Searching for Node IP of worker-example-0. Using 'x.x.x.x/24' as machine network. Filtering out VIPs '[x.x.x.x x.x.x.x]'."
      2024-02-15T08:20:21.671390814Z time="2024-02-15T08:20:21Z" level=info msg="For node worker-example-0 selected peer address x.x.x.x using NodeInternalIP"
      2024-02-15T08:20:21.733399279Z time="2024-02-15T08:20:21Z" level=info msg="Searching for Node IP of worker-example-0. Using 'x.x.x.x' as machine network. Filtering out VIPs '[x.x.x.x x.x.x.x]'."
      2024-02-15T08:20:21.733421398Z time="2024-02-15T08:20:21Z" level=info msg="For node worker-example-0 selected peer address x.x.x.x using NodeInternalIP"
      
      /// For haproxy, 2 messages are printed every 6 seconds for each master. This means 6 messages in the same second and 60 messages per minute per pod
      $ oc logs haproxy-master-0-example -c haproxy-monitor
      ...
      2024-02-15T08:20:00.517159455Z time="2024-02-15T08:20:00Z" level=info msg="Searching for Node IP of master-example-0. Using 'x.x.x.x/24' as machine network. Filtering out VIPs '[x.x.x.x]'."
      2024-02-15T08:20:00.517159455Z time="2024-02-15T08:20:00Z" level=info msg="For node master-example-0 selected peer address x.x.x.x using NodeInternalIP"
      ```
      
      

      Version-Release number of selected component (if applicable):

      OpenShift 4.14
      vSphere IPI installation

      How reproducible:

      Always

      Steps to Reproduce:

          1. Install an OpenShift 4.14 vSphere IPI environment
          2. Review the logs of the haproxy and keepalived pods running in the namespace `openshift-vsphere-infra`
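
One way to carry out step 2 is to count how many node-IP detection messages a monitor container logged at INFO level. The helper below is a minimal sketch: `count_node_ip_info` is a hypothetical name, and the patterns are copied from the log samples in the description rather than from the component's source code.

```shell
# count_node_ip_info: read container logs on stdin and print the number of
# level=info lines that are node-IP detection messages.
# The two message prefixes come from the samples quoted in this bug.
count_node_ip_info() {
    grep 'level=info' |
        grep -c -E 'msg="(Searching for Node IP of|For node .* selected peer address)'
}

# Example (pod and container names as used in the description; adjust to the cluster):
#   oc logs keepalived-master-example-0 -n openshift-vsphere-infra \
#       -c keepalived-monitor --since=1m | count_node_ip_info
```

Running it over a one-minute window makes it easy to compare the result with the per-minute rates quoted in the description.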
          

      Actual results:

      The haproxy-* and keepalived-* pods are too verbose, printing at INFO level messages that should be at DEBUG level.
      
      Some of these messages are shown in the "Description of problem" section of this bug.
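
The per-pod rates quoted in the description can be worked through as follows. This is a sketch that reads the reported intervals literally (4 messages per node every 10 seconds for keepalived, 60 messages per minute for haproxy); the per-day totals are an extrapolation that assumes the rate stays constant, which this report does not claim.

```shell
# Figures from the description of this bug.
nodes=71
keepalived_msgs_per_node=4       # messages per node in each cycle
keepalived_cycle_seconds=10      # one cycle every 10 seconds
haproxy_per_minute=60            # messages per minute per haproxy pod

# keepalived: messages per cycle, per minute, and per day for ONE pod.
keepalived_per_cycle=$(( nodes * keepalived_msgs_per_node ))
keepalived_per_minute=$(( keepalived_per_cycle * 60 / keepalived_cycle_seconds ))
keepalived_per_day=$(( keepalived_per_minute * 60 * 24 ))

# haproxy: messages per day for ONE pod.
haproxy_per_day=$(( haproxy_per_minute * 60 * 24 ))

echo "keepalived: $keepalived_per_cycle per cycle, $keepalived_per_minute per minute, $keepalived_per_day per day"
echo "haproxy:    $haproxy_per_minute per minute, $haproxy_per_day per day"
```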

      Expected results:

      Only relevant messages are printed at INFO level, reducing the verbosity of the pods running in the `openshift-vsphere-infra` namespace.
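
To preview roughly what the INFO stream would look like once the node-IP detection messages are demoted, they can be filtered out on the client side. This is only an approximation of the expected result, not a fix: the messages are still written on the node, and `drop_node_ip_info` is a hypothetical helper whose patterns are taken from the samples in the description.

```shell
# drop_node_ip_info: read container logs on stdin and print every line
# except the two INFO-level node-IP detection messages quoted in this bug.
drop_node_ip_info() {
    grep -v -E 'level=info msg="(Searching for Node IP of|For node .* selected peer address)'
}

# Example (pod and container names as used in the description; adjust to the cluster):
#   oc logs haproxy-master-0-example -n openshift-vsphere-infra \
#       -c haproxy-monitor | drop_node_ip_info
```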

      Additional info:

          

            mkowalsk@redhat.com Mat Kowalski
            rhn-support-ocasalsa Oscar Casal Sanchez
            Zhanqi Zhao Zhanqi Zhao