  1. OpenShift Logging
  2. LOG-7021

Fluentd stack trace when advancing the read pointer into the journal by one entry

    • Type: Bug
    • Resolution: Won't Do
    • Priority: Undefined
    • None
    • Logging 5.8.12
    • Log Collection
    • False
    • None
    • False
    • NEW
    • NEW
    • Bug Fix
    • Moderate

      Description of problem:

      Collector pods running Fluentd began restarting, and the following Ruby stack trace was observed:

      2025-04-15T18:31:16.958263928Z /usr/share/gems/gems/systemd-journal-1.4.2/lib/systemd/journal/navigable.rb:43: [BUG] Bus Error at 0x00007fc72c400098
      2025-04-15T18:31:16.958263928Z ruby 3.1.5p252 (2024-04-23 revision 1945f8dc0e) [x86_64-linux]
      2025-04-15T18:31:16.958263928Z 
      2025-04-15T18:31:16.958263928Z -- Control frame information -----------------------------------------------
      2025-04-15T18:31:16.958263928Z c:0011 p:---- s:0050 e:000049 CFUNC  :sd_journal_next
      2025-04-15T18:31:16.958361572Z c:0010 p:0014 s:0045 e:000044 METHOD /usr/share/gems/gems/systemd-journal-1.4.2/lib/systemd/journal/navigable.rb:43
      2025-04-15T18:31:16.958415863Z c:0009 p:0024 s:0040 e:000039 METHOD /usr/share/gems/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:144
      2025-04-15T18:31:16.958451428Z c:0008 p:0032 s:0034 e:000033 METHOD /usr/share/gems/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:121 [FINISH]
      2025-04-15T18:31:16.958487130Z c:0007 p:---- s:0030 e:000029 IFUNC 
      2025-04-15T18:31:16.958555775Z c:0006 p:0012 s:0027 e:000026 METHOD /usr/share/gems/gems/fluentd-1.16.2/lib/fluent/plugin_helper/timer.rb:80 [FINISH]
      2025-04-15T18:31:16.958590436Z c:0005 p:---- s:0022 e:000021 CFUNC  :run_once
      2025-04-15T18:31:16.958676911Z c:0004 p:0041 s:0017 e:000016 METHOD /usr/share/gems/gems/cool.io-1.8.0/lib/cool.io/loop.rb:88
      2025-04-15T18:31:16.958748772Z c:0003 p:0033 s:0012 e:000011 BLOCK  /usr/share/gems/gems/fluentd-1.16.2/lib/fluent/plugin_helper/event_loop.rb:93
      2025-04-15T18:31:16.958823203Z c:0002 p:0080 s:0008 e:000007 BLOCK  /usr/share/gems/gems/fluentd-1.16.2/lib/fluent/plugin_helper/thread.rb:78 [FINISH]
      2025-04-15T18:31:16.958860393Z c:0001 p:---- s:0003 e:000002 (none) [FINISH]
      2025-04-15T18:31:16.958935672Z 

      Version-Release number of selected component (if applicable):

      $ oc get csv|grep -i logging
      cluster-logging.v5.8.12          Red Hat OpenShift Logging          5.8.12    cluster-logging.v5.8.5          Succeeded 

      The complete stack trace will be provided.

      How reproducible:

      Not reproducible

      Additional info:

      1. Two of the three nodes analyzed have a high CPU load:

        LoadAvg:   [16 CPU] 47.09 (294%), 61.41 (384%), 59.69 (373%)
      

      2. Fluentd has low requests.cpu and limits.cpu values:

      spec:
        collection:
          logs:
            fluentd:
              resources:
                limits:
                  cpu: 500m
                  memory: 4Gi
                requests:
                  cpu: 500m
                  memory: 2Gi
      

      3. AppDynamics Splunk software is installed.
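
      The figures in items 1 and 2 can be put side by side with a little arithmetic. The sketch below is illustrative only: it reproduces the percentages on the LoadAvg line (load average divided by the 16 CPUs) and converts the 500m CPU limit into cores and into a CFS quota. The helper names are hypothetical, and the 100000 µs period is an assumption based on the usual Kubernetes default, not something stated in this report. It does not establish that CPU pressure caused the Bus Error.

```python
def load_percent(load_avg: float, cpu_count: int) -> int:
    """Return a load average as a rounded percentage of total CPU capacity."""
    return round(load_avg / cpu_count * 100)


def parse_cpu_quantity(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity such as "500m" or "2" to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def cfs_quota_us(cpu_limit: str, period_us: int = 100_000) -> int:
    """CPU time in microseconds allowed per CFS period for a given limit.

    period_us defaults to 100000, an assumed default period.
    """
    return round(parse_cpu_quantity(cpu_limit) * period_us)


NODE_CPUS = 16  # from the "[16 CPU]" tag on the LoadAvg line

# Load averages copied from item 1.
for load_avg in (47.09, 61.41, 59.69):
    print(f"{load_avg:.2f} -> {load_percent(load_avg, NODE_CPUS)}%")
# prints:
# 47.09 -> 294%
# 61.41 -> 384%
# 59.69 -> 373%

# CPU limit copied from item 2.
limit = "500m"
cores = parse_cpu_quantity(limit)
print(f"{limit} = {cores} cores, quota {cfs_quota_us(limit)} us per period")
print(f"share of a {NODE_CPUS}-CPU node: {cores / NODE_CPUS * 100:.3f}%")
```

      Under these assumptions the collector may use at most half a core on a node whose run queue is already about three to four times its CPU count.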

              Assignee: Unassigned
              Reporter: Oscar Casal Sanchez (rhn-support-ocasalsa)