Uploaded image for project: 'RHEL'
  1. RHEL
  2. RHEL-85054

sd_journal_next() returns Bad message when the journal rotates often

Linking RHIVOS CVEs to...Migration: Automation ...SWIFT: POC ConversionSync from "Extern...XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Unresolved
    • Icon: Normal Normal
    • None
    • rhel-8.6.0, rhel-8.8.0, rhel-8.10
    • systemd
    • None
    • No
    • Moderate
    • rhel-systemd
    • ssg_core_services
    • 5
    • False
    • False
    • Hide

      None

      Show
      None
    • None
    • Red Hat Enterprise Linux
    • None
    • None
    • None
    • Unspecified
    • Unspecified
    • Unspecified
    • None

      Some customers using the persistent journal see rsyslogd returning the following message just after the journals rotated:

      Mar 26 14:12:32 ... rsyslogd[917]: imjournal: sd_journal_next() failed: Bad message [v8.2102.0-15.el8_10.1]
      

      This is due to a race when reading/rotating the journal, the race occurs randomly, usually after 100-500 rotations.

      This leads rsyslogd to re-read all the journal from the beginning in order to not lose messages, leading to having duplicates in /var/log/messages and such.

      I could reproduce the issue on 8.6, 8.8, 8.9 and 8.10, including latest systemd systemd-239-82.el8_10.4.x86_64 and latest rsyslog rsyslog-8.2102.0-15.el8_10.1.x86_64.

      Reproducer

      1. Configure a QEMU/KVM with 4GB memory and 4 CPUs
      2. Configure the persistent journal and make it rotate quickly and rsyslog to not drop messages
        # cat /etc/systemd/journald.conf 
        [Journal]
        RateLimitIntervalSec=0
        RateLimitBurst=0
        SystemMaxFiles=30
        SystemMaxUse=200M
        
        # cat /etc/rsyslog.conf
        [...]
        module(load="imjournal"
               ratelimit.interval="0"
               ratelimit.burst="0"
               StateFile="imjournal.state")
        [...]
        
      3. Create 20 users
        # for i in $(seq 1000 1019); do useradd user$i; echo redhat | passwd --stdin user$i; done
        
      4. Make sure root can ssh to the users password-less
      5. Make the users generate logs through ssh'ing locally in loop
        # for i in $(seq 1000 1019); do while :; do ssh user$i@localhost true </dev/null; done &
        
      6. Monitor the journal for rsyslogd
        # journalctl --follow -u rsyslog
        [...]
        Mar 26 14:12:32 ... rsyslogd[917]: imjournal: journal files changed, reloading...  [v8.2102.0-15.el8_10.1 try https://www.rsyslog.com/e/0 ]
        Mar 26 14:12:32 ... rsyslogd[917]: imjournal: journal files changed, reloading...  [v8.2102.0-15.el8_10.1 try https://www.rsyslog.com/e/0 ]
        Mar 26 14:12:32 ... rsyslogd[917]: imjournal: sd_journal_next() failed: Bad message [v8.2102.0-15.el8_10.1]
        Mar 26 14:12:32 ... rsyslogd[917]: imjournal: sd_journal_next() failed: Bad message [v8.2102.0-15.el8_10.1]
        Mar 26 14:12:32 ... rsyslogd[917]: imjournal: trying to recover from journal error [v8.2102.0-15.el8_10.1 try https://www.rsyslog.com/e/0 ]
        

              msekleta@redhat.com Michal Sekletar
              rhn-support-rmetrich Renaud Métrich
              systemd maint mailing list systemd maint mailing list
              Frantisek Sumsal Frantisek Sumsal
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

                Created:
                Updated: