RHEL-52397

MAC cleaner obfuscates UUIDs, which makes it terribly slow

    • Bug
    • Resolution: Unresolved
    • rhel-8.10, rhel-9.4
    • sos
    • rhel-sst-cee-supportability

      What were you trying to do that didn't work?

      Running sos clean on a Satellite is terribly slow; for example, it can take three days to complete.
      The cause was narrowed down to https://github.com/sosreport/sos/blob/main/sos/cleaner/parsers/mac_parser.py#L23-L26, where the MAC cleaner treats UUIDs in strings like:

      /pulp/api/v3/content/rpm/packages/0190fe17-d1a6-7066-aa33-e7422232031f/
      

      as MAC addresses. When postgres logs contain 20k+ such UUIDs (I even noticed 50k on a single line), processing one file is terribly slow, because all of those tens of thousands of "MAC addresses" get obfuscated.

      The performance penalty is severe (quadratic in the number of UUIDs in one file), but the main problem is that UUIDs should not be treated as MAC addresses at all.

      The problem is reported upstream as https://github.com/sosreport/sos/issues/3736
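To illustrate the false positive, here is a minimal Python sketch. The `LOOSE_MAC_RE` pattern is a hypothetical, deliberately loose stand-in (four 4-hex-digit groups joined by `:` or `-`); it is not the regex used in sos, and the function names are made up for this example. The second function shows one possible way to avoid the problem: let a UUID alternative consume the whole UUID first, so nothing inside it can be reported as a MAC address.

```python
import re

# Hypothetical loose pattern (NOT the actual sos regex): four groups of
# four hex digits, separated by ':' or '-'.
LOOSE_MAC_RE = r"(?:[0-9a-fA-F]{4}[:-]){3}[0-9a-fA-F]{4}"

# Canonical textual UUID layout: 8-4-4-4-12 hex digits.
UUID_RE = (
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)

LOOSE_MAC = re.compile(LOOSE_MAC_RE)

# The UUID alternative is listed first, so at any position where a full
# UUID starts it is consumed as a whole and never scanned for MACs.
TOKEN = re.compile("(?P<uuid>" + UUID_RE + ")|(?P<mac>" + LOOSE_MAC_RE + ")")


def find_macs(line):
    """Naive scan: returns every substring the loose pattern matches."""
    return [m.group(0) for m in LOOSE_MAC.finditer(line)]


def find_macs_skipping_uuids(line):
    """Single-pass scan that ignores anything inside a full UUID."""
    return [m.group("mac") for m in TOKEN.finditer(line) if m.group("mac")]


uri = "/pulp/api/v3/content/rpm/packages/0190fe17-d1a6-7066-aa33-e7422232031f/"
print(find_macs(uri))                 # ['fe17-d1a6-7066-aa33']
print(find_macs_skipping_uuids(uri))  # []
```

The combined pattern keeps the scan to a single pass over the line, which matters when one line holds tens of thousands of UUIDs.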

      Please provide the package NVR for which bug is seen:

      sos-4.7.1 (but any version is affected)

      How reproducible:

      100%

      Steps to reproduce

      Artificial reproducer:

      rm -f /var/log/qpidd.log /etc/sos/cleaner/default_mapping

      for i in $(seq 1 10000); do echo "/pulp/api/v3/content/rpm/packages/$(uuidgen)/"; done | tr '\n' ',' >> /var/log/qpidd.log

      time sos report -o qpid --batch --build --clean -vvv

      Check the generated /etc/sos/cleaner/default_mapping and the obfuscated UUIDs in the URIs in the mocked /var/log/qpidd.log file (in the collected sosreport).

      Then regenerate the qpidd.log file with 20k, 30k, or more UUIDs and re-run the sos report cleaner to see how performance degrades.

      Expected results

      No UUID is treated as a MAC address, and the cleaner finishes in a reasonably short time (a few seconds at most for such a file).

      Actual results

      The cleaner's mapping contains:

          "mac_map": {
              "8b9da:003b:4d63:99a6": "53:4f:53:06:59:91",
              "f53:4f:53:06:59:91:ac": "53:4f:53:1d:44:da",
              "403c2:e013:4601:a15a": "53:4f:53:1b:51:f2",
      ..
      

      with one such entry for each and every UUID in the log file.

      With 10k UUIDs on one line, "Obfuscating var/log/qpidd.log" took 16s (see "sos_logs/sos.log").
      With 20k UUIDs on the line, it took 1m5s.
      With 30k UUIDs on the line, it took 2m32s.
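
As a quick sanity check of the quadratic claim, the three timings above can be turned into an empirical growth exponent. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and three data points cannot prove a complexity class. An exponent near 2 is what quadratic behaviour would produce.

```python
import math

# Timings reported above, converted to seconds (1m5s = 65s, 2m32s = 152s).
timings = {10_000: 16, 20_000: 65, 30_000: 152}


def growth_exponents(timings):
    """Estimate k in t ~ n**k for each size, relative to the smallest size."""
    sizes = sorted(timings)
    base_n = sizes[0]
    base_t = timings[base_n]
    return {
        n: math.log(timings[n] / base_t) / math.log(n / base_n)
        for n in sizes[1:]
    }


for n, k in growth_exponents(timings).items():
    print(f"{n} UUIDs: exponent ~ {k:.2f}")
```

Doubling the input multiplies the time by about 4 (65/16) and tripling it by about 9.5 (152/16), close to the factors 4 and 9 expected for quadratic growth.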

              rhn-support-pmoravec Pavel Moravec
              RHEL Supportability QE Bot