RHEL-5734

python3-file-magic segfaults when concurrently calling magic.detect_from_filename method

    • Bug
    • Resolution: Done-Errata
    • Major
    • rhel-9.4
    • rhel-9.2.0
    • file
    • file-5.39-15.el9
    • Major
    • sst_cs_plumbers
    • ssg_core_services
    • 26
    • 2
    • QE ack, Dev ack
    • False

      Description of problem:
      Calling magic.detect_from_filename(..) concurrently from different threads, even on independent files, often ends in a segfault.

      The segfault is not observed on RHEL8 (python3-magic-5.33-24.el8.noarch). It happens sometimes on RHEL9.1 (with both python3-file-magic-5.39-12.el9.noarch.rpm and python3-file-magic-5.39-14.el9.noarch.rpm) and VERY often on Gemini kernels of RHEL9.3 beta (it is not clear whether Gemini or 9.3 affects the frequency).

      Version-Release number of selected component (if applicable):
      python3-file-magic-5.39-12.el9.noarch

      How reproducible:
      20% to 100%

      Steps to Reproduce:
      1. Have this script:

      from concurrent.futures import ThreadPoolExecutor
      import os
      import magic

      jobs = 4
      report_paths = ['/var/tmp/dir1', '/var/tmp/dir2']

      def obfuscate_report(archive):
          print(archive)
          for dirname, dirs, files in os.walk(archive):
              for filename in files:
                  print(f" filename={filename}")
                  _fname = os.path.join(dirname, filename.lstrip('/'))
                  print(f"{_fname}: {magic.detect_from_filename(_fname)}")
          return

      print("===")
      pool = ThreadPoolExecutor(jobs)
      pool.map(obfuscate_report, report_paths, chunksize=1)
      pool.shutdown(wait=True)
      print("===")

      2. Generate 100 text files in each of the two directories (the content of each directory will be examined by magic.detect_from_filename in a separate thread):

      rm -rf /var/tmp/dir1 /var/tmp/dir2
      mkdir /var/tmp/dir1
      for i in $(seq 1 100); do date > /var/tmp/dir1/date.${i}.txt; done
      cp -r /var/tmp/dir1 /var/tmp/dir2

      3. Run the script:

      # python3 segfault_reproducer.py
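
For anyone who prefers to prepare the test data from Python rather than from the shell, step 2 can be approximated as below. This is only a sketch: the helper name make_fixture_dirs and the use of a temporary base directory are assumptions made for illustration, not part of the original reproducer, which works directly in /var/tmp.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def make_fixture_dirs(base, count=100):
    """Create base/dir1 with `count` small text files and copy it to base/dir2.

    Hypothetical helper mirroring the shell commands in step 2.
    """
    dir1, dir2 = Path(base) / "dir1", Path(base) / "dir2"
    for d in (dir1, dir2):
        shutil.rmtree(d, ignore_errors=True)  # same effect as `rm -rf`
    dir1.mkdir(parents=True)
    for i in range(1, count + 1):
        # Mirrors `date > date.${i}.txt`: one line holding the current time.
        (dir1 / f"date.{i}.txt").write_text(datetime.now().ctime() + "\n")
    shutil.copytree(dir1, dir2)  # same effect as `cp -r dir1 dir2`
    return [dir1, dir2]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in make_fixture_dirs(tmp):
            print(path, len(list(path.iterdir())))
```

To match the reproducer exactly, call make_fixture_dirs("/var/tmp") so that the paths agree with report_paths in the script.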

      Actual results:

      # python3 segfault_reproducer.py
      ===
      /var/tmp/dir1
      filename=date.1.txt
      /var/tmp/dir2
      filename=date.51.txt
      ===
      tcache_thread_shutdown(): unaligned tcache chunk detected
      Aborted (core dumped)
      #

      (depending on the RHEL/kernel version, the segfault does not always happen, which suggests a race condition)
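
Because the crash is intermittent, a single run says little. One way to put a number on "How reproducible" is to run the reproducer repeatedly and count the runs that were killed by a signal. The following is a minimal sketch under the assumption of a POSIX system, where subprocess reports death by signal as a negative return code; the helper name crash_rate is made up for this example.

```python
import subprocess
import sys


def crash_rate(cmd, runs=20):
    """Run `cmd` `runs` times and return the fraction of runs killed by a signal."""
    crashes = 0
    for _ in range(runs):
        proc = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # On POSIX, a negative return code means the child died from a signal
        # (for example -6 for SIGABRT or -11 for SIGSEGV).
        if proc.returncode < 0:
            crashes += 1
    return crashes / runs


if __name__ == "__main__":
    # segfault_reproducer.py is the script from step 1 of this report.
    rate = crash_rate([sys.executable, "segfault_reproducer.py"])
    print(f"crashed in {rate:.0%} of runs")
```

A run that exits normally, or with an ordinary non-zero exit status, is not counted as a crash here.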

      Expected results:
      No segfault

      Additional info:
      Our real use case: sos / sosreport has a feature to obfuscate customer-sensitive data in provided sosreport(s). That requires detecting file types (binary files are treated differently from text files). When running the cleaner concurrently on multiple sosreports, we use ThreadPoolExecutor for the concurrency, and we hit segfaults with backtraces pointing to the magic library.

      Providing sosreports is very important for Red Hat support, so segfaults like these slow down the investigation of support cases. Thus we treat this BZ with high priority.
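
Until a fixed package is installed, a caller such as the cleaner could try serializing all calls into the magic bindings behind one lock. The sketch below rests on the unverified assumption that the crash comes from threads using shared libmagic state at the same time. It is not the fix shipped in the errata, it has not been tested against the affected builds, and the decorator name serialized is invented for this example.

```python
import sys
import threading
from concurrent.futures import ThreadPoolExecutor
from functools import wraps


def serialized(func):
    """Wrap `func` so that only one thread at a time can be inside it."""
    lock = threading.Lock()

    @wraps(func)
    def wrapper(*args, **kwargs):
        with lock:
            return func(*args, **kwargs)

    return wrapper


if __name__ == "__main__":
    try:
        import magic  # python3-file-magic, as used in the reproducer

        detect = serialized(magic.detect_from_filename)
    except (ImportError, AttributeError):
        # Stand-in for this demo only, used when the bindings are unavailable.
        detect = serialized(lambda path: "magic bindings not available")

    # The worker threads still run concurrently, but the calls into the
    # wrapped function happen one at a time.
    with ThreadPoolExecutor(4) as pool:
        for result in pool.map(detect, [sys.executable] * 8):
            print(result)
```

The trade-off is that file type detection no longer runs in parallel, although the rest of the per-file work still does.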

            kvolny Karel Volný
            rhn-support-pmoravec Pavel Moravec
            Vincent Mihalkovic
            Karel Volný