RHEL-20020

Performance regression in bash with large numbers of environment variables

    • None
    • Moderate
    • rhel-sst-cs-plumbers
    • ssg_core_services
    • 5
    • False
    • No
    • None
    • Unspecified Release Note Type - Unknown
    • All
    • None

      What were you trying to do that didn't work?

      One of our customers, in a support case, is facing a performance issue in their application after testing their product on RHEL 9, compared to CentOS 7.
      On RHEL 9 the bash process consumes a high amount of CPU and takes more time to complete the same task than it did on CentOS 7.

      They were further able to isolate the cause to changes made in bash-5.0; according to the customer's tests, the issue is resolved in bash-5.2.
      They found that this happens because they are using a large number of environment variables.

      They say that the particular change affecting them is a change to increment SHLVL when executing a command in a pipe, which causes bash to rebuild its array of exported environment variables every time that happens. In their case there are ~1.4k such variables, so rebuilding the array is a nontrivial amount of work (partly because bash does a bunch of seemingly unnecessary deduplication checking).
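The workload described above can be approximated with a minimal sketch. This is not the customer's reproducer: the function names and the `REPRO_VAR_<n>` variable names are made up for illustration, and it is only an assumption that this particular pipeline exercises the same code path.

```shell
# Hypothetical stand-in for the customer's workload, not the attached scripts.

# Export COUNT dummy variables named REPRO_VAR_<n> (names are made up).
export_dummy_vars() {
    local count i
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        export "REPRO_VAR_$i=value_$i"
        i=$((i + 1))
    done
}

# Launch ITERATIONS short pipelines, each ending in an external command.
run_pipelines() {
    local iterations i
    iterations=$1
    i=0
    while [ "$i" -lt "$iterations" ]; do
        printf '%s\n' "$i" | cat >/dev/null
        i=$((i + 1))
    done
}

# Example (run manually under the bash binary being tested):
#   export_dummy_vars 1400
#   time run_pipelines 1000
```

Comparing the timing with and without the `export_dummy_vars` call under the same bash binary should indicate whether the number of exported variables matters for that build.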

      See the following for discussion of this issue in the bash mailing lists.

      The relevant code is visible at https://github.com/bminor/bash/blob/bash-5.1/execute_cmd.c#L5487.
      They have checked that flipping that "#if 0" to "#if 1" makes the performance comparable to bash 4.4 or 5.2.

      Test performed by the customer on their systems
      ~~~
      for exe in bash-*; do duration=$(/usr/bin/time -f "%Us" ./$exe perf-test.sh 1000 2>&1); echo "$exe: $duration"; done
      bash-4.2: 1.67s
      bash-4.4: 1.58s
      bash-5.0: 15.06s
      bash-5.1: 14.61s
      bash-5.2: 1.66s
      ~~~
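To put the customer's timings in proportion, the slowdown can be computed as a simple ratio of user CPU seconds (the `%U` field of `/usr/bin/time`). The helper name `slowdown` is made up for this sketch; it assumes `awk` is available and that the baseline is non-zero.

```shell
# Print how many times slower MEASURED is than BASELINE, to one decimal place.
# Usage: slowdown <baseline_seconds> <measured_seconds>
slowdown() {
    awk -v base="$1" -v measured="$2" 'BEGIN { printf "%.1f\n", measured / base }'
}

# Using the figures from the customer's run, with bash-4.4 as the baseline:
#   slowdown 1.58 15.06   # bash-5.0 -> 9.5
#   slowdown 1.58 14.61   # bash-5.1 -> 9.2
```

By this measure bash-5.0 and bash-5.1 use roughly nine times the user CPU time of bash-4.4 on the same script.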

      The customer has provided a reproducer script that can be used to measure how long bash takes to run.
      See the attached bash-perf-repro.tgz file for the repro scripts. It contains 3 files:

      • perf-test.sh to run some commands in pipeline in a loop
      • env.sh to set the environment variables
      • run-test.sh to launch perf-test.sh in a clean environment and time it.

      To use it, run ./run-test.sh <bash_exe> <iterations>, where <bash_exe> is the bash executable to test and <iterations> is the number of times perf-test.sh should launch a pipeline. For example, to run a test with 1000 iterations using the bash executable on the path, run ./run-test.sh bash 1000.
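The attached run-test.sh is not reproduced in this report. As a rough illustration of what launching a command "in a clean environment" can look like, here is a minimal sketch built on `env -i`; the `run_clean` helper and the example invocation are assumptions, not the contents of the attachment.

```shell
# Hypothetical helper: run a command with all inherited environment
# variables removed except PATH, so only variables set by the test remain.
run_clean() {
    env -i PATH="$PATH" "$@"
}

# Example (assumed invocation; the real run-test.sh may differ):
#   run_clean /bin/bash -c '. ./env.sh; time ./perf-test.sh 1000'
```

Starting from an empty environment keeps the variable count under the test's control, so results from different hosts depend less on whatever each login session happens to export.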

      I tested it on separate installations of RHEL 7 (bash-4.2), RHEL 8 (bash-4.4) and RHEL 9 (bash-5.1), all with the same resource configuration.
      The older versions in RHEL 7 and RHEL 8 are much faster, while RHEL 9 takes significantly more time.
      I also tested the latest "bash-5.2" downloaded from upstream "https://ftp.gnu.org/gnu/bash/bash-5.2.tar.gz" and can see that version 5.2 improves performance.

      RHEL 7
      ~~~
      ./run-test.sh /bin/bash 1000
      Red Hat Enterprise Linux Server release 7.9 (Maipo)
      4.2.46(2)-release

      real 0m2.177s
      user 0m0.718s
      sys 0m1.241s
      ~~~

      RHEL 8
      ~~~
      ./run-test.sh /bin/bash 1000
      Red Hat Enterprise Linux release 8.9 (Ootpa)
      4.4.20(1)-release

      real 0m3.565s
      user 0m1.561s
      sys 0m1.996s
      ~~~

      RHEL 9
      ~~~
      ./run-test.sh /bin/bash 1000
      Red Hat Enterprise Linux release 9.3 (Plow)
      5.1.8(1)-release

      real 0m7.609s
      user 0m6.494s
      sys 0m1.318s
      ~~~

      RHEL 9 (with bash version 5.2 from upstream)
      ~~~
      ./run-test.sh /usr/local/build/bash-5.2/bash 1000
      Red Hat Enterprise Linux release 9.3 (Plow)
      5.2.0(1)-release

      real 0m2.137s
      user 0m1.183s
      sys 0m1.196s
      ~~~

      Additional information provided by rhn-support-pandrade about this issue.

      Problem in bash 5.1: https://github.com/bminor/bash/blame/bash-5.1/execute_cmd.c#L5487
      No problem in bash 5.2: https://github.com/bminor/bash/blame/bash-5.2/execute_cmd.c#L5636
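Based only on the timings in this report (slow on 5.0 and 5.1, fast on 4.2, 4.4 and 5.2), a host's bash can be flagged with a small check against the `$BASH_VERSION` string. The function name is made up, and the version range is an inference from the tests above rather than an upstream statement.

```shell
# Return 0 if the given $BASH_VERSION string falls in the range this report
# found to be slow (5.0.x and 5.1.x), and 1 otherwise.
affected_bash_version() {
    case $1 in
        5.0.*|5.1.*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example:
#   if affected_bash_version "$BASH_VERSION"; then
#       echo "bash $BASH_VERSION is in the range reported as slow"
#   fi
```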

      Please provide the package NVR for which bug is seen:

      ~~~
      rpm -q bash
      bash-5.1.8-6.el9_1.x86_64
      echo $BASH_VERSION
      5.1.8(1)-release
      ~~~

      How reproducible:

      Every time on bash 5.1 on RHEL 9

      Steps to reproduce

      1.  Download and extract the attached "bash-perf-repro.tgz".
      2.  Run ./run-test.sh <bash_exe> <iterations>, where <bash_exe> is the bash executable to test and <iterations> is the number of times perf-test.sh should launch a pipeline. For example, to run a test with 1000 iterations using the bash executable on the path, run ./run-test.sh bash 1000.

      Expected results

      The script should complete in a time comparable to the older bash 4.4 from RHEL 8 and bash 4.2 from RHEL 7.

      Actual results

      Bash 5.1 on RHEL 9 takes a long time when executing the script with a large number of environment variables.

              rhn-support-svashish Siteshwar Vashisht
              rhn-support-amepatil Ameya Patil
              Siteshwar Vashisht Siteshwar Vashisht
              Karel Volný Karel Volný