  1. OpenShift Bugs
  2. OCPBUGS-78090

Techpreview e2e tests failing with "GLIBC_2.38 not found" - oc binary incompatibility with RHCOS


      Problem

      Multiple techpreview e2e tests are failing due to GLIBC version incompatibility between the oc binary and the RHCOS nodes running the tests.

      Error

      /tmp/oc: /lib64/libc.so.6: version `GLIBC_2.38' not found
      

      Impact

      This affects 3 out of 4 recent failures (75%) of the CSI snapshot test:

      Test Name: `[sig-storage] CSI Volumes [Driver: csi-hostpath] [Testpattern: Dynamic PV (default fs)] provisioning should provision storage with snapshot data source [Feature:VolumeSnapshotDataSource]`

      Test ID: `openshift-tests:8b8197a4f0cad603eb36ddcc190c0040`

      Affected Variants:

      • GCP HA minor upgrade (2 failures)
      • GCP HA standard (1 failure)
      • Azure HA micro upgrade (1 failure)

      Test Health:

      • Overall pass rate: 99.50% (3176 successes, 16 failures)
      • Without this GLIBC issue, pass rate would be ~99.90%
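The pass-rate figures above can be rechecked with a short calculation. This is only a sketch: the ticket does not say how many of the 16 failures are attributed to the GLIBC error, so both scenarios below are assumptions (the 3 confirmed recent failures, or the 75% share extrapolated to all 16), and `pass_rate` is a hypothetical helper name.

```python
def pass_rate(successes: int, failures: int) -> float:
    """Return the pass rate as a percentage of all runs."""
    return 100.0 * successes / (successes + failures)

SUCCESSES = 3176
FAILURES = 16

# Reported overall figure.
overall = pass_rate(SUCCESSES, FAILURES)

# Assumption 1: only the 3 confirmed GLIBC failures are dropped.
confirmed_only = pass_rate(SUCCESSES, FAILURES - 3)

# Assumption 2: 75% of all 16 failures (12) are GLIBC-related and dropped.
extrapolated = pass_rate(SUCCESSES, FAILURES - 12)

print(f"overall:        {overall:.2f}%")
print(f"confirmed only: {confirmed_only:.2f}%")
print(f"extrapolated:   {extrapolated:.2f}%")
```

Under these assumptions the ~99.90% estimate is only approached in the extrapolated case; dropping just the 3 confirmed failures moves the rate far less.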

      Root Cause

      The oc binary located at `/tmp/oc` was compiled/linked against GLIBC 2.38, but RHCOS 9.x nodes have GLIBC 2.34. When the test framework attempts to execute oc commands, the dynamic linker fails to find the required GLIBC symbols, causing test execution to fail.
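One way to confirm which GLIBC version a binary needs is to look for the `GLIBC_x.y` version tags embedded in it and take the highest one. The sketch below is a heuristic, roughly equivalent to searching the output of `strings` for `GLIBC_`; it scans raw bytes rather than parsing the ELF version tables, and `max_glibc_requirement` is a hypothetical helper name, not part of any test framework.

```python
import re
from typing import Optional, Tuple

# Matches version tags such as GLIBC_2.34 or GLIBC_2.2.5 in raw bytes.
_GLIBC_TAG = re.compile(rb"GLIBC_(\d+)\.(\d+)(?:\.(\d+))?")

def max_glibc_requirement(data: bytes) -> Optional[Tuple[int, ...]]:
    """Return the highest GLIBC version tag found in data, or None."""
    versions = [
        tuple(int(part) for part in match.groups() if part is not None)
        for match in _GLIBC_TAG.finditer(data)
    ]
    return max(versions, default=None)

def max_glibc_requirement_of_file(path: str) -> Optional[Tuple[int, ...]]:
    """Read a binary from disk and scan it for GLIBC version tags."""
    with open(path, "rb") as fh:
        return max_glibc_requirement(fh.read())
```

For the binary in this report, `max_glibc_requirement_of_file("/tmp/oc")` would be expected to return `(2, 38)` if the diagnosis above is right. A result of `None` suggests the binary carries no GLIBC version tags, for example because it is statically linked.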

      GLIBC Version Reference:

      • RHEL/RHCOS 9.x: GLIBC 2.34
      • RHEL/RHCOS 10.x: GLIBC 2.39
      • GLIBC 2.38: Fedora 39, Ubuntu 23.10

      Expected Behavior

      The oc binary used in e2e tests must be compatible with the RHCOS version being tested:

      • For RHCOS 9.x: oc must be built against GLIBC 2.34 or earlier
      • For RHCOS 10.x: oc can require symbols up to GLIBC 2.39
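The rule above reduces to a version comparison: a binary runs on a node only if the highest GLIBC version it requires is no newer than the one the node provides. A minimal sketch, using the versions from the GLIBC Version Reference list in this ticket; the table and function names are hypothetical.

```python
from typing import List, Tuple

# GLIBC version provided by each RHCOS major version, as listed in this ticket.
RHCOS_GLIBC = {"9": (2, 34), "10": (2, 39)}

def compatible_targets(required: Tuple[int, int]) -> List[str]:
    """List the RHCOS major versions whose GLIBC satisfies the requirement."""
    return [
        major
        for major, provided in RHCOS_GLIBC.items()
        if required <= provided
    ]

print(compatible_targets((2, 38)))  # the failing binary in this report
print(compatible_targets((2, 34)))
```

With these values, a binary requiring GLIBC 2.38 is compatible only with RHCOS 10.x, which matches the failure seen on RHCOS 9.x nodes.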

      Investigation Questions

      1. Where does `/tmp/oc` come from in these test runs?
        • Is it downloaded from a release payload?
        • Built as part of test setup?
        • Copied from a test container image?
      2. Why does this only affect certain variants (GCP, Azure) and not others (AWS, vSphere)?
        • Different test images?
        • Different download sources?
        • Different job configurations?
      3. Was there a recent change to:
        • The build system for test binaries (moved to Fedora 39+ or RHEL 10)?
        • The source of the oc binary in tests?
        • The test framework setup?

      Important Note

      This is NOT a Storage or CSI product bug. The CSI snapshot functionality itself works correctly. The test fails only because of the binary compatibility issue in the test infrastructure.

      Recommended Fix

      Ensure the oc binary used in tests is built on/for the target RHCOS version:

      • Use the oc binary from the cluster being tested (already compatible)
      • Download oc from the correct release payload for the target version
      • Build test binaries on RHEL 9 for RHCOS 9 tests
      • Add version compatibility checks in test setup
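The last item, a compatibility check in test setup, could look roughly like the sketch below. It assumes the setup code can obtain the output of `getconf GNU_LIBC_VERSION` from the node (which prints a string such as `glibc 2.34` on glibc systems) and already knows the highest GLIBC version the oc binary requires. How those two values are collected is left open, and both function names are hypothetical.

```python
import re
from typing import Tuple

def parse_getconf_glibc(output: str) -> Tuple[int, int]:
    """Parse text such as 'glibc 2.34' into (2, 34)."""
    match = re.match(r"glibc (\d+)\.(\d+)", output.strip())
    if match is None:
        raise ValueError(f"unrecognized libc version string: {output!r}")
    return (int(match.group(1)), int(match.group(2)))

def check_oc_compat(required: Tuple[int, int], node_getconf_output: str) -> None:
    """Raise RuntimeError if the node's GLIBC is older than oc requires."""
    available = parse_getconf_glibc(node_getconf_output)
    if required > available:
        raise RuntimeError(
            "oc requires GLIBC %d.%d but the node provides %d.%d"
            % (required + available)
        )

# Passes silently: the node provides what the binary needs.
check_oc_compat((2, 34), "glibc 2.34\n")
```

Failing early in setup with an explicit message would make this class of failure show up as an infrastructure error rather than as a failure of the storage test itself.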

              nmoraiti Nikolaos Moraitis
              rhn-support-jgeorge John George
              Nikolaos Moraitis Nikolaos Moraitis