AI Platform Core Components / AIPCC-4312

Investigate 'builder' bootstrap failures for recently updated packages


    • Type: Bug
    • Resolution: Unresolved
    • Wheel Package Index
    • Low

      We are observing intermittent failures in the bootstrap stage of the wheels builder pipeline. The failures appear to occur when a dependency has a new version released on PyPI while the build is running, although this has not been confirmed.

      The error message follows a consistent pattern:

      ERROR: could not handle toplevel dependency <package> (<version>) because Trying to add setuptools==80.8.0 to parent <package>==<version> but <package>==<version> does not exist 

      This suggests that the build process identifies a package version but then fails to find it later in the dependency resolution graph.
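To make the suspected failure mode concrete, here is a minimal sketch of how a graph keyed by "name==version" could produce this message. It is an illustration under stated assumptions, not the builder's actual code: the DependencyGraph class, its methods, and the package name example-pkg are all hypothetical.

```python
class DependencyGraph:
    """Toy graph keyed by 'name==version'. Hypothetical, not the builder's real structure."""

    def __init__(self):
        self.nodes = {}  # node key -> list of child keys

    def add_toplevel(self, name, version):
        self.nodes.setdefault(f"{name}=={version}", [])

    def add_dependency(self, parent_name, parent_version, child_name, child_version):
        parent_key = f"{parent_name}=={parent_version}"
        child_key = f"{child_name}=={child_version}"
        if parent_key not in self.nodes:
            # Same wording as the error reported in this ticket.
            raise ValueError(
                f"Trying to add {child_key} to parent {parent_key} "
                f"but {parent_key} does not exist"
            )
        self.nodes.setdefault(child_key, [])
        self.nodes[parent_key].append(child_key)


graph = DependencyGraph()
# Version seen when the top-level requirement was first resolved.
graph.add_toplevel("example-pkg", "1.0.0")

try:
    # Assumption: a second resolution returns 1.0.1 (released mid-build),
    # so the parent key no longer matches the node registered above.
    graph.add_dependency("example-pkg", "1.0.1", "setuptools", "80.8.0")
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

If the real graph behaves like this, any step that resolves the same requirement twice and gets two different answers would be enough to trigger the error, with no threading involved.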

      Observed Instances

      This issue has been observed with at least two different packages: boto3 and docling.

      Tasks

      1. Analyze Logs: Perform a detailed analysis of the debug logs for the failed jobs to trace the sequence of events for the boto3 and docling packages.
      2. Root Cause Analysis: Determine why a package version that is discovered early in the bootstrap later fails validation with a "does not exist" error.
      3. Develop a Fix: Propose and implement a solution to make the build process more resilient to newly released or updated packages on PyPI.
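As a starting point for the log analysis task, the error pattern quoted above can be extracted with a regular expression. This is a rough sketch: the function name find_bootstrap_failures and the sample log line are hypothetical, and the pattern assumes the message appears on a single line exactly as quoted in this ticket.

```python
import re

# Pattern derived from the error message quoted in this ticket.
ERROR_RE = re.compile(
    r"ERROR: could not handle toplevel dependency (?P<toplevel>.+?) "
    r"\((?P<toplevel_info>[^)]*)\) because Trying to add "
    r"(?P<child>\S+)==(?P<child_version>\S+) to parent "
    r"(?P<parent>\S+)==(?P<parent_version>\S+) but "
)


def find_bootstrap_failures(lines):
    """Return one dict of captured fields per matching log line."""
    failures = []
    for line in lines:
        match = ERROR_RE.search(line)  # search() tolerates timestamp prefixes
        if match:
            failures.append(match.groupdict())
    return failures


# Hypothetical log excerpt; real job logs may carry extra prefixes.
sample_log = [
    "INFO: resolving example-pkg",
    "ERROR: could not handle toplevel dependency example-pkg (1.0.1) because "
    "Trying to add setuptools==80.8.0 to parent example-pkg==1.0.1 "
    "but example-pkg==1.0.1 does not exist",
]

failures = find_bootstrap_failures(sample_log)
for failure in failures:
    print(failure["parent"], failure["parent_version"], failure["child"])
```

Comparing the extracted parent version against the versions logged earlier in the same job should show whether the resolved version changed between steps.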

      Notes

      • The leading hypothesis is a race condition. The build may fetch package metadata at several points in time, and if a package is updated on PyPI between those fetches, the build could end up in an inconsistent state.
      • However, the bootstrap job runs in a single thread, which should rule out simple in-process race conditions. The issue may instead lie in how package versions are selected and then re-verified, or in an interaction with a caching layer or mirror that experiences replication delays.
      • When working on a solution, we should also consider edge cases where a release is "yanked" from PyPI and, if possible, ensure the build handles such situations as well.
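One possible direction, sketched here under assumptions rather than proposed as the fix: resolve each package once, skip yanked releases, and reuse the pinned answer for every later lookup in the same build. The Candidate record, the PinnedResolver class, and the fetch_candidates callable are hypothetical, and the version comparison is deliberately simplified to dotted integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one release as reported by the index."""
    version: str
    yanked: bool = False


def version_key(version):
    # Simplification: handles only dotted integers such as "1.2.3".
    return tuple(int(part) for part in version.split("."))


class PinnedResolver:
    """Resolve each package once and return the same version afterwards."""

    def __init__(self, fetch_candidates):
        self._fetch = fetch_candidates  # hypothetical callable: name -> list of Candidate
        self._pins = {}

    def resolve(self, name):
        if name not in self._pins:
            usable = [c for c in self._fetch(name) if not c.yanked]
            if not usable:
                raise LookupError(f"no non-yanked release available for {name}")
            newest = max(usable, key=lambda c: version_key(c.version))
            self._pins[name] = newest.version
        return self._pins[name]


# Simulated index that changes while the build is running.
index = {
    "example-pkg": [Candidate("1.0.0")],
    "other-pkg": [Candidate("1.9.0"), Candidate("2.0.0", yanked=True)],
}
resolver = PinnedResolver(lambda name: list(index[name]))

first = resolver.resolve("example-pkg")
index["example-pkg"].append(Candidate("1.0.1"))  # new release appears mid-build
second = resolver.resolve("example-pkg")          # still the pinned version
other = resolver.resolve("other-pkg")             # yanked 2.0.0 is skipped

print(first, second, other)
```

This sketch does not cover a release that is yanked after it has been pinned; whether the build should fail or continue in that case is a separate decision.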

              Assignee: Unassigned
              Reporter: Michael Yochpaz (rh-ee-myochpaz)