  1. AI Platform Core Components
  2. AIPCC-10920

Onboard autogluon-tabular into the AIPCC Builder

    • Type: Story
    • Resolution: Unresolved

      Package 'autogluon-tabular' does not build as-is via the AIPCC self-service pipeline and requires builder repository onboarding.

      Build Failure Summary

      Root Cause Analysis: `autogluon-tabular` Build Failure

      What Happened

      The build of autogluon-tabular 1.5.0 failed during the `get_requires_for_build_wheel` phase — the very early step where setuptools executes `setup.py` to determine what additional build dependencies are needed.

      Root Cause: Missing `_setup_utils.py` in the sdist

      The error is a straightforward `FileNotFoundError`:

      FileNotFoundError: [Errno 2] No such file or directory:
        '/mnt/work-dir/autogluon_tabular-1.5.0/autogluon_tabular-1.5.0/_setup_utils.py'
      

      The `setup.py` for `autogluon-tabular` imports or executes a helper module called `_setup_utils.py` (at line 18), but this file is not included in the sdist tarball (`autogluon_tabular-1.5.0.tar.gz`).
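
      The missing file can be confirmed without running a build. The following is a minimal sketch, assuming the sdist is a gzip-compressed tarball with a single top-level directory; the function name `missing_from_sdist` is hypothetical and is not part of fromager or the AIPCC pipeline.

```python
import tarfile
from pathlib import PurePosixPath


def missing_from_sdist(sdist_path, required=("_setup_utils.py",)):
    """Return the required files that are absent from the sdist's top-level directory.

    Hypothetical helper for illustration only.
    """
    with tarfile.open(sdist_path, "r:gz") as tar:
        # Drop the leading "<name>-<version>/" component of every member path.
        present = {
            "/".join(PurePosixPath(name).parts[1:]) for name in tar.getnames()
        }
    return [name for name in required if name not in present]
```

      Calling `missing_from_sdist("autogluon_tabular-1.5.0.tar.gz")` on the downloaded tarball would be expected to report `_setup_utils.py` if the analysis above is correct.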

      Why This Happens

      AutoGluon is a monorepo project — it contains multiple sub-packages (`autogluon.core`, `autogluon.tabular`, `autogluon.features`, etc.) in a single repository. The `_setup_utils.py` file lives at the root of the monorepo and is shared across all sub-packages' `setup.py` files. When PyPI sdists are built for each individual sub-package, `_setup_utils.py` is outside the sub-package directory and therefore not included in the tarball.

      This is a packaging defect upstream in the `autogluon-tabular` sdist on PyPI.

      How to Fix It

      • Patch the sdist: Add a `_setup_utils.py` file into the source tree before the build runs. The file can be obtained from the [AutoGluon GitHub repository](https://github.com/autogluon/autogluon) at the tag matching version 1.5.0 (it will be at the repo root: `_setup_utils.py`). A fromager source override or patch that injects this file into the extracted sdist directory would resolve the build failure.
      • Report upstream: This is a known class of issue with monorepo projects. The upstream project should include `_setup_utils.py` in the sdist's `MANIFEST.in` (or equivalent `pyproject.toml` configuration) so that the sdist is self-contained and buildable without the rest of the monorepo.
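
      The "patch the sdist" option amounts to copying one file into the extracted source tree before the build backend runs. A minimal sketch follows, assuming the helper has already been fetched from the upstream repository at the matching tag; `inject_setup_utils` is a hypothetical name, and how it would be wired into a fromager override is not shown here.

```python
import shutil
from pathlib import Path


def inject_setup_utils(helper_src, sdist_root):
    """Copy _setup_utils.py into an extracted sdist root if it is not already there.

    Hypothetical helper for illustration; returns the destination path.
    """
    helper_src = Path(helper_src)
    dest = Path(sdist_root) / "_setup_utils.py"
    if not helper_src.is_file():
        raise FileNotFoundError(f"helper not found: {helper_src}")
    if not dest.exists():
        # Leave an existing file untouched so a fixed upstream sdist is not overwritten.
        shutil.copy2(helper_src, dest)
    return dest
```

      Skipping the copy when the file already exists keeps the override harmless if a later upstream release ships a self-contained sdist.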

      Packaging Analysis Summary


      Executive Summary: autogluon-tabular Packaging Analysis

      autogluon-tabular is a pure-Python package (complexity: 1/10) distributed as a universal py3-none-any wheel, meaning no native compilation is required. The package is licensed under Apache-2.0, which is fully compatible with Red Hat distribution policies. The target version for onboarding is 1.5.0, supporting Python 3.10–3.12 (Python 3.13 has known issues; 3.14 is unsupported). All four sibling AutoGluon packages (autogluon-common, autogluon-core, autogluon-features, autogluon-tabular) must be built and pinned to the exact same version (==1.5.0).
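
      The same-version requirement for the four sibling packages can be checked mechanically. This is a sketch under the assumption that pins are written as plain `name==version` lines (for example in a constraints file); the helper names are hypothetical.

```python
SIBLINGS = (
    "autogluon-common",
    "autogluon-core",
    "autogluon-features",
    "autogluon-tabular",
)


def sibling_pins(version):
    """Return the exact-version pins for all four sibling packages."""
    return [f"{name}=={version}" for name in SIBLINGS]


def unpinned_siblings(requirement_lines, version):
    """Return sibling names that are not pinned to exactly `version`."""
    seen = {line.strip() for line in requirement_lines}
    return [pin.split("==")[0] for pin in sibling_pins(version) if pin not in seen]
```

      An empty result from `unpinned_siblings(lines, "1.5.0")` indicates that all four siblings are pinned to the target version.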

      The critical blocker for source builds is that the PyPI sdist cannot be built standalone: it is missing _setup_utils.py and the root VERSION file required by setup.py. This is inherent to the monorepo architecture. Source builds must be performed from a full clone of the autogluon/autogluon monorepo (tag v1.5.0), with the environment variable RELEASE=1 set, building subpackages in dependency order: common → core → features → tabular. Since this is a pure-Python package, the pre-built PyPI wheel should be functionally equivalent to what a source build produces (archive metadata such as timestamps may differ) and is a viable fallback strategy if source rebuilding is not strictly required.
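
      The dependency-ordered build can also be driven from Python so that the first failing subpackage stops the run. This is a sketch, assuming the monorepo is already cloned at tag v1.5.0 and that each subpackage directory contains its own `setup.py`; `build_plan` and `run_plan` are hypothetical names.

```python
import os
import subprocess
import sys
from pathlib import Path

# Build order taken from the analysis: common -> core -> features -> tabular.
BUILD_ORDER = ("common", "core", "features", "tabular")


def build_plan(repo_root):
    """Return (working directory, command) pairs in dependency order."""
    root = Path(repo_root)
    cmd = [sys.executable, "setup.py", "sdist", "bdist_wheel"]
    return [(root / pkg, list(cmd)) for pkg in BUILD_ORDER]


def run_plan(plan):
    """Run each build step with RELEASE=1; raise on the first failure."""
    env = dict(os.environ, RELEASE="1")
    for cwd, cmd in plan:
        subprocess.run(cmd, cwd=cwd, env=env, check=True)
```

      Separating the plan from its execution makes the order easy to inspect or log before anything is built, for example with `run_plan(build_plan("autogluon"))`.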

      For CPU-only deployment (AIPCC-10919), the base package plus the lightgbm, catboost, and xgboost extras provide full CPU-optimized ML model support. Avoid the [all] extra, as it pulls in PyTorch (>2GB) via fastai, tabm, and mitra dependencies. Key runtime dependencies are mainstream and well-packaged: numpy, scipy, pandas, scikit-learn, and networkx. There are no system library requirements for the core package.
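
      To verify that a CPU-only environment has not pulled in PyTorch through an extra, the installed distributions can be inspected. A minimal sketch using the standard library's `importlib.metadata`; the function name and the default list of unwanted packages are illustrative assumptions.

```python
from importlib import metadata


def installed_unwanted(names=("torch",)):
    """Return the subset of `names` that is installed in the current environment."""
    found = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
        found.append(name)
    return found
```

      An empty list from `installed_unwanted()` after installation suggests that the chosen extras did not bring in PyTorch.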

      Actionable next steps: (1) Clone the monorepo at tag v1.5.0 and build all four sibling packages with RELEASE=1; (2) install with CPU-focused extras, quoting the requirement so the shell does not interpret the brackets: pip install "autogluon-tabular[lightgbm,catboost,xgboost]==1.5.0"; (3) be aware of open upstream issues, notably #5160 (CPU oversubscription causing 100x slowdown, mitigated by configuring thread limits) and #5521 (fastai import error, already capped in setup.py). No compilation blockers exist for this package.
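
      One common way to configure thread limits is to set the thread-count environment variables read by OpenMP and the usual BLAS backends before the numeric libraries are imported. The sketch below assumes this is an acceptable mitigation for #5160, which the issue text does not confirm; `cap_threads` is a hypothetical helper.

```python
import os

# Environment variables honoured by OpenMP, OpenBLAS, and MKL respectively.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def cap_threads(limit, environ=os.environ):
    """Set each thread-count variable to `limit` unless it is already set."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    for var in THREAD_ENV_VARS:
        # setdefault keeps any value the operator has configured explicitly.
        environ.setdefault(var, str(limit))
    return {var: environ[var] for var in THREAD_ENV_VARS}
```

      The call has to happen before numpy, scikit-learn, or the model libraries are imported, because these variables are typically read when the native libraries load.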

      Build Commands
      git clone --branch v1.5.0 https://github.com/autogluon/autogluon.git
      cd autogluon
      export RELEASE=1
      for pkg in common core features tabular; do
        cd "$pkg" && python setup.py sdist bdist_wheel && cd ..
      done
      # Output: tabular/dist/autogluon_tabular-1.5.0-py3-none-any.whl
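
      The expected output name carries the `py3-none-any` tag, which is what marks the wheel as pure Python. A small sketch that checks this from the filename alone, assuming the standard wheel naming scheme (`name-version[-build]-python-abi-platform.whl`); the helper names are hypothetical.

```python
def wheel_tags(filename):
    """Split a wheel filename into its (python, abi, platform) tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Five components, or six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    return tuple(parts[-3:])


def is_pure_python_wheel(filename):
    """True when the ABI tag is 'none' and the platform tag is 'any'."""
    _python, abi, platform = wheel_tags(filename)
    return abi == "none" and platform == "any"
```

      A build that produces a platform-specific tag instead would be a sign that a native extension was picked up unexpectedly.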
      

              epacific@redhat.com Einat Pacifici
              aipcc-jira-bot@redhat.com AIPCC JIRABOT