  1. AI Platform Core Components
  2. AIPCC-10921

Add autogluon-tabular into the RHAI pipeline onboarding collection

    • Type: Story
    • Resolution: Unresolved

      Add package 'autogluon-tabular' into the RHAI pipeline onboarding collection.

      The package requires builder repository onboarding before it can be added to the RHAI pipeline. This ticket is blocked by the builder onboarding ticket.

      Summary


      Executive Summary: autogluon-tabular Packaging Analysis

      autogluon-tabular is a pure-Python package (complexity: 1/10) distributed as a universal py3-none-any wheel, meaning no native compilation is required. The package is licensed under Apache-2.0, which is fully compatible with Red Hat distribution policies. The target version for onboarding is 1.5.0, supporting Python 3.10–3.12 (Python 3.13 has known issues; 3.14 is unsupported). All four sibling AutoGluon packages (autogluon-common, autogluon-core, autogluon-features, autogluon-tabular) must be built and pinned to the exact same version (==1.5.0).
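      The exact-version pinning described above can be sketched as a small helper that prints one pin per sibling package. The function name is hypothetical and not part of any AutoGluon or RHAI tooling; only the package names and the version come from this ticket.

```shell
# Print exact-version pins for the four sibling AutoGluon packages.
# NOTE: pin_autogluon_siblings is an illustrative helper name (an assumption),
# not an existing tool.
pin_autogluon_siblings() {
  local version="$1"
  local pkg
  for pkg in common core features tabular; do
    printf 'autogluon-%s==%s\n' "$pkg" "$version"
  done
}

# Example call: prints four lines, one per package.
pin_autogluon_siblings 1.5.0
```

      The output could be redirected into a constraints file and passed to pip with the -c option, so that all four packages resolve to the same version.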

      The critical blocker for source builds is that the PyPI sdist cannot be built standalone: it is missing _setup_utils.py and the root VERSION file, both of which setup.py requires. This is inherent to the monorepo architecture. Source builds must be performed from a full clone of the autogluon/autogluon monorepo (tag v1.5.0), with the environment variable RELEASE=1 set, building subpackages in dependency order: common → core → features → tabular. Since this is a pure-Python package, the pre-built PyPI wheel should be functionally equivalent to what a source build produces (byte-for-byte identity is not guaranteed, because wheel archives record file timestamps), so it is a viable fallback strategy if source rebuilding is not strictly required.
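      A pre-flight check for the two files named above might look like the following sketch. The relative path used for _setup_utils.py is an assumption about the monorepo layout (the ticket only names the file), and the function name is hypothetical; adjust both to match the actual checkout.

```shell
# Report which monorepo-level build inputs are missing from a source tree.
# ASSUMPTION: _setup_utils.py lives under core/src/autogluon/core/ in the
# monorepo; this ticket names the file but not its location.
missing_build_inputs() {
  local root="$1"
  local rel
  local status=0
  for rel in VERSION core/src/autogluon/core/_setup_utils.py; do
    if [ ! -f "$root/$rel" ]; then
      printf 'missing: %s\n' "$rel"
      status=1
    fi
  done
  # Returns 0 only when every required file is present.
  return "$status"
}
```

      Running missing_build_inputs against an unpacked PyPI sdist would be expected to report both files as missing, while a full monorepo clone should pass, provided the assumed path is right.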

      For CPU-only deployment (AIPCC-10919), the base package plus the lightgbm, catboost, and xgboost extras provide full CPU-optimized ML model support. Avoid the [all] extra, as it pulls in PyTorch (>2GB) via fastai, tabm, and mitra dependencies. Key runtime dependencies are mainstream and well-packaged: numpy, scipy, pandas, scikit-learn, and networkx. There are no system library requirements for the core package.
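      One way to confirm that PyTorch was not pulled in is to scan the installed package list after installation. The sketch below reads pip freeze style lines on standard input; the function name is hypothetical, and the pattern assumes the usual "name==version" or "name @ url" line formats.

```shell
# Fail if a `pip freeze`-style listing on stdin contains the torch package.
# NOTE: assert_no_torch is an illustrative helper name (an assumption).
assert_no_torch() {
  # Match "torch==..." or "torch @ ..." only, not torchvision, torchmetrics, etc.
  if grep -Eiq '^torch(==|[[:space:]]*@)'; then
    echo 'torch found: the [all] extra or another torch-dependent extra may be installed' >&2
    return 1
  fi
}
```

      A typical invocation would be pip freeze | assert_no_torch in the target environment.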

      Actionable next steps: (1) Clone the monorepo at tag v1.5.0 and build all four sibling packages with RELEASE=1; (2) install with CPU-focused extras: pip install "autogluon-tabular[lightgbm,catboost,xgboost]==1.5.0" (quoted so that the shell does not interpret the square brackets); (3) be aware of open upstream issues, notably #5160 (CPU oversubscription causing a 100x slowdown, mitigated by configuring thread limits) and #5521 (fastai import error, already capped in setup.py). No compilation blockers exist for this package.
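      The thread-limit mitigation mentioned for #5160 could be applied by exporting the usual thread-pool environment variables before starting Python. This is a sketch under assumptions: the ticket does not say which limits are needed, the variables below are the common knobs for OpenMP, OpenBLAS, and MKL, and both the function name and the default of 4 are illustrative.

```shell
# Cap native thread pools for processes started from this shell.
# ASSUMPTION: these three variables cover the relevant libraries; the ticket
# does not specify which settings mitigate issue #5160.
limit_threads() {
  local n="${1:-4}"   # illustrative default, not a recommendation from the ticket
  export OMP_NUM_THREADS="$n"
  export OPENBLAS_NUM_THREADS="$n"
  export MKL_NUM_THREADS="$n"
}

# Example: limit each library to 4 threads for subsequent commands.
limit_threads 4
```

      A reasonable starting value is the number of CPU cores actually allocated to the container or job, rather than the host's total core count.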

      Build Commands
      git clone --branch v1.5.0 https://github.com/autogluon/autogluon.git
      cd autogluon
      export RELEASE=1
      for pkg in common core features tabular; do
        cd "$pkg" && python setup.py sdist bdist_wheel && cd ..
      done
      # Output: tabular/dist/autogluon_tabular-1.5.0-py3-none-any.whl
      

              epacific@redhat.com Einat Pacifici
              aipcc-jira-bot@redhat.com AIPCC JIRABOT