AI Platform Core Components
AIPCC-5839

builder: its-hub 0.3.1 package update request

    • Status: To Do
    • Parent issue: AIPCC-5898 - build its-hub wheels
    • Progress: 0% To Do, 0% In Progress, 100% Done

      Installation Instructions

      1. CPU installation (default)
         pip install its-hub
      2. CUDA installation (with vLLM support)
         pip install 'its-hub[vllm]'
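To check which variant is actually installed at runtime, a small probe can help (a hedged sketch, not part of its-hub; it only tests for the presence of the vllm module):

```python
# Sketch: detect whether the CUDA extra ([vllm]) is present, without importing
# vllm itself. importlib.util.find_spec only locates the module spec, so this
# check is cheap and safe on CPU-only machines.
import importlib.util

def cuda_extras_installed() -> bool:
    return importlib.util.find_spec("vllm") is not None

print("CUDA build ([vllm] extra)" if cuda_extras_installed() else "CPU build")
```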

      Requested Package Name and Version

      its-hub 0.3.1

      dependencies = [
          "openai>=1.75.0",
          "tqdm>=4.65.0",
          "typing-extensions>=4.0.0",
          "reward-hub==0.1.5",
          "transformers==4.53.2",  # Pin to exact version that worked in CI to avoid aimv2 config conflict with vLLM 0.9.1
          "backoff>=2.2.0",
          "click>=8.1.0",
          "fastapi>=0.115.0",
          "uvicorn<0.30.0",
          "pydantic>=2.0.0",
          "numpy>=1.24.0",
          "requests>=2.28.0",
          "aiohttp>=3.8.0",
          "litellm>=1.0.0",
      ]
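As a quick illustration of what these pins mean (illustrative only; this uses naive tuple comparison on dotted versions, not a full PEP 440 parser):

```python
# Illustrative only: show the effect of the pins above.
def ver(v):
    """Turn a dotted version string into a comparable tuple, e.g. (0, 29, 2)."""
    return tuple(int(part) for part in v.split("."))

# "uvicorn<0.30.0": the bound is exclusive, so 0.30.0 itself is rejected.
assert ver("0.29.2") < ver("0.30.0")
assert not ver("0.30.0") < ver("0.30.0")

# "transformers==4.53.2": an exact pin accepts only that version.
assert ver("4.53.2") == ver("4.53.2")
assert ver("4.53.3") != ver("4.53.2")
```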

      Optional extras (for CUDA builds):

      pip install 'its-hub[vllm]'

      Repository:
      🔗 https://github.com/Red-Hat-AI-Innovation-Team/its_hub
      📦 Release v0.3.1


      Brief Explanation for Request

      A new version of the its-hub package has been released, and we would like to update the package builds (both CPU and CUDA) to this version in the RH PyPI mirror index(es). Please refer to the parent issue for the original build request.

      Summary of Changes

      • Resolved CPU build issues: vLLM has been made optional, removing the GPU dependency barrier for CPU environments.
      • Improved installation flexibility:
        • CPU build: pip install its-hub
        • CUDA build: pip install 'its-hub[vllm]'
      • Dependencies cleaned up: Removed accelerate and reduced non-essential packages.
      • Enhanced functionality:
        • Added Inference-Time-Scaling (ITS) as a configurable API endpoint.
        • Updated notebooks to demonstrate ITS self-consistency algorithms.

      These changes ensure broader deployability of ITS-Hub in enterprise and on-prem environments where GPU hardware may not be available.
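The optional-vLLM change described above follows the usual lazy-import pattern; a minimal sketch, assuming nothing about its-hub's real internals (the backend names here are hypothetical):

```python
# Hypothetical sketch of making a GPU dependency optional; its-hub's actual
# code may differ. The import is attempted once, and callers branch on the flag.
try:
    import vllm  # available only when installed via its-hub[vllm]
    _HAS_VLLM = True
except ImportError:
    _HAS_VLLM = False

def select_backend() -> str:
    """Use the vLLM path when available, otherwise fall back to the CPU path."""
    return "vllm" if _HAS_VLLM else "openai-compatible-api"
```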


      QE User Acceptance Tests

      Objective:
      Validate that inference-time-scaling (ITS) works consistently across both CPU and CUDA builds, with correct accuracy-compute tradeoff behavior.

      Testing Focus:

      • ITS inference accuracy scaling via API endpoint
      • Compatibility with FastAPI and litellm
      • Correct operation in CPU-only mode (no vLLM dependency)
      • Optional CUDA path validation with [vllm] extras

      Expected Outcome:
      ITS endpoints should perform deterministically and allow users to adjust compute scaling during inference for improved accuracy.
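Self-consistency, one of the ITS algorithms the updated notebooks demonstrate, can be sketched as a majority vote over sampled answers (illustrative only; this is not the its-hub API):

```python
from collections import Counter

def self_consistency(candidate_answers):
    """Return the most frequent answer among N sampled candidates.

    Increasing N (the inference-time compute budget) tends to improve
    accuracy, which is the accuracy-compute tradeoff QE should observe.
    """
    return Counter(candidate_answers).most_common(1)[0][0]

# Two of three sampled chains agree on "42", so the vote selects it.
print(self_consistency(["42", "41", "42"]))  # prints 42
```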


      Package License

      License: Apache License 2.0
      Compliance: Approved per Fedora Allowed Licenses List

       

       

              rh-ee-alustosa Andre Lustosa Cabral de Paula Motta
              rh-ee-gxxu GX Xu
              Doug Hellmann
              Frank's Team
              Votes: 0
              Watchers: 4
