-
Story
-
Resolution: Unresolved
Package 'rapidfuzz' does not build as-is via the AIPCC self-service pipeline and requires builder repository onboarding.
Build Failure Summary
Root Cause Analysis: `rapidfuzz` Build Failure
Summary
The build never reached the compilation stage for `rapidfuzz`. It failed during the requirements preparation phase due to an invalid version specifier in the requirements file.
Root Cause
The file `/collection-repository/collections/torch-2.9.0/cpu-ubi9/requirements.txt` contains the line:
rapidfuzz==any
The version string `any` is not a valid PEP 440 version specifier. The `packaging` library's requirement parser rejects it:
packaging._tokenizer.ParserSyntaxError: Expected semicolon (after name with no version specifier) or end
rapidfuzz==any
^
The `prepare-requirements` tool calls `Requirement("rapidfuzz==any")`, which raises an `InvalidRequirement` exception, causing the entire pipeline to abort at line 0 of the requirements file.
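The rejection can be reproduced outside the pipeline with a few lines of Python. This is a minimal sketch, not the pipeline's own code: `find_invalid_requirements` is a hypothetical helper, it assumes a recent `packaging` release that no longer accepts legacy version strings, and it treats every non-blank, non-comment line as a bare requirement (pip options such as `-r` are not handled).

```python
from packaging.requirements import InvalidRequirement, Requirement


def find_invalid_requirements(text: str) -> list[tuple[int, str, str]]:
    """Return (zero_based_index, line, error) for each line that fails to parse.

    Hypothetical helper for illustration; simplified comment handling.
    """
    problems = []
    for index, raw in enumerate(text.splitlines()):
        line = raw.split("#", 1)[0].strip()  # drop trailing comments and whitespace
        if not line:
            continue
        try:
            Requirement(line)
        except InvalidRequirement as exc:
            problems.append((index, line, str(exc)))
    return problems


# The failing line from the ticket, followed by the three valid alternatives.
sample = "rapidfuzz==any\nrapidfuzz\nrapidfuzz==3.12.2\nrapidfuzz>=3.0.0,<4.0.0\n"
for index, line, error in find_invalid_requirements(sample):
    print(f"line {index}: {line!r} rejected")
```

Running a check like this against a collection's requirements file before submitting it would surface the `==any` placeholder without waiting for a pipeline run.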
Fix
Replace `rapidfuzz==any` in the requirements file with a valid specifier:
- To allow any version (unpinned): simply use `rapidfuzz` with no version operator
- To pin to a specific version: use a valid PEP 440 version, e.g. `rapidfuzz==3.12.2`
- To set a range: use standard operators, e.g. `rapidfuzz>=3.0.0,<4.0.0`
For example, edit `/collection-repository/collections/torch-2.9.0/cpu-ubi9/requirements.txt` and change:
- rapidfuzz==any
+ rapidfuzz
or pin to a concrete version if reproducibility is required:
- rapidfuzz==any
+ rapidfuzz==3.12.2
Key Details
- Failure location: `prepare-requirements` step, before any wheel building begins
- Failing tool: `package_plugins.cli.prepare_requirements_constraints:parse_requirements_file` (lines 44-46)
- Input file: `/collection-repository/collections/torch-2.9.0/cpu-ubi9/requirements.txt`, line 0
- The `==any` syntax appears to be a placeholder or convention from another system that is not compatible with PEP 440 / Python `packaging` library parsing
Packaging Analysis Summary
Executive Summary: RapidFuzz Packaging Analysis
RapidFuzz (v3.14.3) is a high-performance fuzzy string matching library with ~122M monthly PyPI downloads, licensed under MIT with no redistribution concerns. The package requires a source build with C++ extensions to deliver production-grade performance — without them, the pure Python fallback degrades throughput by 10-100x. The build system uses scikit-build-core with a CMake backend and requires a C++17 compiler, CMake >= 3.15, and Python development headers. Critically, the sdist on PyPI includes pre-generated Cython .cxx files, which eliminates the Cython build dependency and simplifies the build pipeline.
There are no blocking dependencies or known build issues for Linux x86_64. All previously reported build failures (PEP 517 sdist builds, missing CMake dependency detection, libc++-19 compatibility) have been resolved in v3.14.3. The package has zero mandatory runtime dependencies; numpy is optional and only needed for matrix operations. Build requirements are minimal: gcc-c++, cmake, python3-devel, and scikit-build-core. The vendored C++ libraries (rapidfuzz-cpp, taskflow) are both MIT-licensed and header-only, with no external system library dependencies beyond the standard C++ runtime and libatomic.
The single most important configuration for packaging is setting the environment variable RAPIDFUZZ_BUILD_EXTENSION=true. Without this, a failed C++ compilation will silently produce a pure Python wheel with severely degraded performance. With this flag, the build fails loudly on compilation errors, which is the desired behavior for controlled packaging environments. Since RapidFuzz performs pure CPU string matching with no GPU code, a single wheel serves all three index targets (CPU, CUDA, ROCm) identically — no index-specific builds are required.
Key Build Command
dnf install gcc-c++ cmake python3-devel
export RAPIDFUZZ_BUILD_EXTENSION=true
pip wheel --no-binary :all: rapidfuzz==3.14.3
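If the build is driven from a Python wrapper rather than a shell, the same invocation could be sketched as follows. The helper names (`build_command`, `build_env`, `run_build`) and the `dist` output directory are assumptions made for illustration; only the environment variable, the pip arguments, and the version come from the command above.

```python
import os
import subprocess
import sys


def build_command(version: str = "3.14.3", wheel_dir: str = "dist") -> list[str]:
    """Assemble the pip invocation that forces a source build of rapidfuzz."""
    return [
        sys.executable, "-m", "pip", "wheel",
        "--no-binary", ":all:",
        "--wheel-dir", wheel_dir,  # hypothetical output directory
        f"rapidfuzz=={version}",
    ]


def build_env(base=None) -> dict:
    """Copy the environment and require the C++ extension to compile."""
    env = dict(os.environ if base is None else base)
    env["RAPIDFUZZ_BUILD_EXTENSION"] = "true"
    return env


def run_build() -> None:
    """Run the build; check=True raises CalledProcessError if pip fails."""
    subprocess.run(build_command(), env=build_env(), check=True)
```

Calling `run_build()` performs the build. Passing the variable through `env` rather than relying on the caller's shell keeps the fail-loudly setting attached to the build step itself.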
Validation Checks
- import rapidfuzz succeeds
- rapidfuzz.fuzz.ratio("test", "test") returns 100.0
- import rapidfuzz.fuzz_cpp does not raise ImportError (confirms C++ extensions are present)
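The three checks above could be bundled into one smoke test run against the installed wheel. This is a sketch under the assumption that the module names listed in the checks are accurate for the packaged version; `run_checks` and the result labels are hypothetical.

```python
import importlib


def run_checks() -> dict[str, bool]:
    """Run the three validation checks and report each as pass/fail."""
    results = {
        "import rapidfuzz": False,
        "fuzz.ratio returns 100.0": False,
        "C++ extension present": False,
    }
    try:
        fuzz = importlib.import_module("rapidfuzz.fuzz")
    except ImportError:
        return results  # package missing: every check fails
    results["import rapidfuzz"] = True
    results["fuzz.ratio returns 100.0"] = fuzz.ratio("test", "test") == 100.0
    try:
        # Module name taken from the validation checks above.
        importlib.import_module("rapidfuzz.fuzz_cpp")
        results["C++ extension present"] = True
    except ImportError:
        pass  # pure Python fallback was installed
    return results


for name, passed in run_checks().items():
    print(f"{'PASS' if passed else 'FAIL'}: {name}")
```

A pipeline step could treat any `FAIL` line as a reason to reject the wheel, since a missing extension means the pure Python fallback was packaged.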
Key Findings
- License: MIT — fully compliant for redistribution, including vendored dependencies
- Blockers: None identified for Linux x86_64
- Risk: Low — mature build system, active maintenance, comprehensive CI across Python 3.10–3.14
- Single wheel per architecture: No GPU dependencies, one build covers CPU/CUDA/ROCm indexes
- is blocked by: AIPCC-10821 Add rapidfuzz into the RHAI pipeline onboarding collection (In Progress)