-
Bug
-
Resolution: Unresolved
-
Minor
[3351641950] Upstream Reporter: Doug Hellmann
Upstream description:
We have seen some failures in fromager's bootstrap command due to a race condition with re-resolving top-level dependencies.
The error is
```
2025-08-21 15:13:01,699 DEBUG:fromager.__main__:258: llama_stack_provider_lmeval: could not handle toplevel dependency llama_stack_provider_lmeval (0.2.2)
Traceback (most recent call last):
File "/opt/app-root/lib64/python3.12/site-packages/fromager/bootstrapper.py", line 488, in _handle_build_requirements
self.bootstrap(req=dep, req_type=build_type)
File "/opt/app-root/lib64/python3.12/site-packages/fromager/bootstrapper.py", line 135, in bootstrap
self._add_to_graph(req, req_type, resolved_version, source_url)
File "/opt/app-root/lib64/python3.12/site-packages/fromager/bootstrapper.py", line 934, in _add_to_graph
self.ctx.dependency_graph.add_dependency(
File "/opt/app-root/lib64/python3.12/site-packages/fromager/dependency_graph.py", line 235, in add_dependency
raise ValueError(
ValueError: Trying to add setuptools==80.8.0 to parent llama-stack-provider-lmeval==0.2.2 but llama-stack-provider-lmeval==0.2.2 does not exist

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/opt/app-root/lib64/python3.12/site-packages/fromager/__main__.py", line 256, in invoke_main
main(auto_envvar_prefix="FROMAGER")
File "/opt/app-root/lib64/python3.12/site-packages/click/core.py", line 1442, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/app-root/lib64/python3.12/site-packages/click/core.py", line 1363, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/opt/app-root/lib64/python3.12/site-packages/click/core.py", line 1830, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/app-root/lib64/python3.12/site-packages/click/core.py", line 1226, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/app-root/lib64/python3.12/site-packages/click/core.py", line 794, in invoke
return callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/app-root/lib64/python3.12/site-packages/click/decorators.py", line 46, in new_func
return f(get_current_context().obj, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/app-root/lib64/python3.12/site-packages/click/decorators.py", line 34, in new_func
return f(get_current_context(), *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/app-root/lib64/python3.12/site-packages/fromager/commands/bootstrap.py", line 482, in bootstrap_parallel
ctx.invoke(
File "/opt/app-root/lib64/python3.12/site-packages/click/core.py", line 794, in invoke
return callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/app-root/lib64/python3.12/site-packages/click/decorators.py", line 46, in new_func
return f(get_current_context().obj, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/app-root/lib64/python3.12/site-packages/fromager/commands/bootstrap.py", line 184, in bootstrap
bt.bootstrap(req, requirements_file.RequirementType.TOP_LEVEL)
File "/opt/app-root/lib64/python3.12/site-packages/fromager/bootstrapper.py", line 256, in bootstrap
self._prepare_build_dependencies(req, sdist_root_dir, build_env)
File "/opt/app-root/lib64/python3.12/site-packages/fromager/bootstrapper.py", line 433, in _prepare_build_dependencies
self._handle_build_requirements(
File "/opt/app-root/lib64/python3.12/site-packages/fromager/bootstrapper.py", line 490, in _handle_build_requirements
raise ValueError(f"could not handle {self._explain}") from err
ValueError: could not handle toplevel dependency llama_stack_provider_lmeval (0.2.2)
```
I think the problem here is a race condition in resolving `llama_stack_provider_lmeval`.
Looking at the logs, I see it first pick up version 0.2.1 and then later 0.2.2. I recently made a change that causes the bootstrapper to always use the same version for a given requirement specifier, so that inconsistency should be eliminated.
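To illustrate the idea of pinning one version per requirement specifier, here is a minimal sketch. It is not fromager's actual implementation: `ResolutionCache` and the fake resolver are hypothetical names, and the cache key is assumed to be the raw requirement string.

```python
from typing import Callable


class ResolutionCache:
    """Hypothetical helper: remember the first version chosen per specifier."""

    def __init__(self, resolve: Callable[[str], str]) -> None:
        self._resolve = resolve
        self._seen: dict[str, str] = {}

    def resolve(self, requirement: str) -> str:
        # Call the underlying resolver only the first time a specifier is seen.
        if requirement not in self._seen:
            self._seen[requirement] = self._resolve(requirement)
        return self._seen[requirement]


# Simulate an index that publishes 0.2.2 between two lookups (assumed scenario).
answers = iter(["0.2.1", "0.2.2"])
cache = ResolutionCache(lambda requirement: next(answers))

first = cache.resolve("llama_stack_provider_lmeval")
second = cache.resolve("llama_stack_provider_lmeval")
print(first, second)  # both lookups report the same version
```

With a cache like this, a second lookup of the same specifier cannot drift to a newer release in the middle of a run.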
It's troubling that the 0.2.2 version wasn't automatically added to the graph, though. I think in this case that's because `llama_stack_provider_lmeval` is a top-level dependency, and those are added to the graph outside of the bootstrapper, when the bootstrap command starts up and resolves them all. Later, when the requirement is resolved again, it gets a different answer, and that version is not already in the graph.
In bootstrapper.py, `_add_to_graph` returns early if the dependency type is top-level, because it assumes those packages are already in the graph. I think that's a mistake; it needs a more careful check that includes the version number.
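A rough sketch of what a version-aware check could look like, using a simplified stand-in for the graph. `MiniGraph`, `add_top_level`, and `try_build` are hypothetical names for illustration only and do not mirror fromager's real classes or signatures.

```python
Node = tuple[str, str]  # (name, version)


class MiniGraph:
    """Hypothetical stand-in for a dependency graph keyed by name and version."""

    def __init__(self) -> None:
        self.nodes: set[Node] = set()
        self.edges: set[tuple[Node, Node]] = set()

    def add_dependency(self, parent: Node, child: Node) -> None:
        # Mirrors the failure in the traceback: the parent must already exist.
        if parent not in self.nodes:
            raise ValueError(f"{parent} does not exist")
        self.nodes.add(child)
        self.edges.add((parent, child))


def add_top_level(graph: MiniGraph, node: Node, check_version: bool) -> None:
    if check_version:
        # Careful check: the exact (name, version) pair must be present.
        already_present = node in graph.nodes
    else:
        # Name-only check: assumes any known name means the node exists.
        already_present = node[0] in {name for name, _ in graph.nodes}
    if not already_present:
        graph.nodes.add(node)


def try_build(check_version: bool) -> bool:
    graph = MiniGraph()
    # Startup resolution put 0.2.1 in the graph.
    graph.nodes.add(("llama-stack-provider-lmeval", "0.2.1"))
    # A later re-resolution returned 0.2.2.
    parent = ("llama-stack-provider-lmeval", "0.2.2")
    add_top_level(graph, parent, check_version)
    try:
        graph.add_dependency(parent, ("setuptools", "80.8.0"))
    except ValueError:
        return False
    return True


print(try_build(check_version=False))  # name-only check fails to add the parent
print(try_build(check_version=True))   # version-aware check adds it first
```

In this toy model, the name-only check reproduces the "does not exist" error, while the version-aware check adds the missing 0.2.2 node before its build dependencies are attached.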
- is duplicated by AIPCC-4312 Investigate 'builder' bootstrap failures for recently updated packages (Closed)