Story
Resolution: Done
[Registry] Enable Sparse Manifest List Acceptance
Summary
Modify the registry manifest PUT endpoint so that, when sparse mode is enabled, it accepts manifest lists whose child manifests do not all exist locally. This builds on the existing FEATURE_SPARSE_INDEX configuration and LazyManifestLoader class to store sparse manifest lists while preserving the original digest.
Acceptance Criteria
- [ ] When FEATURE_SPARSE_INDEX=True, manifest lists can be pushed without all child manifests present
- [ ] Original manifest list digest is preserved exactly as pushed
- [ ] When FEATURE_SPARSE_INDEX=False (default), behavior is unchanged (all children required)
- [ ] SPARSE_INDEX_REQUIRED_ARCHS configuration controls which architectures must be present
- [ ] A clear error message is returned when a required architecture is missing
- [ ] Existing full manifest lists continue to work normally
Technical Requirements
Configuration Verification
Files: config.py, util/config/schema.py
The existing configuration already supports sparse indexes (added in commit 7eacbffd1):
# config.py (line 957)
FEATURE_SPARSE_INDEX = False
SPARSE_INDEX_REQUIRED_ARCHS: List[str] = []
Verify these configurations are properly exposed and documented.
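One possible way to check the two keys is a small type check at startup. This is a sketch only: `validate_sparse_config` is a hypothetical helper, not an existing function in config.py or util/config/schema.py, and the rules it enforces are assumptions drawn from the declared types above.

```python
from typing import Any, Dict, List


def validate_sparse_config(app_config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the sparse-index settings.

    Hypothetical helper: the key names and defaults come from the story,
    the checks themselves are assumptions.
    """
    problems: List[str] = []

    # Default mirrors config.py: the feature is off unless set.
    enabled = app_config.get("FEATURE_SPARSE_INDEX", False)
    if not isinstance(enabled, bool):
        problems.append("FEATURE_SPARSE_INDEX must be a boolean")

    # Default mirrors config.py: no architecture is required.
    archs = app_config.get("SPARSE_INDEX_REQUIRED_ARCHS", [])
    if not isinstance(archs, list) or not all(isinstance(a, str) for a in archs):
        problems.append("SPARSE_INDEX_REQUIRED_ARCHS must be a list of strings")

    return problems
```

An empty return value means both keys are absent (defaults apply) or well-typed.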
Manifest PUT Endpoint Enhancement
File: endpoints/v2/manifest.py
The manifest PUT endpoints (write_manifest_by_tagname at line 293, write_manifest_by_digest at line 319) need to handle sparse manifests when the feature is enabled.
Key changes:
1. When parsing manifest lists/indexes, use the sparse-aware LazyManifestLoader
2. Allow manifest creation to succeed even if some child manifests are missing (when allowed by config)
3. Preserve exact manifest bytes for digest consistency
LazyManifestLoader Integration
File: image/shared/schemautil.py
The LazyManifestLoader class (line 60+) already implements sparse index logic:
def _is_sparse_index_enabled(self):
    return self._app_config.get("FEATURE_SPARSE_INDEX", False)

def _get_required_archs(self):
    return self._app_config.get("SPARSE_INDEX_REQUIRED_ARCHS", [])

def _is_architecture_required(self):
    # Returns True if this architecture must be present
    # Returns False if this architecture can be skipped when missing
    ...
Ensure this is properly utilized when processing manifest list pushes.
OCI Index and Docker Manifest List
Files: image/oci/index.py, image/docker/schema2/list.py
Both already use LazyManifestLoader with sparse support. Verify:
- OCIIndex properly handles missing manifests for optional architectures
- DockerSchema2ManifestList properly handles missing manifests for optional architectures
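For reference when verifying both classes: OCI indexes and Docker manifest lists both carry the architecture of each child in `manifests[].platform.architecture`, and in OCI the `platform` object is optional. The sketch below reads those fields straight from the JSON; `ChildEntry` and `list_child_entries` are hypothetical names used for illustration and are not part of OCIIndex or DockerSchema2ManifestList.

```python
import json
from typing import List, NamedTuple, Optional


class ChildEntry(NamedTuple):
    digest: str
    architecture: Optional[str]  # None when the descriptor has no platform object


def list_child_entries(manifest_list_bytes: bytes) -> List[ChildEntry]:
    """Extract (digest, architecture) pairs from a manifest list or OCI index."""
    parsed = json.loads(manifest_list_bytes)
    entries = []
    for descriptor in parsed.get("manifests", []):
        # OCI allows descriptors without a platform; treat that as "unknown".
        platform = descriptor.get("platform") or {}
        entries.append(ChildEntry(descriptor["digest"], platform.get("architecture")))
    return entries
```

How an entry with no architecture should be treated (required or optional) is not stated in this story and would need to be decided during implementation.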
Manifest Model Updates
File: data/model/oci/manifest.py
Ensure manifest creation functions handle the case where child manifests don't exist:
- create_manifest_and_retarget_tag() should work with sparse indexes
- Child manifest references should be stored even if content isn't present locally
Implementation Notes
Existing Patterns to Follow
- Manifest parsing: See _parse_manifest() in endpoints/v2/manifest.py
- LazyManifestLoader: Already implemented in image/shared/schemautil.py
- Feature flag usage: See features.REPO_MIRROR pattern in workers
Key Behaviors
- Sparse mode disabled (default):
  - All child manifests must exist before manifest list can be stored
  - ManifestException raised if any child is missing
  - Current behavior unchanged
- Sparse mode enabled:
  - Required architectures (in SPARSE_INDEX_REQUIRED_ARCHS) must exist
  - Optional architectures can be missing
  - If SPARSE_INDEX_REQUIRED_ARCHS is empty, all architectures are optional
  - Manifest list stored with original bytes/digest
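The behaviors above reduce to one decision per missing child. The following is a minimal sketch of that decision, not the LazyManifestLoader implementation: `find_skippable_children` and `RequiredArchitectureMissing` are hypothetical names, and the exception stands in for the ManifestException named in this story.

```python
from typing import Dict, Iterable, List, Set


class RequiredArchitectureMissing(Exception):
    """Stand-in for the ManifestException this story calls for."""


def find_skippable_children(
    children: Dict[str, str],
    present_digests: Set[str],
    sparse_enabled: bool,
    required_archs: Iterable[str],
) -> List[str]:
    """Return digests of missing children that may be skipped.

    `children` maps child digest -> architecture. Raises if a missing
    child must exist under the current settings.
    """
    required = set(required_archs)
    skipped: List[str] = []
    for digest, arch in children.items():
        if digest in present_digests:
            continue  # child exists locally, nothing to decide
        if not sparse_enabled or arch in required:
            # Sparse mode off: every child is required.
            # Sparse mode on: only the listed architectures are required.
            raise RequiredArchitectureMissing(
                f"Required architecture '{arch}' manifest not found ({digest})"
            )
        skipped.append(digest)
    return skipped
```

With `sparse_enabled=True` and an empty `required_archs`, every missing child lands in the returned list, which matches the "all architectures are optional" rule.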
Digest Preservation
Critical requirement: The manifest list digest must match the original exactly.
# The manifest list bytes must be stored AS-IS
# Do NOT modify or regenerate the manifest list
manifest_bytes = request.data  # Original bytes
digest = sha256_digest(manifest_bytes)  # Must match client's expected digest
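To illustrate why regenerating the manifest list is unsafe, the sketch below parses and re-serializes a small JSON document and compares digests. The `sha256_digest` defined here is a local stand-in built on hashlib so the example is self-contained; it is not necessarily the project's helper of the same name.

```python
import hashlib
import json


def sha256_digest(data: bytes) -> str:
    """Registry-style digest string for raw bytes (local stand-in)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Bytes exactly as a client might send them (compact separators).
original_bytes = b'{"schemaVersion":2,"manifests":[]}'

# Parsing and re-serializing yields equivalent JSON but different bytes,
# because json.dumps inserts a space after ':' and ',' by default.
regenerated_bytes = json.dumps(json.loads(original_bytes)).encode("utf-8")

# Same content once parsed...
assert json.loads(regenerated_bytes) == json.loads(original_bytes)
# ...but different bytes, and therefore a different digest.
assert regenerated_bytes != original_bytes
assert sha256_digest(regenerated_bytes) != sha256_digest(original_bytes)
```

Any step that round-trips the manifest list through a parser before storage risks this mismatch, which is why the request body should be persisted untouched.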
Error Handling
When a required architecture is missing:
raise ManifestException(
    message=f"Required architecture '{arch}' manifest not found",
    detail={
        "architecture": arch,
        "digest": child_digest,
        "required_architectures": required_archs,
    }
)
Dependencies
- None (foundational story, uses existing sparse index implementation)
Testing Requirements
Unit Tests
File: image/shared/test/test_sparse_index.py (extend existing)
The existing test file has comprehensive tests. Extend it with:
def test_sparse_manifest_push_accepted_when_enabled():
    """Test manifest list push succeeds with missing optional architectures."""

def test_sparse_manifest_push_rejected_when_disabled():
    """Test manifest list push fails with missing architectures when disabled."""

def test_required_architecture_must_exist():
    """Test push fails when required architecture is missing."""

def test_digest_preserved_for_sparse_manifest():
    """Test manifest list digest matches original after sparse push."""
Registry Protocol Tests
File: test/registry/ (extend existing)
Add protocol tests for:
- PUT manifest list with missing child manifests
- GET manifest list that is sparse (returns original bytes)
- HEAD manifest list returns correct digest
Integration Tests
Test end-to-end workflow:
1. Push individual architecture manifests for amd64 only
2. Push manifest list referencing amd64, arm64, ppc64le (with FEATURE_SPARSE_INDEX=True)
3. Verify manifest list is stored with original digest
4. Verify amd64 manifest can be pulled
5. Verify that pulling the arm64 and ppc64le manifests returns an appropriate error
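The five steps can be sketched against an in-memory stand-in before wiring them to a real registry. Everything here is hypothetical: `FakeRegistry`, `ManifestUnknown`, `make_child` and `run_sparse_workflow` are illustration names, the fake behaves as if FEATURE_SPARSE_INDEX=True with no required architectures, and the child payloads are placeholders rather than valid image manifests.

```python
import hashlib
import json
from typing import Dict


class ManifestUnknown(Exception):
    """Assumed error type for a manifest that is not stored."""


def digest_of(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


class FakeRegistry:
    """In-memory stand-in; accepts any manifest and stores it byte-for-byte."""

    def __init__(self) -> None:
        self.manifests: Dict[str, bytes] = {}

    def put_manifest(self, data: bytes) -> str:
        digest = digest_of(data)
        self.manifests[digest] = data
        return digest

    def get_manifest(self, digest: str) -> bytes:
        if digest not in self.manifests:
            raise ManifestUnknown(digest)
        return self.manifests[digest]


def make_child(arch: str) -> bytes:
    # Placeholder payload; real image manifests carry config and layers.
    return json.dumps({"schemaVersion": 2, "marker": arch}).encode("utf-8")


def run_sparse_workflow(registry) -> str:
    archs = ("amd64", "arm64", "ppc64le")
    children = {arch: make_child(arch) for arch in archs}

    # Step 1: push only the amd64 manifest.
    amd64_digest = registry.put_manifest(children["amd64"])

    # Step 2: push a manifest list referencing all three architectures.
    manifest_list = json.dumps({
        "schemaVersion": 2,
        "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
        "manifests": [
            {
                "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                "digest": digest_of(data),
                "size": len(data),
                "platform": {"architecture": arch, "os": "linux"},
            }
            for arch, data in children.items()
        ],
    }).encode("utf-8")
    list_digest = registry.put_manifest(manifest_list)

    # Step 3: stored digest and bytes match what was pushed.
    assert list_digest == digest_of(manifest_list)
    assert registry.get_manifest(list_digest) == manifest_list

    # Step 4: the amd64 manifest can be pulled.
    assert registry.get_manifest(amd64_digest) == children["amd64"]

    # Step 5: the children that were never pushed produce an error.
    for arch in ("arm64", "ppc64le"):
        try:
            registry.get_manifest(digest_of(children[arch]))
        except ManifestUnknown:
            continue
        raise AssertionError(f"{arch} manifest unexpectedly present")

    return list_digest
```

Because `run_sparse_workflow` only relies on `put_manifest` and `get_manifest`, the same steps could later be pointed at a thin wrapper around a real registry client.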
Definition of Done
- [ ] Code implemented and follows project conventions
- [ ] All acceptance criteria met
- [ ] Unit tests written and passing
- [ ] Registry protocol tests written and passing
- [ ] Existing tests continue to pass (no regressions)
- [ ] Documentation updated for configuration options
- [ ] Code reviewed and approved