    • Product / Portfolio Work
    • None
    • 67% To Do, 0% In Progress, 33% Done

      Feature Overview (aka. Goal Summary)

      Accelerate container startup times by enabling lazy image pulling in OpenShift through a plugin-based architecture for CRI-O. This allows containers to start before the entire image is downloaded, addressing the problem where large AI/ML workload images cause significant delays in container boot time.

      Key insight: Image pull operations account for roughly 76% of container startup time.

      Target: OpenShift 4.22 (Tech Preview)
      Approach: eStargz via stargz-store (proven from OCPNODE-2204)
      Previous Attempt: OCPNODE-2204 (Support lazy image pull via stargz - Stale/Abandoned)


      Problem Statement

      When creating containers in OpenShift, the entire image must be pulled from a registry before the container can start. For large images (common in AI/ML workloads), this creates significant delays:

      • Large AI/ML model images (multi-GB) take minutes to pull completely
      • Containers cannot start until 100% of the image is downloaded
      • This delays application availability and impacts user experience

      Impact: Slow pod startup times for AI/ML workloads, poor user experience, reduced agility

      Competition: AWS Fargate (SOCI), AWS ECS, and other cloud providers already offer lazy pulling capabilities.


      Solution Overview

      Implement lazy image pulling for CRI-O using the eStargz format via the stargz-store plugin.

      Core Capabilities:

      1. Container starts after downloading only required chunks
      2. Additional image data fetched on-demand during runtime
      3. Workload-based optimization to prefetch likely-accessed files
      4. Compatible with OCI/Docker images and standard container registries

      What We Need to Build:

      • OpenShift API - ContainerRuntimeConfig extension for lazy pull configuration
        • Enable/disable lazy pulling
        • Configure storage plugin settings
        • Extensible design to support additional formats in future releases
      • MCO Integration - Translate API to CRI-O/storage configuration
        • Generate storage.conf with Additional Layer Store settings
        • Note: Customers bring their own storage plugin binaries (e.g., stargz-store)
        • MCO only manages configuration, not plugin deployment
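
      As a rough illustration of the MCO translation step described above, the sketch below renders the Additional Layer Store entry for storage.conf from a lazy-pull setting. The LazyPullConfig class, its field names, and the default store path are hypothetical and only stand in for whatever the OpenShift API ends up defining. The ":ref" suffix reflects how stargz-store is usually configured for containers/storage and should be verified against the plugin version in use.

```python
from dataclasses import dataclass


@dataclass
class LazyPullConfig:
    """Hypothetical lazy-pull settings; not an existing OpenShift API type."""
    enabled: bool
    # Assumed path where the customer-installed store plugin exposes layers.
    store_path: str = "/var/lib/stargz-store/store"
    # When True, append ":ref" so the store is looked up by image reference.
    with_reference: bool = True


def render_storage_options(cfg: LazyPullConfig) -> str:
    """Render the [storage.options] fragment of storage.conf as TOML text."""
    if not cfg.enabled:
        # Lazy pulling disabled: emit an empty Additional Layer Store list.
        return "[storage.options]\nadditionallayerstores = []\n"
    entry = cfg.store_path + (":ref" if cfg.with_reference else "")
    return f'[storage.options]\nadditionallayerstores = ["{entry}"]\n'


if __name__ == "__main__":
    print(render_storage_options(LazyPullConfig(enabled=True)))
```

      The sketch only produces configuration text. Installing the plugin binary remains the customer's responsibility under the approach described here.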

      Implementation Path:

      The Additional Layer Store API already provides the plugin architecture we need. We don't need to build a new plugin system or proxy layer. We need to:

      • Design OpenShift API to be extensible for different storage plugins (eStargz, Nydus, etc.)
      • Integrate with MCO for configuration management (not deployment)
      • Document compatible storage plugins and customer installation procedures


      Previous Attempt: OCPNODE-2204

      The previous attempt (OCPNODE-2204) focused on stargz-snapshotter integration but was abandoned. Some of its artifacts may still be useful for reference.


      Research Findings

      Lazy-Pulling Solutions for CRI-O

      CRI-O has had experimental lazy-pull support since v1.22 through the Additional Layer Store plugin mechanism.

      Recommended for CRI-O: eStargz via stargz-store.

      Alternatives not viable:

      • Nydus: CRI-O plugin exists but dormant since Aug 2022, no production evidence
      • SOCI: Containerd-only architecture, not compatible with CRI-O
      • zstd:chunked: Partial pulling only, not true lazy pulling

      CRI-O Extension Points

      Additional Layer Store API (Primary Extension Point):

      • Location: container-libs/storage library (used by CRI-O)
      • Status: Experimental (can change without major version bump)
      • Used by: Stargz Store plugin (proven)
      • API: LookupAdditionalLayer(tocDigest digest.Digest, imageref string) (AdditionalLayer, error)

      CRI-O Work Required:

      None. CRI-O already reads the Additional Layer Store configuration from /etc/containers/storage.conf automatically. The API exists and works - it's just marked as experimental.

      Work is needed in container-libs/storage, not CRI-O

      Image Format

      eStargz - Chosen for initial implementation:

      • Mature tooling, used in OCPNODE-2204 experimental work
      • No chunk verification on CRI-O (security/performance tradeoff)
      • Requires image conversion
      • API designed to support additional formats in future releases

      Registry Requirements

      HTTP Range Request Support Required for lazy pulling:

      Lazy pulling technologies (eStargz, Nydus) require HTTP range requests to function. The mechanism works by:

      1. Fetching the TOC (Table of Contents) from the layer footer using a range request
      2. Issuing range requests to download only required chunks when files are accessed
      3. Without range requests, the entire layer must be downloaded (defeating lazy pulling)
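
      The range-request steps above can be sketched as small helpers that build the Range header values and check whether a registry honored them. The function names are hypothetical, and the 51-byte default footer size is an assumption about the eStargz footer that should be checked against the eStargz specification. Nothing here contacts a registry.

```python
def footer_range(layer_size: int, footer_size: int = 51) -> str:
    """Range header value covering the last footer_size bytes of a layer blob.

    The footer points at the TOC, so it is the first thing a lazy puller reads.
    The default footer_size is an assumption, not a value taken from this document.
    """
    if footer_size <= 0 or layer_size < footer_size:
        raise ValueError("layer is smaller than the footer")
    return f"bytes={layer_size - footer_size}-{layer_size - 1}"


def chunk_range(offset: int, length: int) -> str:
    """Range header value for one chunk; HTTP byte ranges are inclusive."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return f"bytes={offset}-{offset + length - 1}"


def range_honored(status_code: int) -> bool:
    """True when the server answered with 206 Partial Content.

    A 200 response means the server ignored the Range header and sent the
    whole blob, which is the case that defeats lazy pulling.
    """
    return status_code == 206


if __name__ == "__main__":
    print(footer_range(1000))
    print(chunk_range(0, 4096))
    print(range_honored(200))
```

      A check like range_honored could be part of the registry validation work, for example when testing Quay with different storage backends.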

      Reference: stargz-snapshotter documentation states it "combines its image format with HTTP Range Request supported by OCI Distribution Spec and Docker Registry API to selectively download file entries from registries."

      Registry compatibility:

      Registry | Lazy Pull Support | Notes
      Docker Hub | Yes | OCI Distribution v2 spec compliant
      GitHub Container Registry | Yes | Full support
      Amazon ECR | Yes | Native SOCI support (though we can't use it)
      Quay / Red Hat Quay | Yes | Range request support depends on storage backend
      Harbor | Yes | Additional acceld integration available
      Google Artifact Registry | Yes | Full support

      Performance Expectations

      eStargz on CRI-O:

      • Performance data needs to be gathered through validation and testing
      • Image pull operations account for 76% of container startup time (opportunity space)
      • Actual improvement depends on image size, network conditions, and workload patterns
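
      To make the "opportunity space" concrete, the sketch below applies a simple Amdahl-style estimate: if pulling is a fixed fraction of startup time, the overall speedup is bounded by how much of that pull time is removed from the critical path. The function name and the pull_reduction parameter are illustrative assumptions. This is arithmetic on the 76% figure, not measured data.

```python
def startup_speedup(pull_fraction: float, pull_reduction: float) -> float:
    """Estimated overall startup speedup.

    pull_fraction:  share of startup time spent pulling the image (0..1).
    pull_reduction: share of that pull time removed from the critical
                    path by lazy pulling (0..1), a hypothetical input.
    """
    if not (0.0 <= pull_fraction <= 1.0 and 0.0 <= pull_reduction <= 1.0):
        raise ValueError("both inputs must be within [0, 1]")
    remaining = (1.0 - pull_fraction) + pull_fraction * (1.0 - pull_reduction)
    if remaining == 0.0:
        raise ValueError("startup time would be zero; speedup is unbounded")
    return 1.0 / remaining


if __name__ == "__main__":
    # Upper bound when the pull disappears entirely from the critical path.
    print(round(startup_speedup(0.76, 1.0), 2))  # 4.17
    # Halving the pull time gives a far smaller gain.
    print(round(startup_speedup(0.76, 0.5), 2))  # 1.61
```

      Even in the ideal case the estimate tops out at roughly 4x, and on-demand fetches at runtime add latency that this model ignores, which is why real measurements are still needed before setting public targets.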

      Upstream Contributions

      What Upstream Work is Required?

      container-libs/storage (Additional Layer Store API):

      • Stabilize Additional Layer Store API (High Priority)
        • Current status: Experimental, can change without major version bump
        • Known issues: Mount/unmount locking problems, can break storage
        • Impact: Required for production-ready lazy pulling
        • Effort: Requires upstream community agreement and testing
      • Enable chunk verification for eStargz (Medium Priority)
        • Current status: Disabled on CRI-O (only works with containerd)
        • Security impact: Without verification, corrupted chunks not detected
        • Effort: Port verification logic from stargz-snapshotter
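
      For readers unfamiliar with chunk verification, the sketch below shows the basic idea: each fetched chunk is hashed and compared against a digest recorded for it in the image's TOC. The function name and digest string format are assumptions for illustration. This is not the stargz-snapshotter implementation, which would need to be ported as noted above.

```python
import hashlib


def verify_chunk(data: bytes, expected_digest: str) -> bool:
    """Return True if data matches an expected digest of the form 'sha256:<hex>'.

    Illustrative only: a real implementation would take the expected
    digest from the TOC entry that describes the chunk.
    """
    algorithm, _, hex_value = expected_digest.partition(":")
    if algorithm != "sha256" or not hex_value:
        raise ValueError("only 'sha256:<hex>' digests are handled in this sketch")
    return hashlib.sha256(data).hexdigest() == hex_value.lower()


if __name__ == "__main__":
    chunk = b"example chunk bytes"
    digest = "sha256:" + hashlib.sha256(chunk).hexdigest()
    print(verify_chunk(chunk, digest))            # True
    print(verify_chunk(chunk + b"x", digest))     # False
```

      Without a check of this kind, a corrupted or tampered chunk fetched on demand would be served to the container undetected.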

      stargz-snapshotter:

      • Already mature and maintained
      • May need minor updates for new CRI-O versions
      • Low maintenance burden

      Timeline Considerations:

      • Stabilizing Additional Layer Store API is multi-month effort
      • Not required for Tech Preview (can ship on experimental API)
      • Should be completed before GA
      • Chunk verification is nice-to-have, not blocking

      Why Not SOCI?

      SOCI (Seekable OCI) is containerd-specific and not directly compatible with CRI-O:

      • SOCI implements containerd's snapshotter gRPC plugin interface, not the Additional Layer Store API
      • Mount-based architecture vs directory-based structure
      • SOCI maintainers have stated CRI-O/Podman support is out of scope for their project

      For CRI-O, eStargz via stargz-store is the most proven approach (used in OCPNODE-2204).


      User Stories

      • AI/ML Platform Operator: "I want containers with large model images to start immediately without waiting for full image download."
      • OpenShift Administrator: "I want to reduce pod startup time for AI workloads so that autoscaling is more responsive."
      • Application Developer: "I want my containers to start quickly even with large images."
      • Enterprise Architect: "I want a lazy-pulling solution that works with CRI-O."

      Acceptance Criteria

      • Lazy image pulling works with CRI-O
      • Measurable container startup time improvement for large images (>5GB)
      • No regressions for standard image pulling
      • Works with Docker Hub and private registries (Quay compatibility documented)
      • Clear documentation on setup and usage
      • Minimal operational complexity

      Dependencies

      Technical Dependencies

      • container-libs/storage upstream work: Stabilize Additional Layer Store API (currently experimental)
      • OpenShift enhancement: Define API for lazy pull configuration
      • MCO feature: Translate API to CRI-O configuration (configuration only, not plugin deployment)
      • Documentation: Customer installation guides for compatible storage plugins (stargz-store, nydus-store, etc.)
      • Registry support: HTTP range request support required (still to be verified for Quay storage backends)

      Documentation

      • OSDOCS-10167: Documentation (To Do)
      • Implementation guide
      • Customer storage plugin installation guides (critical for BYOS approach)
        • How to install and configure stargz-store
        • How to install and configure nydus-store
        • Prerequisites and compatibility matrix
      • Performance benchmarks

      Success Metrics

      1. Startup Performance: Measurable reduction in container startup time for large images
      2. Adoption: AI/ML teams successfully using lazy pulling
      3. Stability: No production incidents
      4. Compatibility: Works with major registries
      5. Simplicity: Easy to configure and use

      Risks & Mitigation

      Risk | Mitigation
      Additional Layer Store API is experimental | Work with upstream to stabilize API; plan for potential breaking changes
      Quay compatibility unclear | Test with different Quay storage backends; document limitations; consider registry proxy
      No chunk verification on eStargz | Accept security tradeoff for Tech Preview; work on upstream support for GA
      Image conversion overhead | Provide tooling and documentation for eStargz conversion
      Performance may not meet expectations | Validate early with realistic workloads; gather data before setting public targets
      Customer burden with BYOS approach | Provide clear documentation and validated installation guides; consider community container images for common plugins
      Support complexity with third-party plugins | Document supported plugin versions; clear boundaries on what OpenShift supports (API/config) vs customer responsibility (plugin binaries)
      Plugin installation fragility | Test and document installation procedures; provide health checks/validation tools

      Open Questions

      • Can we ship Tech Preview on experimental API?
        • Additional Layer Store API is currently experimental
        • Question for OpenShift: Is experimental upstream API acceptable for Tech Preview?
        • Impact: API could change without major version bump
      • Timeline and resources
        • Is 4.22 timeline feasible for Tech Preview?
        • Resources available for upstream contribution (stabilizing Additional Layer Store API)?

      Next Steps

      1. Decision Phase
        • Research complete - eStargz recommended
        • Architectural decision: "Bring Your Own Store" (BYOS) approach - OpenShift provides API/config only
        • [ ] Confirm Tech Preview can ship on experimental Additional Layer Store API
      2. Design Phase
        • [ ] Write OpenShift enhancement proposal
          • Define OpenShift API for lazy pull configuration
          • Plugin-agnostic design - support multiple storage backends (eStargz, Nydus, etc.)
          • Document Quay limitations and HTTP range request requirement
          • Document BYOS approach and customer responsibilities
        • [ ] Design MCO integration
          • API translation to CRI-O config
          • Configuration management only (no plugin binary deployment)
          • Node-level configuration
      3. Validation Phase (Parallel with design)
        • [ ] Validate stargz-store works with current CRI-O versions (customer-installed)
        • [ ] Test with different registries (Docker Hub, ghcr.io, Quay with various backends)
        • [ ] Performance testing with AI/ML workload images
        • [ ] Document and validate customer installation procedures for common storage plugins
      4. Upstream Work (Parallel, not blocking for Tech Preview)
        • [ ] Contribute to stabilizing Additional Layer Store API
        • [ ] Work on chunk verification support
      5. Implementation Phase
        • [ ] Implement OpenShift API (plugin-agnostic, extensible for multiple storage backends)
        • [ ] Implement MCO integration (configuration only)
        • [ ] Testing and validation
        • [ ] Documentation (OSDOCS-10167) - include customer storage plugin installation guides
      6. Cleanup Phase
        • [ ] Revert/cleanup prior packaging work (OCPNODE-2210, ART-10021)
        • [ ] Notify Ramesh that branch setup work is no longer needed

      Out of Scope

      • Switching to containerd runtime
      • Building images in lazy-pull format (belongs to build tools)
      • Automatic image conversion in registry
      • Bandwidth optimization beyond lazy pulling
      • Support for non-OCI image formats

              gausingh@redhat.com Gaurav Singh
              Qi Wang, Sascha Grunert
              Ryan Phillips
              Aruna Naik
              Matthew Werner
              Derrick Ornelas