• Pin and pre-load images
    • OCPSTRAT-763 - [TechPreview] Disconnected Cluster Update and Boot without local image registry - phase 1
    • 0% To Do, 0% In Progress, 100% Done

      Given enhancement - https://github.com/openshift/enhancements/pull/1481

      Design Review Doc: https://docs.google.com/document/d/1-XuHN6_jvJMLULFwwAThfIcHqY32s32lU6m4bx7BiBE/edit

      We want to allow the relevant APIs to pin images and make sure those don't get garbage collected.

      Here is a summary of what will be required:

      1. CRI-O will need to be changed so that it doesn't remove pinned images, regardless of the version_file_persist setting.
      2. Add the new PinnedImageSet custom resource definition to the API.
      3. Initial proposal: #1609
      4. Add a new PinnedImageSetController to the machine-config-controller.
      5. Add the logic to pin and pull the images to the machine-config-daemon.
      6. Update the documentation of recovery procedures to explain that pinned images shouldn't be removed.

            [MCO-838] [TechPreview] Pin and pre-load images

            Oved Ourfali added a comment - rhn-support-mrussell about the cloning - is there a process I should follow?

            Oved Ourfali added a comment - I think it should. Is there an automated process to follow, or should I just clone this ticket?

            Mark Russell added a comment - Should this be cloned into a GA card? 

            Oved Ourfali added a comment - This will make it in 4.16. augol from the QE side, are we clear? Any remaining work?

            Roshni Pattath added a comment - chezhang@redhat.com rhn-support-rioliu Please assign a QA contact for this epic.

            Sam Batschelet added a comment - edited

            A hypothetical example might look like this.

            apiVersion: machineconfiguration.openshift.io/v1
            kind: PinnedImageSet
            metadata:
              name: ensure-workload
            spec:
              machineConfigPoolSelector:
                matchLabels:
                  pools.operator.machineconfiguration.openshift.io/worker: ""
              pinnedImages:
                - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7aa95f32af51fc7892546a1e028808ec1bab1e507cf671b88d8280d2521e61d6
                - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d98ddbe73bda2ffed4d1aeb52be0500b8f8fe870cb465a8bb0cb113f7ed5ade3


            Sam Batschelet added a comment -

            > would this allow a customer who had a critical application, even in a non-disconnected cluster, to leverage the pin/pre-load process to ensure that, in the case of a registry outage, their application could continue to function?

            My understanding is that it should work if the image has been prefetched or already exists on the node and a proper imagePullPolicy is defined. In this case, the customer would define the workload at the node pool level, e.g. worker. Then any time that workload is scheduled to a worker node, it would work regardless of registry status: the image already exists on the node, so it should not require additional work by kubelet/CRI-O.
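
            To illustrate the imagePullPolicy point above, here is a minimal sketch of a workload that could start from a pinned image during a registry outage. The Pod name and container name are hypothetical, and the image is simply the first digest from the PinnedImageSet example earlier in this thread. It assumes the image is already present on the node.

            ```yaml
            # Hypothetical workload; "critical-app" and "app" are illustrative names only.
            apiVersion: v1
            kind: Pod
            metadata:
              name: critical-app
            spec:
              containers:
                - name: app
                  # Digest reference taken from the PinnedImageSet example in this thread.
                  image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7aa95f32af51fc7892546a1e028808ec1bab1e507cf671b88d8280d2521e61d6
                  # IfNotPresent: the kubelet pulls only when the image is missing locally,
                  # so a pinned, pre-loaded image needs no registry access at start-up.
                  imagePullPolicy: IfNotPresent
            ```

            With imagePullPolicy: Always, the kubelet contacts the registry each time it launches the container, so the pod would likely fail to start during an outage even though the image is on the node.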

            Sam Batschelet added a comment - Subtasks have been fleshed out and updated.

            Sam Batschelet added a comment - The MCO PR [1] reflects the current status of this work. At this point, no major blocking issues exist and we are moving into testing and review.

            [1] https://github.com/openshift/machine-config-operator/pull/4094

            Oved Ourfali added a comment - rh-ee-sbatsche any updates on the progress?

            Sam Batschelet added a comment - Syncing with MCO team today to finalize last outstanding issues. Will update status here after that call.

            Sinny Kumari added a comment - Thanks Sohan! For now, I have assigned that story to you. Feel free to reassign to other team members based on who will be working on the implementation.

            Sohan Kunkerkar added a comment - edited

            rhn-engineering-skumari I don't think this feature is prioritized in cri-o for 4.16
            pehunt@redhat.com do you have any insights on this?

            Edit:
            I added a story attached to this epic after discussing it with pehunt@redhat.com 


            Sinny Kumari added a comment - skunkerk Do you know whether the functionality listed in the first bullet, i.e. "CRI-O will need to be changed so that it doesn't remove pinned images, regardless of the version_file_persist setting", is targeted in CRI-O for the 4.16 timeframe?

            Mark Russell added a comment - julim Yes, we are still "in planning" but it is considered a priority issue for 4.16.

            Oved Ourfali added a comment - edited

            CC rhn-engineering-skumari rhn-support-mrussell lmohanty@redhat.com jhernand-rh

              rh-ee-sbatsche Sam Batschelet
              oourfali Oved Ourfali
              Sergio Regidor de la Rosa Sergio Regidor de la Rosa