OCPBUGS-13809

OVN image pre-puller pod uses `imagePullPolicy: Always` and blocks upgrade when there is no registry

    • Type: Bug
    • Resolution: Done
    • Priority: Normal
    • None
    • 4.12
    • None
    • No
    • False

      This is a clone of issue OCPBUGS-13219. The following is the description of the original issue:

      Description of problem:

      OVN image pre-puller blocks upgrades in environments where the images have already been pulled but the registry server is not available.

      Version-Release number of selected component (if applicable):

      4.12

      How reproducible:

      Always

      Steps to Reproduce:

      1. Create a cluster in a disconnected environment.

      2. Manually pre-pull all the images required for the upgrade. For example, get the list of images needed:

      # oc adm release info quay.io/openshift-release-dev/ocp-release:4.12.10-x86_64 -o json > release-info.json
      

      Then pull them on every node of the cluster:

      # for image in $(jq -r '.references.spec.tags[].from.name' release-info.json); do crictl pull "$image"; done
      

      3. Stop the registry, or otherwise make it unreachable, then trigger the upgrade.
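
Before taking the registry offline in step 3, it can help to confirm that step 2 really left every release image in a node's local storage. The following is a minimal sketch, not part of the original report: `missing_images` is a hypothetical helper name, and it assumes that `crictl inspecti` exits with a non-zero status when the image is not present on the node.

```shell
# Sketch: print every image reference from stdin (one per line) that is
# NOT present in the node's local container storage.
# Assumption: `crictl inspecti IMAGE` fails when IMAGE is absent.
missing_images() {
    while IFS= read -r image; do
        # Ignore blank lines in the input.
        [ -n "$image" ] || continue
        # Redirect stdin so the check cannot consume the image list.
        if ! crictl inspecti "$image" </dev/null >/dev/null 2>&1; then
            printf '%s\n' "$image"
        fi
    done
}
```

Run on a node, `jq -r '.references.spec.tags[].from.name' release-info.json | missing_images` would then print only the images that still need to be pulled; empty output suggests the node is ready.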

      Actual results:

      The upgrade blocks with the following error reported by the cluster version operator:

      # oc get clusterversion; oc get co network
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.12.10   True        True          62m     Working towards 4.12.11: 483 of 830 done (58% complete), waiting on network
      NAME      VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      network   4.12.10   True        True          False      133m    DaemonSet "/openshift-ovn-kubernetes/ovnkube-upgrades-prepuller" is not available (awaiting 1 nodes)
      

      This happens because the `ovnkube-upgrades-prepuller-...` pod uses `imagePullPolicy: Always`. With that policy the kubelet contacts the registry every time the container starts, so the pod fails when no registry is reachable, even if the image has already been pulled:

      # oc get pods -n openshift-ovn-kubernetes ovnkube-upgrades-prepuller-5s2cn
      NAME                               READY   STATUS             RESTARTS   AGE
      ovnkube-upgrades-prepuller-5s2cn   0/1     ImagePullBackOff   0          44m
      
      # oc get events -n openshift-ovn-kubernetes --field-selector involvedObject.kind=Pod,involvedObject.name=ovnkube-upgrades-prepuller-5s2cn,reason=Failed
      LAST SEEN   TYPE      REASON   OBJECT                                 MESSAGE
      43m         Warning   Failed   pod/ovnkube-upgrades-prepuller-5s2cn   Failed to pull image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:52f189797a83cae8769f1a4dc6dfd46d586914575ee99de6566fc23c77282071": rpc error: code = Unknown desc = (Mirrors also failed: [server.home.arpa:8443/openshift/release@sha256:52f189797a83cae8769f1a4dc6dfd46d586914575ee99de6566fc23c77282071: pinging container registry server.home.arpa:8443: Get "https://server.home.arpa:8443/v2/": dial tcp 192.168.100.1:8443: connect: connection refused]): quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:52f189797a83cae8769f1a4dc6dfd46d586914575ee99de6566fc23c77282071: pinging container registry quay.io: Get "https://quay.io/v2/": dial tcp: lookup quay.io on 192.168.100.1:53: server misbehaving
      43m         Warning   Failed   pod/ovnkube-upgrades-prepuller-5s2cn   Error: ErrImagePull
      43m         Warning   Failed   pod/ovnkube-upgrades-prepuller-5s2cn   Error: ImagePullBackOff
      
      # oc get pod -n openshift-ovn-kubernetes ovnkube-upgrades-prepuller-5s2cn -o json | jq -r '.spec.containers[] | .imagePullPolicy + " " + .image'
      Always quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:52f189797a83cae8769f1a4dc6dfd46d586914575ee99de6566fc23c77282071
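
The same check can be applied to a whole namespace to find other pods that would hit this problem. This is a rough sketch rather than a command from the original report: `pods_with_pull_policy_always` is a hypothetical helper name, and it assumes that `oc get` with `custom-columns` joins the policies of multiple containers with commas.

```shell
# Sketch: list the pods in a namespace that have at least one container
# with imagePullPolicy: Always. The namespace is passed as the first argument.
pods_with_pull_policy_always() {
    oc get pods -n "$1" --no-headers \
        -o custom-columns='NAME:.metadata.name,POLICY:.spec.containers[*].imagePullPolicy' |
        awk '{
            # Column 2 holds one policy per container, comma-separated.
            n = split($2, policies, ",")
            for (i = 1; i <= n; i++) {
                if (policies[i] == "Always") { print $1; break }
            }
        }'
}
```

For example, `pods_with_pull_policy_always openshift-ovn-kubernetes` would be expected to include the pre-puller pod shown above.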
      

      Expected results:

      The upgrade should not block.
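
Whether a pre-pulled image is enough depends on the pull policy. The sketch below encodes the documented Kubernetes semantics as a small decision function; `registry_required` is a hypothetical name used only for illustration, and the function models the policy rules rather than calling the kubelet.

```shell
# Sketch: does starting a container require a reachable registry?
# Usage: registry_required POLICY IMAGE_PRESENT   (IMAGE_PRESENT: yes or no)
# Exit status 0 = registry required, 1 = not required, 2 = unknown policy.
registry_required() {
    case "$1" in
        Always)
            # The registry is queried on every start to resolve the image.
            return 0 ;;
        IfNotPresent)
            # Pulled only when the image is not already on the node.
            [ "$2" != yes ] ;;
        Never)
            # Never pulled; the start fails locally if the image is absent.
            return 1 ;;
        *)
            echo "unknown policy: $1" >&2
            return 2 ;;
    esac
}
```

Under this model, `registry_required Always yes` succeeds (the registry is needed even though the image is present), which matches the failure reported here, while `registry_required IfNotPresent yes` does not.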

      Additional info:

      We detected this while trying to perform upgrades in a disconnected environment without the registry server running. See MGMT-13733 for details.

            jhernand-rh Juan Hernández
            openshift-crt-jira-prow OpenShift Prow Bot
            Mike Fiedler Mike Fiedler