OCPBUGS-18772

MCO keeps attempting to pull baremetalRuntimeCfg image again and again

    • Moderate
    • No
    • False
      * Previously, the Machine Config Operator pulled the `baremetalRuntimeCfgImage` container image multiple times: the first time to obtain node details, and subsequent times to verify that the image was available. This caused issues during certificate rotation when the mirror server or Quay was unavailable, because the subsequent image pulls failed. However, if the image is already on the nodes from the first pull, the nodes should start the kubelet regardless. With this update, the `baremetalRuntimeCfgImage` image is pulled only once, which resolves the issue. (https://issues.redhat.com/browse/OCPBUGS-18772[*OCPBUGS-18772*])
    • Bug Fix
    • Done

      MCO installs the resolve-prepender NetworkManager script on the nodes. To find out node details, the script needs to pull baremetalRuntimeCfgImage. However, the image needs to be pulled only the first time; on subsequent attempts the script just verifies that the image is available.

      This is not desirable when the mirror or Quay is unavailable or has a temporary problem: these kinds of issues should not prevent the node from starting kubelet. During certificate rotation testing I noticed that a node with a significant time skew won't start kubelet, because it tries to pull baremetalRuntimeCfgImage before kubelet can start, even though the image is already on the node and doesn't need refreshing.
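
      To illustrate the desired behavior, here is a minimal sketch of "pull only if the image is missing". It is not the actual resolve-prepender script or the MCO fix. The function names, the retry count, and the injectable `runner` parameter are assumptions made for illustration; the only external commands used are `podman image exists` and `podman pull`.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes a command line and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code without raising on failure."""
    return subprocess.run(list(cmd), check=False).returncode


def ensure_image(image: str, runner: Runner = run, retries: int = 3) -> bool:
    """Return True once the image is in local storage, pulling only if it is missing."""
    # `podman image exists` exits 0 when the image is already in local storage,
    # so no registry or mirror access is needed in that case.
    if runner(["podman", "image", "exists", image]) == 0:
        return True
    # The image is missing: fall back to pulling it (hypothetical retry count).
    for _ in range(retries):
        if runner(["podman", "pull", image]) == 0:
            return True
    return False
```

      With a check like this, a node that already has the image never contacts the mirror or Quay, so a registry outage or a certificate problem caused by time skew would not block kubelet startup.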

              vrutkovs@redhat.com Vadim Rutkovsky
              vrutkovs@redhat.com Vadim Rutkovsky
              Sergio Regidor de la Rosa