OpenShift Request For Enhancement / RFE-7890

MCO configurable timeout for pull operations during node startup phase

    • Product / Portfolio Work

      1. Proposed title of this feature request

      MCO configurable timeout for pull operations during node startup phase

      2. What is the nature and description of the request?

      During OCP cluster installations and upgrades on connected clusters, the same images are pulled from Quay.io many times. This raises the bandwidth the installation or upgrade needs in order not to fail, and it adds bandwidth and load on Quay.io. The pull operations have no timeout, so a pull that hits a slow network can potentially run forever. This request asks for a configurable timeout, managed through the MCO, on pull operations during the node startup phase.
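
      To illustrate the requested behavior, here is a minimal sketch of a pull wrapped in a timeout with a bounded number of retries. It is not how the MCO is implemented; the helper name `run_with_timeout` and the `timeout_s`, `retries`, and `backoff_s` parameters are hypothetical, and the use of `podman pull` as the pull command is an assumption.

```python
import subprocess
import time


def run_with_timeout(cmd, timeout_s, retries=3, backoff_s=1.0):
    """Run cmd, aborting any attempt that exceeds timeout_s seconds.

    Hypothetical helper for illustration only. Returns True as soon as one
    attempt exits with status 0, and False once all attempts have failed
    or timed out.
    """
    for attempt in range(1, retries + 1):
        try:
            result = subprocess.run(cmd, timeout=timeout_s, check=False)
            if result.returncode == 0:
                return True
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child process before re-raising,
            # so a stalled pull does not linger in the background.
            pass
        if attempt < retries:
            # Simple linear backoff between attempts.
            time.sleep(backoff_s * attempt)
    return False


# Assumed usage (the image reference is a placeholder, not a real digest):
# ok = run_with_timeout(["podman", "pull", "quay.io/example/image:tag"],
#                       timeout_s=600, retries=5)
```

      With a bound like this, a stalled pull would fail after a known interval and could be retried or reported, instead of blocking node startup indefinitely.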

      3. Why does the customer need this? (List the business requirements here)

      We have collected several customer cases in which a single pull operation lasted many days, preventing the node from starting up and becoming Ready. In this scenario customers usually end up replacing the affected node.

      Example:

      Jun 11 02:59:02 ocp.node.example.com bash[1362]: Trying to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2bb3d567a924a49b89891e0c5ae350eb6e6a3cd47e73684c55081da6d8136178...
      Jun 11 02:59:04 ocp.node.example.com bash[1362]: Getting image source signatures
      Jun 11 02:59:04 ocp.node.example.com bash[1362]: Copying blob sha256:1aa8962369339360de8e44dda0ec786b73bfe907a0b8625eba40a7b8f3fbc775
      Jun 11 02:59:04 ocp.node.example.com bash[1362]: Copying blob sha256:284956d03cd7ef0fdb50b8d84e8db5609095a28f9b525b39582cc9aa5e282d9f
      Jun 11 02:59:04 ocp.node.example.com bash[1362]: Copying blob sha256:25c75c34b2e2b68ba9245d9cddeb6b8a0887371ed30744064f85241a75704d87
      Jun 11 02:59:04 ocp.node.example.com bash[1362]: Copying blob sha256:7349f4df4822c25fe7cadc6d419b6badc34dd2029510b0b9a9db5aa755dd2c95
      Jun 15 23:55:26 ocp.node.example.com bash[1362]: Copying config sha256:46e7fd40d05777f62431d64acd1dd237e0daed740429f8d55dc92e6b85ed37b3
      Jun 15 23:55:26 ocp.node.example.com bash[1362]: Writing manifest to image destination
      Jun 15 23:55:30 ocp.node.example.com systemd[1]: Startup finished in 1.144s (kernel) + 1.834s (initrd) + 4d 20h 56min 32.378s (userspace) = 4d 20h 56min 35.357s.
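
      The duration of the pull in the log above can be checked from its first and last pull-related timestamps. The snippet below is a small worked calculation; the helper name `elapsed` is hypothetical, and because the log lines carry no year, an arbitrary placeholder year is assumed (it does not affect a June-to-June difference).

```python
from datetime import datetime


def elapsed(start, end, year=2000):
    """Return end - start for two 'Mon DD HH:MM:SS' log timestamps.

    The year is a placeholder assumption, since the log omits it.
    """
    fmt = "%Y %b %d %H:%M:%S"
    t0 = datetime.strptime(f"{year} {start}", fmt)
    t1 = datetime.strptime(f"{year} {end}", fmt)
    return t1 - t0


# "Trying to pull ..." up to "Copying config ..."
pull_duration = elapsed("Jun 11 02:59:02", "Jun 15 23:55:26")
print(pull_duration)  # 4 days, 20:56:24
```

      This matches the userspace startup time of roughly 4d 20h 56min that systemd reports on the last line of the log.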

      4. List any affected packages or components.

      OpenShift Machine Config Operator, OpenShift Node

              rhn-support-mrussell Mark Russell
              rhn-support-gizzi Giovanni Luca Izzi