OCPBUGS-35818

[4.16] dev-scripts fails bootstrapping OCP 4.16 and greater with MIRROR_IMAGES=true AND INSTALLER_PROXY=true


    • No
    • Metal Platform 255, Metal Platform 256
    • 2
    • Rejected
    • False
    • A regression in 4.16.0 caused new baremetal IPI installations to fail when a proxy is used. This was caused by one of the services in the bootstrap VM trying to access IP address 0.0.0.0 through the proxy. Now this service no longer accesses 0.0.0.0.
    • Bug Fix
    • In Progress

      Description of problem:

          Running https://github.com/openshift-metal3/dev-scripts to deploy an OCP 4.16 or 4.17 cluster (the same configuration works with OCP 4.14 and 4.15) with:
       MIRROR_IMAGES=true
       INSTALLER_PROXY=true
      
      the bootstrap process fails with:
      
       level=debug msg=    baremetalhost resource not yet available, will retry
      level=debug msg=    baremetalhost resource not yet available, will retry
      level=info msg=  baremetalhost: ostest-master-0: uninitialized
      level=info msg=  baremetalhost: ostest-master-0: registering
      level=info msg=  baremetalhost: ostest-master-1: uninitialized
      level=info msg=  baremetalhost: ostest-master-1: registering
      level=info msg=  baremetalhost: ostest-master-2: uninitialized
      level=info msg=  baremetalhost: ostest-master-2: registering
      level=info msg=  baremetalhost: ostest-master-1: inspecting
      level=info msg=  baremetalhost: ostest-master-2: inspecting
      level=info msg=  baremetalhost: ostest-master-0: inspecting
      E0514 12:16:51.985417   89709 reflector.go:147] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *unstructured.Unstructured: Get "https://api.ostest.test.metalkube.org:6443/apis/metal3.io/v1alpha1/namespaces/openshift-machine-api/baremetalhosts?allowWatchBookmarks=true&resourceVersion=5466&timeoutSeconds=547&watch=true": Service Unavailable
      W0514 12:16:52.979254   89709 reflector.go:539] k8s.io/client-go/tools/watch/informerwatcher.go:146: failed to list *unstructured.Unstructured: Get "https://api.ostest.test.metalkube.org:6443/apis/metal3.io/v1alpha1/namespaces/openshift-machine-api/baremetalhosts?resourceVersion=5466": Service Unavailable
      E0514 12:16:52.979293   89709 reflector.go:147] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *unstructured.Unstructured: failed to list *unstructured.Unstructured: Get "https://api.ostest.test.metalkube.org:6443/apis/metal3.io/v1alpha1/namespaces/openshift-machine-api/baremetalhosts?resourceVersion=5466": Service Unavailable
      E0514 12:37:01.927140   89709 reflector.go:147] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *unstructured.Unstructured: Get "https://api.ostest.test.metalkube.org:6443/apis/metal3.io/v1alpha1/namespaces/openshift-machine-api/baremetalhosts?allowWatchBookmarks=true&resourceVersion=7800&timeoutSeconds=383&watch=true": Service Unavailable
      W0514 12:37:03.173425   89709 reflector.go:539] k8s.io/client-go/tools/watch/informerwatcher.go:146: failed to list *unstructured.Unstructured: Get "https://api.ostest.test.metalkube.org:6443/apis/metal3.io/v1alpha1/namespaces/openshift-machine-api/baremetalhosts?resourceVersion=7800": Service Unavailable
      E0514 12:37:03.173473   89709 reflector.go:147] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *unstructured.Unstructured: failed to list *unstructured.Unstructured: Get "https://api.ostest.test.metalkube.org:6443/apis/metal3.io/v1alpha1/namespaces/openshift-machine-api/baremetalhosts?resourceVersion=7800": Service Unavailable
      level=debug msg=Fetching Bootstrap SSH Key Pair...
      level=debug msg=Loading Bootstrap SSH Key Pair...
      
      It looks like https://api.ostest.test.metalkube.org:6443 was reachable up to a certain point and then started failing, either because it is not using the proxy or because it is using the proxy when it shouldn't (???)
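To see when the API stopped answering, the reflector lines in the log above can be reduced to one count per minute and error. This is a minimal sketch, not part of dev-scripts or the installer. It assumes the lines use the usual klog header (severity letter, MMDD, time, PID, file:line]), and `summarize_klog` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed klog header: severity letter, MMDD, HH:MM:SS.micros, PID, "file:line]", message
KLOG_RE = re.compile(
    r"^\s*(?P<sev>[IWEF])(?P<mmdd>\d{4}) (?P<time>\d{2}:\d{2}:\d{2})\.\d+\s+\d+ "
    r"(?P<src>[^\]]+)\] (?P<msg>.*)$"
)


def summarize_klog(lines):
    """Count klog lines per (HH:MM, trailing error text)."""
    counts = Counter()
    for line in lines:
        match = KLOG_RE.match(line)
        if match is None:
            continue  # installer "level=..." lines do not carry a klog header
        # The text after the last ": " is treated as the underlying error
        reason = match.group("msg").rsplit(": ", 1)[-1].strip()
        minute = match.group("time")[:5]
        counts[(minute, reason)] += 1
    return counts
```

Passing the lines of the installer log to `summarize_klog` and printing the resulting counter should make bursts of "Service Unavailable" responses easy to line up against the baremetalhost state changes.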
      
      The 3 master nodes are reported as:
      [root@ipi-ci-op-0qigcrln-b54ee-1790684582253694976 home]# oc get baremetalhosts -A
      NAMESPACE               NAME              STATE        CONSUMER                ONLINE   ERROR              AGE
      openshift-machine-api   ostest-master-0   inspecting   ostest-bbhxb-master-0   true     inspection error   24m
      openshift-machine-api   ostest-master-1   inspecting   ostest-bbhxb-master-1   true     inspection error   24m
      openshift-machine-api   ostest-master-2   inspecting   ostest-bbhxb-master-2   true     inspection error   24m
      
      With something like:
      
       status:
        errorCount: 5
        errorMessage: 'Failed to inspect hardware. Reason: unable to start inspection: Validation
          of image href http://0.0.0.0:8084/34427934-f1a6-48d6-9666-66872eec9ba2 failed,
          reason: Got HTTP code 503 instead of 200 in response to HEAD request.'
        errorType: inspection error
      
      on their status
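The URL in that error message points at 0.0.0.0, an address that only means something on the machine that issued it, so a request for it sent through a proxy cannot reach the intended service. The following sketch flags such URLs in BareMetalHost error messages. It assumes the input is the parsed JSON of `oc get baremetalhosts -A -o json` and that each item carries the `errorType` and `errorMessage` status fields shown above. The function names are hypothetical.

```python
import ipaddress
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://[^\s,'\"]+")


def unspecified_urls(text):
    """Return URLs in text whose host is the unspecified address (0.0.0.0 or ::)."""
    hits = []
    for url in URL_RE.findall(text):
        try:
            host = urlsplit(url).hostname
            if host and ipaddress.ip_address(host).is_unspecified:
                hits.append(url)
        except ValueError:
            pass  # host is a DNS name or a malformed literal, not an IP address
    return hits


def report_inspection_errors(bmh_list):
    """Map host name -> flagged URLs, for hosts whose status carries an error."""
    report = {}
    for item in bmh_list.get("items", []):
        status = item.get("status", {})
        if not status.get("errorType"):
            continue
        # Folded YAML/JSON text may contain newlines; collapse all whitespace
        message = " ".join(status.get("errorMessage", "").split())
        report[item["metadata"]["name"]] = unspecified_urls(message)
    return report
```

A host that maps to a non-empty list is one whose inspection failed on a URL that no other machine, proxy included, could resolve to the bootstrap VM.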

      Version-Release number of selected component (if applicable):

          4.16, 4.17

      How reproducible:

          100%

      Steps to Reproduce:

          1. Try to create an OCP 4.16 cluster with dev-scripts with IP_STACK=v4, MIRROR_IMAGES=true and INSTALLER_PROXY=true

      Actual results:

          level=info msg=  baremetalhost: ostest-master-0: inspecting
      E0514 12:16:51.985417   89709 reflector.go:147] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *unstructured.Unstructured: Get "https://api.ostest.test.metalkube.org:6443/apis/metal3.io/v1alpha1/namespaces/openshift-machine-api/baremetalhosts?allowWatchBookmarks=true&resourceVersion=5466&timeoutSeconds=547&watch=true": Service Unavailable

      Expected results:

          Successful deployment

      Additional info:

      I'm using IP_STACK=v4, MIRROR_IMAGES=true and INSTALLER_PROXY=true.
      With the same configuration (MIRROR_IMAGES=true and INSTALLER_PROXY=true), OCP 4.14 and OCP 4.15 work.
      
      With INSTALLER_PROXY=true removed, OCP 4.16 also works.
      
      I'm going to attach the bootstrap gather logs.

            rhn-engineering-hpokorny Honza Pokorny
            stirabos Simone Tiraboschi
            Jad Haj Yahya Jad Haj Yahya