OCPBUGS-16764

Cannot ZTP an SNO with OCP 4.13 when FIPS is enabled on the hub cluster

      Description of problem:

      During a ZTP install of an SNO from a FIPS-enabled hub OCP cluster, the baremetal operator (BMO) emits certificate errors, which prevents any further progress in installing the spoke cluster. The issue appears to be with ironic-proxy.

      Version-Release number of selected component (if applicable):

      OCP - 4.13.5

      How reproducible:

      Always

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

       

      Expected results:

       

      Additional info:

      A 4.13.5 hub with FIPS enabled cannot provision a FIPS spoke cluster.
      A 4.13.5 hub without FIPS enabled can provision a FIPS spoke cluster.
      Given these two data points, enabling FIPS on the hub appears to cause the certificate issue between BMO and Ironic.
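
When comparing the two hub configurations above, it can help to confirm that FIPS mode is actually active on the host in question. The following is a minimal sketch, not part of the original report: it reads the Linux kernel flag file `/proc/sys/crypto/fips_enabled`, which contains `1` when FIPS mode is on. The function name `fips_enabled` and its `flag_path` parameter are illustrative, and the sketch assumes Python is available wherever it is run.

```python
from pathlib import Path

# Linux kernel flag file; it contains "1" when FIPS mode is enabled.
FIPS_FLAG = Path("/proc/sys/crypto/fips_enabled")


def fips_enabled(flag_path: Path = FIPS_FLAG) -> bool:
    """Return True if the flag file exists and reads '1'.

    `flag_path` is a parameter only so the check can be pointed at a
    different file; the name is illustrative, not from the report.
    """
    try:
        return flag_path.read_text().strip() == "1"
    except OSError:
        # Flag file missing or unreadable: report FIPS as not enabled.
        return False


if __name__ == "__main__":
    print("FIPS enabled:", fips_enabled())
```

Treating a missing flag file as "not enabled" is a design choice made for this sketch; a stricter check could raise an error instead.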
      
      # oc get po -n openshift-machine-api 
      NAME                                                  READY   STATUS    RESTARTS      AGE
      cluster-autoscaler-operator-764b596f-qqgsj            2/2     Running   2 (90m ago)   107m
      cluster-baremetal-operator-6bffc644d4-687p6           2/2     Running   0             107m
      control-plane-machine-set-operator-76cbdb8bc9-bjrqb   1/1     Running   2 (90m ago)   107m
      ironic-proxy-cp5l7                                    1/1     Running   0             98m
      ironic-proxy-dh8w6                                    1/1     Running   0             98m
      ironic-proxy-tshlj                                    1/1     Running   0             72m
      machine-api-controllers-66c67d9b9b-g48c9              7/7     Running   7 (90m ago)   99m
      machine-api-operator-5f44bf6c94-8xn9q                 2/2     Running   1 (90m ago)   107m
      metal3-7bd76ff468-g54gz                               5/5     Running   0             99m
      metal3-image-customization-54d958b9d8-4zt29           1/1     Running   0             98m
      

       

      BMO Logs:

      # oc logs -n openshift-machine-api metal3-7bd76ff468-g54gz -c metal3-baremetal-operator
      ...
      {"level":"info","ts":1690305913.9285643,"logger":"controllers.BareMetalHost","msg":"provisioner is not ready","baremetalhost":"openshift-machine-api/e38-h02-000-r650","RequeueAfter:":30}
      {"level":"info","ts":1690305913.9447827,"logger":"controllers.BareMetalHost","msg":"start","baremetalhost":"openshift-machine-api/e38-h03-000-r650"}
      {"level":"info","ts":1690305913.9572442,"logger":"provisioner.ironic","msg":"error caught while checking endpoint","host":"openshift-machine-api~e38-h03-000-r650","endpoint":"https://localhost:6385/v1/","error":"Get \"https://localhost:6385/v1\": remote error: tls: handshake failure"}
      {"level":"info","ts":1690305913.95726,"logger":"controllers.BareMetalHost","msg":"provisioner is not ready","baremetalhost":"openshift-machine-api/e38-h03-000-r650","RequeueAfter:":30}
      {"level":"info","ts":1690305943.7857008,"logger":"controllers.BareMetalHost","msg":"start","baremetalhost":"openshift-machine-api/e38-h05-000-r650"}
      {"level":"info","ts":1690305943.7985663,"logger":"provisioner.ironic","msg":"error caught while checking endpoint","host":"openshift-machine-api~e38-h05-000-r650","endpoint":"https://localhost:6385/v1/","error":"Get \"https://localhost:6385/v1\": remote error: tls: handshake failure"}
      {"level":"info","ts":1690305943.798603,"logger":"controllers.BareMetalHost","msg":"provisioner is not ready","baremetalhost":"openshift-machine-api/e38-h05-000-r650","RequeueAfter:":30}
      {"level":"info","ts":1690305943.928948,"logger":"controllers.BareMetalHost","msg":"start","baremetalhost":"openshift-machine-api/e38-h02-000-r650"}
      {"level":"info","ts":1690305943.9416509,"logger":"provisioner.ironic","msg":"error caught while checking endpoint","host":"openshift-machine-api~e38-h02-000-r650","endpoint":"https://localhost:6385/v1/","error":"Get \"https://localhost:6385/v1\": remote error: tls: handshake failure"}
      {"level":"info","ts":1690305943.941683,"logger":"controllers.BareMetalHost","msg":"provisioner is not ready","baremetalhost":"openshift-machine-api/e38-h02-000-r650","RequeueAfter:":30}
      {"level":"info","ts":1690305943.9579823,"logger":"controllers.BareMetalHost","msg":"start","baremetalhost":"openshift-machine-api/e38-h03-000-r650"}
      {"level":"info","ts":1690305943.9706023,"logger":"provisioner.ironic","msg":"error caught while checking endpoint","host":"openshift-machine-api~e38-h03-000-r650","endpoint":"https://localhost:6385/v1/","error":"Get \"https://localhost:6385/v1\": remote error: tls: handshake failure"}
      {"level":"info","ts":1690305943.9706235,"logger":"controllers.BareMetalHost","msg":"provisioner is not ready","baremetalhost":"openshift-machine-api/e38-h03-000-r650","RequeueAfter:":30}
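
The BMO log above is one JSON object per line, so the repeated handshake failures can be tallied per host. The following is a rough sketch under that assumption: the function name `handshake_failures_by_host` is illustrative, and the sample lines are copied from the excerpt above.

```python
import json
from collections import Counter

# Sample lines copied from the BMO log excerpt in this report.
SAMPLE = [
    "...",
    r'{"level":"info","ts":1690305913.9285643,"logger":"controllers.BareMetalHost","msg":"provisioner is not ready","baremetalhost":"openshift-machine-api/e38-h02-000-r650","RequeueAfter:":30}',
    r'{"level":"info","ts":1690305913.9572442,"logger":"provisioner.ironic","msg":"error caught while checking endpoint","host":"openshift-machine-api~e38-h03-000-r650","endpoint":"https://localhost:6385/v1/","error":"Get \"https://localhost:6385/v1\": remote error: tls: handshake failure"}',
    r'{"level":"info","ts":1690305943.7985663,"logger":"provisioner.ironic","msg":"error caught while checking endpoint","host":"openshift-machine-api~e38-h05-000-r650","endpoint":"https://localhost:6385/v1/","error":"Get \"https://localhost:6385/v1\": remote error: tls: handshake failure"}',
]


def handshake_failures_by_host(lines):
    """Count log entries whose 'error' field mentions a TLS handshake failure.

    Lines that are not JSON objects (such as the '...' elision) are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if "tls: handshake failure" in entry.get("error", ""):
            counts[entry.get("host", "<unknown>")] += 1
    return counts


if __name__ == "__main__":
    for host, n in handshake_failures_by_host(SAMPLE).items():
        print(host, n)
```

To run it against the full attached log, the same function can be fed an open file object in place of `SAMPLE`.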

      Ironic Proxy Logs:

      # oc logs -n openshift-machine-api ironic-proxy-dh8w6
      ...
      [Tue Jul 25 17:28:43.880937 2023] [ssl:info] [pid 31:tid 81] [client ::1:41916] AH01998: Connection closed to child 78 with abortive shutdown (server fc00-1005--6.kubelet.kube-system.svc.cluster.local:6385)
      [Tue Jul 25 17:28:44.011813 2023] [ssl:info] [pid 32:tid 97] [client ::1:41930] AH01964: Connection to child 146 established (server fc00-1005--6.kubelet.kube-system.svc.cluster.local:6385)
      [Tue Jul 25 17:28:44.011888 2023] [ssl:debug] [pid 32:tid 97] ssl_engine_kernel.c(2399): [client ::1:41930] AH02044: No matching SSL virtual host for servername localhost found (using default/first virtual host)
      [Tue Jul 25 17:28:44.023870 2023] [ssl:info] [pid 32:tid 97] [client ::1:41930] AH02008: SSL library error 1 in handshake (server fc00-1005--6.kubelet.kube-system.svc.cluster.local:6385)
      [Tue Jul 25 17:28:44.023880 2023] [ssl:info] [pid 32:tid 97] SSL Library Error: error:0A08010C:SSL routines::unsupported
      [Tue Jul 25 17:28:44.023884 2023] [ssl:info] [pid 32:tid 97] [client ::1:41930] AH01998: Connection closed to child 146 with abortive shutdown (server fc00-1005--6.kubelet.kube-system.svc.cluster.local:6385)
      [Tue Jul 25 17:28:44.038546 2023] [ssl:info] [pid 114:tid 116] [client ::1:41934] AH01964: Connection to child 192 established (server fc00-1005--6.kubelet.kube-system.svc.cluster.local:6385)
      [Tue Jul 25 17:28:44.038627 2023] [ssl:debug] [pid 114:tid 116] ssl_engine_kernel.c(2399): [client ::1:41934] AH02044: No matching SSL virtual host for servername localhost found (using default/first virtual host)
      [Tue Jul 25 17:28:44.050843 2023] [ssl:info] [pid 114:tid 116] [client ::1:41934] AH02008: SSL library error 1 in handshake (server fc00-1005--6.kubelet.kube-system.svc.cluster.local:6385)
      [Tue Jul 25 17:28:44.050852 2023] [ssl:info] [pid 114:tid 116] SSL Library Error: error:0A08010C:SSL routines::unsupported
      [Tue Jul 25 17:28:44.050856 2023] [ssl:info] [pid 114:tid 116] [client ::1:41934] AH01998: Connection closed to child 192 with abortive shutdown (server fc00-1005--6.kubelet.kube-system.svc.cluster.local:6385)
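
On the ironic-proxy side, the relevant entries are the `SSL Library Error` lines. The following sketch pulls those out of the httpd error log and counts each distinct error string. It assumes the line layout seen in the excerpt above (`[timestamp] [module:level] [pid N:tid M] message`); the regular expressions and the function name `ssl_library_errors` are illustrative.

```python
import re
from collections import Counter

# Matches the line layout seen in the ironic-proxy excerpt above.
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<module>\w+):(?P<level>\w+)\] "
    r"\[pid (?P<pid>\d+):tid (?P<tid>\d+)\] (?P<message>.*)$"
)
SSL_ERROR_RE = re.compile(r"SSL Library Error: (?P<error>error:[0-9A-Fa-f]+:.*)$")

# Sample lines copied from the ironic-proxy log excerpt in this report.
SAMPLE = [
    "[Tue Jul 25 17:28:44.011888 2023] [ssl:debug] [pid 32:tid 97] ssl_engine_kernel.c(2399): [client ::1:41930] AH02044: No matching SSL virtual host for servername localhost found (using default/first virtual host)",
    "[Tue Jul 25 17:28:44.023880 2023] [ssl:info] [pid 32:tid 97] SSL Library Error: error:0A08010C:SSL routines::unsupported",
    "[Tue Jul 25 17:28:44.050852 2023] [ssl:info] [pid 114:tid 116] SSL Library Error: error:0A08010C:SSL routines::unsupported",
]


def ssl_library_errors(lines):
    """Count distinct 'SSL Library Error' strings logged by the ssl module."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or match.group("module") != "ssl":
            continue
        err = SSL_ERROR_RE.search(match.group("message"))
        if err:
            counts[err.group("error")] += 1
    return counts


if __name__ == "__main__":
    for error, n in ssl_library_errors(SAMPLE).items():
        print(n, error)
```

If every handshake in the full log fails with the same error string, that supports a single systematic cause rather than intermittent failures.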

       

       

      Attachments:

        1. ironic-proxy-dh8w6.log (961 kB, Alex Krzos)
        2. metal3-7bd76ff468-g54gz.bmo.log (398 kB, Alex Krzos)
        3. ocpbugs-16764.tar.gz (25.21 MB, Alex Krzos)

      Tomas Sedovic (tsedovic@redhat.com)
      Alex Krzos (akrzos@redhat.com)
      Pedro Jose Amoedo Martinez