OpenShift Bugs: OCPBUGS-36398

Control plane nodes fail to install with OCP 4.16


    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • None
    • 4.16.0
    • Networking / DPU
    • None
    • Critical
    • No
    • NHE Sprint 256
    • 1
    • False

      Description of problem:

      When installing a 4.16 cluster with the podman-based Assisted Installer, the control plane fails to start. The issue does not occur when installing 4.15 or 4.14 nightly builds.

          (ocp-venv) [root@wsfd-advnetlab227 cluster-deployment-automation]# aicli -U 0.0.0.0:8090 list hosts
      +-------------------------+--------------------------------------+----------------+-----------------------+--------------------------------+-------------------+---------------+
      |           Host          |                  Id                  |    Cluster     |        Infraenv       |             Status             |        Role       |       Ip      |
      +-------------------------+--------------------------------------+----------------+-----------------------+--------------------------------+-------------------+---------------+
      | nicmodecluster-master-1 | 22c1c9b1-3e2e-4af3-8407-b013cf183715 | nicmodecluster | nicmodecluster-x86_64 | installing-pending-user-action |       master      | 192.168.122.2 |
      | nicmodecluster-master-2 | b29f2a84-c63e-4198-82da-be29b81ab278 | nicmodecluster | nicmodecluster-x86_64 |     installing-in-progress     | master(bootstrap) | 192.168.122.3 |
      | nicmodecluster-master-3 | d2c800c9-ddef-4a00-9449-d46f429b0f91 | nicmodecluster | nicmodecluster-x86_64 | installing-pending-user-action |       master      | 192.168.122.4 |
      +-------------------------+--------------------------------------+----------------+-----------------------+--------------------------------+-------------------+---------------+
      
      
      aicli -U 0.0.0.0:8090 list events nicmodecluster
      ...
      | 2024-07-01T15:52:04.703Z | Host: nicmodecluster-master-1, reached installation stage Writing image to disk: 100% |
      | 2024-07-01T15:52:05.198Z | Host: nicmodecluster-master-3, reached installation stage Writing image to disk: 100% |
      | 2024-07-01T15:52:05.583Z | Host: nicmodecluster-master-2, reached installation stage Writing image to disk: 96% |
      | 2024-07-01T15:52:06.159Z | Host: nicmodecluster-master-2, reached installation stage Writing image to disk: 100% |
      | 2024-07-01T15:53:41.631Z | Uploaded logs for host nicmodecluster-master-1 cluster 751c739a-65bf-4bbd-ac6b-55499b2982da |
      | 2024-07-01T15:53:41.681Z | Host: nicmodecluster-master-1, reached installation stage Rebooting |
      | 2024-07-01T15:53:42.286Z | Uploaded logs for host nicmodecluster-master-3 cluster 751c739a-65bf-4bbd-ac6b-55499b2982da |
      | 2024-07-01T15:53:42.339Z | Host: nicmodecluster-master-3, reached installation stage Rebooting |
      | 2024-07-01T15:53:48.850Z | Host: nicmodecluster-master-2, reached installation stage Waiting for control plane: Waiting for bootstrap node preparation |
      | 2024-07-01T15:53:48.855Z | Host: nicmodecluster-master-2, reached installation stage Waiting for control plane: Waiting for masters to join bootstrap control plane |
      | 2024-07-01T16:35:03.941Z | Host nicmodecluster-master-3: updated status from installing-in-progress to installing-pending-user-action (Host timed out when pulling the configuration files. Verify in the host console that the host boots from the OpenShift installation disk (vda, /dev/disk/by-path/pci-0000:04:00.0) and has network access to the cluster API. The installation will resume after the host successfully boots and can access the cluster API) |
      | 2024-07-01T16:35:03.943Z | Host nicmodecluster-master-1: updated status from installing-in-progress to installing-pending-user-action (Host timed out when pulling the configuration files. Verify in the host console that the host boots from the OpenShift installation disk (vda, /dev/disk/by-path/pci-0000:04:00.0) and has network access to the cluster API. The installation will resume after the host successfully boots and can access the cluster API) |
      | 2024-07-01T16:35:07.939Z | Updated status of the cluster to installing-pending-user-action |
      +--------------------------+-------------------------------------------------------------------------------------------

      I am unable to ping the master nodes at 192.168.122.2 and 192.168.122.4. It appears that connectivity is lost after the image is written to disk and the hosts reboot.
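To put a number on how long the two masters stayed unreachable, the events listed above can be parsed. The following is a minimal Python sketch and not part of the original report: `parse_events` and `reboot_to_timeout` are hypothetical helper names, the row format is assumed to match the `aicli list events` output shown here, and the two timeout messages in `SAMPLE` are abbreviated.

```python
import re
from datetime import datetime, timedelta

# One data row of the events table: "| <timestamp>Z | <message> |"
EVENT_ROW = re.compile(r"^\|\s*(\S+Z)\s*\|\s*(.*?)\s*\|$")
REBOOT = re.compile(r"Host: (\S+), reached installation stage Rebooting")
TIMEOUT = re.compile(
    r"Host (\S+): updated status from installing-in-progress "
    r"to installing-pending-user-action"
)

# Rows copied from this report; the timeout messages are abbreviated.
SAMPLE = """
| 2024-07-01T15:53:41.681Z | Host: nicmodecluster-master-1, reached installation stage Rebooting |
| 2024-07-01T15:53:42.339Z | Host: nicmodecluster-master-3, reached installation stage Rebooting |
| 2024-07-01T16:35:03.941Z | Host nicmodecluster-master-3: updated status from installing-in-progress to installing-pending-user-action (Host timed out when pulling the configuration files. ...) |
| 2024-07-01T16:35:03.943Z | Host nicmodecluster-master-1: updated status from installing-in-progress to installing-pending-user-action (Host timed out when pulling the configuration files. ...) |
"""


def parse_events(table_text):
    """Yield (timestamp, message) for every data row; borders are skipped."""
    for line in table_text.splitlines():
        match = EVENT_ROW.match(line.strip())
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%fZ")
            yield ts, match.group(2)


def reboot_to_timeout(table_text):
    """Map host name -> time between its Rebooting event and its timeout."""
    rebooted = {}
    waited = {}
    for ts, message in parse_events(table_text):
        reboot = REBOOT.match(message)
        timeout = TIMEOUT.match(message)
        if reboot:
            rebooted[reboot.group(1)] = ts
        elif timeout and timeout.group(1) in rebooted:
            host = timeout.group(1)
            waited[host] = ts - rebooted[host]
    return waited


if __name__ == "__main__":
    for host, delta in sorted(reboot_to_timeout(SAMPLE).items()):
        print(host, delta)
```

With the rows from this report, both masters were moved to installing-pending-user-action roughly 41 minutes after they reached the Rebooting stage.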

      Version-Release number of selected component (if applicable):

          

      How reproducible:

          Every time

      Steps to Reproduce:

          1. Attempt to deploy a 4.16 cluster via Assisted Installer
          

      Actual results:

      Bootstrap node hangs at "installing-in-progress"
      Other masters hang at "installing-pending-user-action"

      Expected results:

          Masters install successfully and the cluster starts without issue

      Additional info:

      configmap.yml

      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: config
      data:
        ASSISTED_SERVICE_HOST: 127.0.0.1:8090
        ASSISTED_SERVICE_SCHEME: http
        AUTH_TYPE: none
        DB_HOST: 127.0.0.1
        DB_NAME: installer
        DB_PASS: admin
        DB_PORT: '5432'
        DB_USER: admin
        DEPLOY_TARGET: onprem
        DEPLOYMENT_TYPE: Podman
        DISK_ENCRYPTION_SUPPORT: 'true'
        DUMMY_IGNITION: 'false'
        ENABLE_SINGLE_NODE_DNSMASQ: 'true'
        HW_VALIDATOR_REQUIREMENTS: '[{"version": "default", "master": {"cpu_cores": 4, "ram_mib":
          16384, "disk_size_gb": 8, "installation_disk_speed_threshold_ms": 10, "network_latency_threshold_ms":
          100, "packet_loss_percentage": 0}, "worker": {"cpu_cores": 2, "ram_mib": 8192,
          "disk_size_gb": 8, "installation_disk_speed_threshold_ms": 10, "network_latency_threshold_ms":
          1000, "packet_loss_percentage": 10}, "sno": {"cpu_cores": 8, "ram_mib": 16384,
          "disk_size_gb": 8, "installation_disk_speed_threshold_ms": 10}, "edge-worker":
          {"cpu_cores": 2, "ram_mib": 8192, "disk_size_gb": 15, "installation_disk_speed_threshold_ms":
          10}}]'
        IMAGE_SERVICE_BASE_URL: http://192.168.122.1:8888
        IPV6_SUPPORT: 'true'
        ISO_IMAGE_TYPE: full-iso
        LISTEN_PORT: '8888'
        NTP_DEFAULT_SERVER: ''
        OS_IMAGES: '[{"openshift_version":"4.9","cpu_architecture":"x86_64","url":"https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.9/4.9.45/rhcos-4.9.45-x86_64-live.x86_64.iso","version":"49.84.202207192205-0"},{"openshift_version":"4.10","cpu_architecture":"x86_64","url":"https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.10/4.10.37/rhcos-4.10.37-x86_64-live.x86_64.iso","version":"410.84.202210040010-0"},{"openshift_version":"4.10","cpu_architecture":"arm64","url":"https://mirror.openshift.com/pub/openshift-v4/aarch64/dependencies/rhcos/4.10/4.10.37/rhcos-4.10.37-aarch64-live.aarch64.iso","version":"410.84.202210040011-0"},{"openshift_version":"4.11","cpu_architecture":"x86_64","url":"https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.11/4.11.48/rhcos-4.11.48-x86_64-live.x86_64.iso","version":"411.86.202308081056-0"},{"openshift_version":"4.11","cpu_architecture":"arm64","url":"https://mirror.openshift.com/pub/openshift-v4/aarch64/dependencies/rhcos/4.11/4.11.48/rhcos-4.11.48-aarch64-live.aarch64.iso","version":"411.86.202308081056-0"},{"openshift_version":"4.11","cpu_architecture":"s390x","url":"https://mirror.openshift.com/pub/openshift-v4/s390x/dependencies/rhcos/4.11/4.11.48/rhcos-4.11.48-s390x-live.s390x.iso","version":"411.86.202308081056-0"},{"openshift_version":"4.11","cpu_architecture":"ppc64le","url":"https://mirror.openshift.com/pub/openshift-v4/ppc64le/dependencies/rhcos/4.11/4.11.48/rhcos-4.11.48-ppc64le-live.ppc64le.iso","version":"411.86.202308081056-0"},{"openshift_version":"4.12","cpu_architecture":"x86_64","url":"https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.12/4.12.30/rhcos-4.12.30-x86_64-live.x86_64.iso","version":"412.86.202308081039-0"},{"openshift_version":"4.12","cpu_architecture":"arm64","url":"https://mirror.openshift.com/pub/openshift-v4/aarch64/dependencies/rhcos/4.12/4.12.30/rhcos-4.12.30-aarch64-live.aarch64.iso","version":"412.86.202308
081039-0"},{"openshift_version":"4.12","cpu_architecture":"s390x","url":"https://mirror.openshift.com/pub/openshift-v4/s390x/dependencies/rhcos/4.12/4.12.30/rhcos-4.12.30-s390x-live.s390x.iso","version":"412.86.202308081039-0"},{"openshift_version":"4.12","cpu_architecture":"ppc64le","url":"https://mirror.openshift.com/pub/openshift-v4/ppc64le/dependencies/rhcos/4.12/4.12.30/rhcos-4.12.30-ppc64le-live.ppc64le.iso","version":"412.86.202308081039-0"},{"openshift_version":"4.13","cpu_architecture":"x86_64","url":"https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.13/4.13.10/rhcos-4.13.10-x86_64-live.x86_64.iso","version":"413.92.202307260246-0"},{"openshift_version":"4.13","cpu_architecture":"arm64","url":"https://mirror.openshift.com/pub/openshift-v4/aarch64/dependencies/rhcos/4.13/4.13.10/rhcos-4.13.10-aarch64-live.aarch64.iso","version":"413.92.202307260246-0"},{"openshift_version":"4.13","cpu_architecture":"ppc64le","url":"https://mirror.openshift.com/pub/openshift-v4/ppc64le/dependencies/rhcos/4.13/4.13.10/rhcos-4.13.10-ppc64le-live.ppc64le.iso","version":"413.92.202307260246-0"},{"openshift_version":"4.13","cpu_architecture":"s390x","url":"https://mirror.openshift.com/pub/openshift-v4/s390x/dependencies/rhcos/4.13/4.13.10/rhcos-4.13.10-s390x-live.s390x.iso","version":"413.92.202307260246-0"},{"openshift_version":"4.14","cpu_architecture":"x86_64","url":"https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.14/4.14.15/rhcos-4.14.15-x86_64-live.x86_64.iso","version":"414.92.202402130420-0"},{"openshift_version":"4.14","cpu_architecture":"arm64","url":"https://mirror.openshift.com/pub/openshift-v4/aarch64/dependencies/rhcos/4.14/4.14.15/rhcos-4.14.15-aarch64-live.aarch64.iso","version":"414.92.202402130420-0"},{"openshift_version":"4.14","cpu_architecture":"ppc64le","url":"https://mirror.openshift.com/pub/openshift-v4/ppc64le/dependencies/rhcos/4.14/4.14.15/rhcos-4.14.15-ppc64le-live.ppc64le.iso","version":"414.92.202402130
420-0"},{"openshift_version":"4.14","cpu_architecture":"s390x","url":"https://mirror.openshift.com/pub/openshift-v4/s390x/dependencies/rhcos/4.14/4.14.15/rhcos-4.14.15-s390x-live.s390x.iso","version":"414.92.202402130420-0"},{"openshift_version":"4.15","cpu_architecture":"x86_64","url":"https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.15/4.15.0/rhcos-4.15.0-x86_64-live.x86_64.iso","version":"415.92.202402130021-0"},{"openshift_version":"4.15","cpu_architecture":"arm64","url":"https://mirror.openshift.com/pub/openshift-v4/aarch64/dependencies/rhcos/4.15/4.15.0/rhcos-4.15.0-aarch64-live.aarch64.iso","version":"415.92.202402130021-0"},{"openshift_version":"4.15","cpu_architecture":"ppc64le","url":"https://mirror.openshift.com/pub/openshift-v4/ppc64le/dependencies/rhcos/4.15/4.15.0/rhcos-4.15.0-ppc64le-live.ppc64le.iso","version":"415.92.202402130021-0"},{"openshift_version":"4.15","cpu_architecture":"s390x","url":"https://mirror.openshift.com/pub/openshift-v4/s390x/dependencies/rhcos/4.15/4.15.0/rhcos-4.15.0-s390x-live.s390x.iso","version":"415.92.202402130021-0"},{"openshift_version":"4.16","cpu_architecture":"x86_64","url":"https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.16/4.16.0/rhcos-4.16.0-x86_64-live.x86_64.iso","version":"416.94.202405291527-0"},{"openshift_version":"4.16","cpu_architecture":"arm64","url":"https://mirror.openshift.com/pub/openshift-v4/aarch64/dependencies/rhcos/4.16/4.16.0/rhcos-4.16.0-aarch64-live.aarch64.iso","version":"416.94.202405291527-0"},{"openshift_version":"4.16","cpu_architecture":"ppc64le","url":"https://mirror.openshift.com/pub/openshift-v4/ppc64le/dependencies/rhcos/4.16/4.16.0/rhcos-4.16.0-ppc64le-live.ppc64le.iso","version":"416.94.202405291527-0"},{"openshift_version":"4.16","cpu_architecture":"s390x","url":"https://mirror.openshift.com/pub/openshift-v4/s390x/dependencies/rhcos/4.16/4.16.0/rhcos-4.16.0-s390x-live.s390x.iso","version":"416.94.202405291527-0"},{"openshift_version
":"4.17","cpu_architecture":"x86_64","url":"https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/pre-release/4.16.0-rc.4/rhcos-4.16.0-rc.4-x86_64-live.x86_64.iso","version":"416.94.202405291527-0"},{"openshift_version":"4.17","cpu_architecture":"arm64","url":"https://mirror.openshift.com/pub/openshift-v4/aarch64/dependencies/rhcos/pre-release/4.16.0-rc.4/rhcos-4.16.0-rc.4-aarch64-live.aarch64.iso","version":"416.94.202405291527-0"},{"openshift_version":"4.17","cpu_architecture":"ppc64le","url":"https://mirror.openshift.com/pub/openshift-v4/ppc64le/dependencies/rhcos/pre-release/4.16.0-rc.4/rhcos-4.16.0-rc.4-ppc64le-live.ppc64le.iso","version":"416.94.202405291527-0"},{"openshift_version":"4.17","cpu_architecture":"s390x","url":"https://mirror.openshift.com/pub/openshift-v4/s390x/dependencies/rhcos/pre-release/4.16.0-rc.4/rhcos-4.16.0-rc.4-s390x-live.s390x.iso","version":"416.94.202405291527-0"}]'
        POSTGRESQL_DATABASE: installer
        POSTGRESQL_PASSWORD: admin
        POSTGRESQL_USER: admin
        PUBLIC_CONTAINER_REGISTRIES: quay.io
        RELEASE_IMAGES: '[{"openshift_version": "4.16-multi", "cpu_architectures": ["x86_64",
          "arm64", "ppc64le", "s390x"], "url": "quay.io/openshift-release-dev/ocp-release-nightly@sha256:7551b8f3cacfe332eefd4baf13b2304fbee4905178ec8d200b459cacadd13c48",
          "version": "4.16.0-nightly", "cpu_architecture": "multi", "support_level": "beta"}]'
        SERVICE_BASE_URL: http://192.168.122.1:8090
        STORAGE: filesystem
        ENABLE_UPGRADE_AGENT: 'true'
        INSTALLER_IMAGE: registry.redhat.io/rhai-tech-preview/assisted-installer-rhel8:v1.0.0-306
        CONTROLLER_IMAGE: registry.redhat.io/rhai-tech-preview/assisted-installer-reporter-rhel8:v1.0.0-383
        AGENT_DOCKER_IMAGE: registry.redhat.io/rhai-tech-preview/assisted-installer-agent-rhel8:v1.0.0-295
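
One sanity check that can be run against this ConfigMap is whether every release in RELEASE_IMAGES has a matching entry in OS_IMAGES for each architecture. The Python sketch below is illustrative only: `missing_os_images` is a hypothetical helper, the matching rule (major.minor version plus architecture, with the "-multi" suffix stripped) is an assumption about how the two lists relate, and the OS_IMAGES sample is trimmed to the 4.16 entries with URLs and versions left out.

```python
import json

# RELEASE_IMAGES as given in the ConfigMap above.
RELEASE_IMAGES = (
    '[{"openshift_version": "4.16-multi", "cpu_architectures": ["x86_64", '
    '"arm64", "ppc64le", "s390x"], "url": "quay.io/openshift-release-dev/'
    'ocp-release-nightly@sha256:7551b8f3cacfe332eefd4baf13b2304fbee4905178ec'
    '8d200b459cacadd13c48", "version": "4.16.0-nightly", '
    '"cpu_architecture": "multi", "support_level": "beta"}]'
)

# Trimmed stand-in for OS_IMAGES: only the 4.16 entries and only the two
# fields this check reads.
OS_IMAGES = json.dumps(
    [
        {"openshift_version": "4.16", "cpu_architecture": arch}
        for arch in ("x86_64", "arm64", "ppc64le", "s390x")
    ]
)


def missing_os_images(release_images_json, os_images_json):
    """Return (version, arch) pairs a release needs but OS_IMAGES lacks."""
    available = {
        (image["openshift_version"], image["cpu_architecture"])
        for image in json.loads(os_images_json)
    }
    missing = []
    for release in json.loads(release_images_json):
        # Assumption: "4.16-multi" should be matched against "4.16".
        base_version = release["openshift_version"].split("-")[0]
        archs = release.get("cpu_architectures") or [release["cpu_architecture"]]
        for arch in archs:
            if (base_version, arch) not in available:
                missing.append((base_version, arch))
    return missing


if __name__ == "__main__":
    print(missing_os_images(RELEASE_IMAGES, OS_IMAGES))
```

For the values in this report the check finds nothing missing, so under the stated assumption the image lists themselves are consistent for 4.16.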
      

      pod-persistent.yml

      apiVersion: v1
      kind: Pod
      metadata:
        labels:
          app: assisted-installer
        name: assisted-installer
      spec:
        containers:
        - args:
          - run-postgresql
          envFrom:
          - configMapRef:
              name: config
          image: quay.io/sclorg/postgresql-12-c8s:latest
          name: db
          volumeMounts:
          - mountPath: /var/lib/pgsql
            name: pg-data
        - envFrom:
          - configMapRef:
              name: config
          image: quay.io/edge-infrastructure/assisted-installer-ui:v2.29.0
          name: ui
          ports:
          - hostPort: 8080
        - envFrom:
          - configMapRef:
              name: config
          image: quay.io/edge-infrastructure/assisted-image-service:v2.29.0
          name: image-service
          ports:
          - hostPort: 8888
        - envFrom:
          - configMapRef:
              name: config
          image: quay.io/edge-infrastructure/assisted-service:v2.29.0
          name: service
          ports:
          - hostPort: 8090
          volumeMounts:
          - mountPath: /data
            name: ai-data
        restartPolicy: Never
        volumes:
        - name: ai-data
          persistentVolumeClaim:
            claimName: ai-service-data
        - name: pg-data
          persistentVolumeClaim:
            claimName: ai-db-data
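
The pod spec and the ConfigMap have to agree on ports, since hosts reach the service through the URLs in the ConfigMap. The Python sketch below cross-checks the two. It is a hedged illustration: `mismatched_ports` is a hypothetical helper, and the pairing of SERVICE_BASE_URL with the `service` container and IMAGE_SERVICE_BASE_URL with the `image-service` container is an assumption based on their names.

```python
from urllib.parse import urlsplit

# Values copied from configmap.yml above.
CONFIG = {
    "SERVICE_BASE_URL": "http://192.168.122.1:8090",
    "IMAGE_SERVICE_BASE_URL": "http://192.168.122.1:8888",
}

# hostPort per container, copied from pod-persistent.yml above.
HOST_PORTS = {"ui": 8080, "image-service": 8888, "service": 8090}

# Assumed pairing of ConfigMap URLs with the containers that serve them.
URL_TO_CONTAINER = {
    "SERVICE_BASE_URL": "service",
    "IMAGE_SERVICE_BASE_URL": "image-service",
}


def mismatched_ports(config, host_ports, url_to_container):
    """Return (key, url_port, container, host_port) for every disagreement."""
    problems = []
    for key, container in url_to_container.items():
        url_port = urlsplit(config[key]).port
        host_port = host_ports.get(container)
        if url_port != host_port:
            problems.append((key, url_port, container, host_port))
    return problems


if __name__ == "__main__":
    print(mismatched_ports(CONFIG, HOST_PORTS, URL_TO_CONTAINER))
```

For the files in this report the ports line up (8090 for the service, 8888 for the image service), so this check alone does not explain the failure.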
       

       

       

              bnemeth@redhat.com Balazs Nemeth
              sdaniele@redhat.com Salvatore Daniele
              Michael Burman