Red Hat OpenShift Control Planes

CNTRLPLANE-2254: Impact HCP cluster upgrade to 4.19 from 4.18.26 creates a service of type LoadBalancer with name router, thereby blocking the upgrade

    • Type: Story
    • Resolution: Done
    • Priority: Critical
    • Component: HyperShift
    • Quality / Stability / Reliability

      Impact statement for the OCPBUGS-69866 series:

       Which 4.y.z to 4.y'.z' updates increase vulnerability?

      Customers upgrading from any 4.18 or 4.19 release to 4.19.18 or later, or to any 4.20 release. Use oc adm upgrade to show your current cluster version.

      Which types of clusters?

      Hosted Control Plane (HyperShift) clusters meeting ALL of the following conditions:

      1. Platform: Bare metal and KubeVirt where no IP addresses are available for allocation, e.g., the IPAddressPool is exhausted or has autoAssign: false
      2. HostedCluster configuration: At least one service in the HostedCluster resource spec.services has:
        • servicePublishingStrategy.type: Route
        • AND servicePublishingStrategy.route.hostname is set to a subdomain of the management cluster's .apps domain

      Using the oc command, check your HostedCluster:

      oc get hostedcluster -n <namespace> <name> -o jsonpath='{.spec.services}'
      

      Look for a service entry like:

      - service: <Service Name>
        servicePublishingStrategy: 
          type: Route
          route: 
            hostname: example-service.apps.<management-cluster-domain>
      

      What is the impact? Is it serious enough to warrant removing update recommendations?

      The HCP cluster upgrade stalls immediately after it is triggered, and the following error appears in the HostedCluster conditions:

      router load balancer is not provisioned: Failed to allocate IP for "<cluster_name>-<cluster_name>/router": no available IPs
      
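      One way to see this condition, assuming the same <namespace> and <name> placeholders as the commands above, is to print all HostedCluster conditions with a jsonpath template:

```shell
# Print every HostedCluster condition type and message; the
# "router load balancer is not provisioned" error appears here.
oc get hostedcluster -n <namespace> <name> \
  -o jsonpath='{range .status.conditions[*]}{.type}{": "}{.message}{"\n"}{end}'
```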

      How involved is remediation?

      Workaround options:

      For 4.20.0+ clusters: Provide an IP address that can be assigned to the router service.
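      For management clusters using MetalLB, a minimal sketch of making an IP available via MetalLB's IPAddressPool API (the pool name and the 192.0.2.x range are placeholders):

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: example-pool          # placeholder pool name
  namespace: metallb-system
spec:
  addresses:
    - 192.0.2.10-192.0.2.20   # placeholder range; must contain a free IP
  autoAssign: true            # the failure above can occur when this is false
```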

      For 4.19.z clusters starting with 4.19.18:

      1. (Preferred) Downgrade to 4.19.17 via the following steps:
        1. Update the HostedCluster to 4.19.17, following the hosted cluster update documentation.
        2. Ensure that the majority of the hosted-control-plane 4.19.17 pods finish the rollout.
        3. You will still find a service resource with the name router lingering in the hosted-control-plane namespace.
        4. Manually delete the following resources in the hosted-control-plane namespace:
          1. A service resource with the name router.
          2. The following three route resources:
            1. oauth
            2. konnectivity-server
            3. ignition-server
        5. Verify that the three deleted route resources have been re-created successfully.
        6. Check the HostedCluster resource to confirm that the cluster is upgraded successfully.
      2. If downgrading to 4.19.17 is not possible, you can provide an IP address to be assigned to the router service.
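      The manual deletions in step 4 of the downgrade path can be sketched as a dry run that prints the oc commands for review before running them against the management cluster (the namespace value is hypothetical):

```shell
# Dry run: print the delete commands for the resources named in step 4.
# HCP_NS is a placeholder hosted-control-plane namespace; substitute yours.
HCP_NS="clusters-example"

for res in service/router route/oauth route/konnectivity-server route/ignition-server; do
  echo oc -n "$HCP_NS" delete "$res"
done
```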

      Is this a regression?

      Yes, from 4.19.18. A fix is being developed.

              rh-ee-aabdelre Ahmed Abdalla Abdelrehim
              trking W. Trevor King