  1. OpenShift Hive
  2. HIVE-2565

AWS PrivateLink Cluster Failed Installation with CAPI on 4.16+


    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • Quality / Stability / Reliability
    • Critical

      Description: 

      An AWS PrivateLink cluster cannot be installed successfully on 4.16+. Starting with version 4.16, OCP installation via CAPI creates a different set of security groups: the original groups [infraId-master-sg, infraId-worker-sg] are replaced by [infraId-node, infraId-lb, infraId-apiserver-lb, infraId-controlplane].

      Security groups created via Terraform:

      • mihuang711a-p2x9x-master-sg
      • mihuang711a-p2x9x-worker-sg
      • Security group for Kubernetes ELB xxxxxxxx (openshift-ingress/router-default)

      Security groups created via CAPI:

      • mihuang711c-dxgrm-node
      • mihuang711c-dxgrm-lb
      • mihuang711c-dxgrm-controlplane
      • mihuang711c-dxgrm-apiserver-lb
      • Security group for Kubernetes ELB xxxxxxxx (openshift-ingress/router-default)

      When the resources needed for AWS PrivateLink are generated, either with hiveutil or manually, the security groups are configured by retrieving the worker security group. In the Hive code, this lookup filters by name (Values: []*string{aws.String(infraID + "-worker-sg")}). That works for versions 4.15 and earlier. For 4.16+ clusters installed via CAPI, no security group has that name, so the lookup cannot retrieve any SG.
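
The name mismatch described above can be illustrated with a minimal sketch. This is not the Hive implementation or the actual fix. The helpers `workerSGNameCandidates` and `pickWorkerSG` are hypothetical, the map stands in for the result of an EC2 security group lookup, and treating `infraID-node` as the CAPI counterpart of `infraID-worker-sg` is an assumption based on the group lists above.

```go
package main

import "fmt"

// workerSGNameCandidates returns the security group names to try, in order.
// Hypothetical helper for illustration only; not part of Hive.
func workerSGNameCandidates(infraID string) []string {
	return []string{
		infraID + "-node",      // assumed counterpart on 4.16+ (CAPI)
		infraID + "-worker-sg", // 4.15 and earlier (Terraform)
	}
}

// pickWorkerSG returns the ID of the first candidate name found in existing.
// existing maps SG name to SG ID and stands in for a real EC2 lookup.
func pickWorkerSG(infraID string, existing map[string]string) (string, error) {
	for _, name := range workerSGNameCandidates(infraID) {
		if id, ok := existing[name]; ok {
			return id, nil
		}
	}
	return "", fmt.Errorf("no worker SG found for infra ID %s", infraID)
}

func main() {
	// Names taken from the CAPI list above; the SG IDs are made up.
	existing := map[string]string{
		"mihuang711c-dxgrm-node":         "sg-0123456789abcdef0",
		"mihuang711c-dxgrm-controlplane": "sg-0fedcba9876543210",
	}
	id, err := pickWorkerSG("mihuang711c-dxgrm", existing)
	fmt.Println(id, err)
}
```

A lookup that only tries the `-worker-sg` name would fail against the CAPI group set, which matches the behavior reported here. Trying both names keeps 4.15 and earlier clusters working.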

       

      Version:

      4.16+

      How reproducible:

      Always

      Steps to Reproduce:

      1. Generate the resources needed for AWS PrivateLink, either with `hiveutil` or manually. The security group cannot be configured, which causes the AWS PrivateLink cluster installation to fail.

      Actual results:

      1. The security group cannot be configured.

      ./bin/hiveutil awsprivatelink endpointvpc add $endpointVPC2 --region us-east-2 --subnet-ids $endpointVPC2Subnets -d 
      …
      FATA[0016] Failed to get worker SG of the associated VPC  error="default SG not found for VPC 0xc000b46010"
      
      

      2. Configuring the resources in the original way and then installing the cluster results in the following error:

      time="2024-07-11T06:05:59Z" level=debug msg="E0711 06:05:59.556306      96 controller.go:329] \"Reconciler error\" err=\"expected at least 1 public subnet but got 0\" controller=\"awscluster\" controllerGroup=\"infrastructure.cluster.x-k8s.io\" controllerKind=\"AWSCluster\" AWSCluster=\"openshift-cluster-api-guests/mihuanghive416-89h2v\" namespace=\"openshift-cluster-api-guests\" name=\"mihuanghive416-89h2v\" reconcileID=\"2c0ccbb6-2f19-481d-b5a2-ec362ff31e0e\""

      Expected results:

      1. The security group is configured successfully. (The output below is from a 4.15 cluster, as an example.)

      $  ./bin/hiveutil awsprivatelink endpointvpc add $endpointVPC --region us-east-2 --subnet-ids $endpointVPCSubnets -d
      …
      DEBU[0011] Found worker SG sg-00df18cfb1aa56983 of the associated Hive cluster 
      INFO[0011] Authorizing traffic from the associated VPC's worker SG to the endpoint VPC's default SG 
      INFO[0012] Authorizing traffic from the endpoint VPC's default SG to the associated VPC's worker SG 
      INFO[0012] Adding endpoint VPC vpc-0133317c4cfe4168c to HiveConfig 
      DEBU[0013] Endpoint VPC added to HiveConfig             
      
      
      

       

      2. The AWS PrivateLink cluster installs successfully on version 4.16 and later.

      Note: We did not encounter this issue when installing private clusters on version 4.15 and earlier.
       
      Logs collected from the hive-controller pod and the 4.16 provisioning pod:

      hive-controllers-85f8c8cb67-fvctb.log

      mihuanghive416-0-8bnq5-provision-lbjnz.log

              jstuever@redhat.com Jeremiah Stuever
              mihuang@redhat.com Mingxia Huang
              None
              Jianping Shu
              None
              Mingxia Huang Mingxia Huang
              None