Bug
Resolution: Done-Errata
Normal
None
4.16.z
Description of problem:
AWS VPCs support a primary CIDR range and multiple secondary CIDR ranges: https://aws.amazon.com/about-aws/whats-new/2017/08/amazon-virtual-private-cloud-vpc-now-allows-customers-to-expand-their-existing-vpcs/
Let's pretend a VPC exists with:
- Primary CIDR range: 10.0.0.0/24 (subnet-a)
- Secondary CIDR range: 10.1.0.0/24 (subnet-b)
and a hostedcontrolplane object like:
networking:
  ...
  machineNetwork:
  - cidr: 10.1.0.0/24
  ...
olmCatalogPlacement: management
platform:
  aws:
    cloudProviderConfig:
      subnet:
        id: subnet-b
      vpc: vpc-069a93c6654464f03
Even though all EC2 instances will be spun up in subnet-b (10.1.0.0/24), CPO will detect the CIDR range of the VPC as 10.0.0.0/24 (https://github.com/openshift/hypershift/blob/0d10c822912ed1af924e58ccb8577d2bb1fd68be/control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go#L4755-L4765) and create security group rules only allowing inbound traffic from 10.0.0.0/24. This specifically prevents these EC2 instances from communicating with the VPC Endpoint created by the awsendpointservice CR and reaching the hosted control plane pods.
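A minimal Python sketch of the mismatch, using a dict shaped like the describe-vpcs output in "Additional info" below (the helper names are illustrative, not the actual CPO code):

```python
# Mocked "aws ec2 describe-vpcs" Vpc object for a VPC with one secondary CIDR
# block; field names match the AWS API, values mirror the reproduction above.
vpc = {
    "CidrBlock": "10.0.0.0/24",  # top-level field holds only the primary CIDR
    "CidrBlockAssociationSet": [
        {"CidrBlock": "10.0.0.0/24", "CidrBlockState": {"State": "associated"}},
        {"CidrBlock": "10.1.0.0/24", "CidrBlockState": {"State": "associated"}},
    ],
}

def detected_cidr(vpc):
    """What the linked CPO logic effectively does: take the primary CIDR only."""
    return vpc["CidrBlock"]

def associated_cidrs(vpc):
    """Every CIDR actually usable for subnets, including secondary blocks."""
    return [
        a["CidrBlock"]
        for a in vpc["CidrBlockAssociationSet"]
        if a["CidrBlockState"]["State"] == "associated"
    ]

machine_cidr = "10.1.0.0/24"  # the workers live in the secondary block
assert detected_cidr(vpc) != machine_cidr     # SG rule allows the wrong range
assert machine_cidr in associated_cidrs(vpc)  # the data needed for a fix exists
```

The point: the API response carries every associated CIDR block, but reading only the top-level `CidrBlock` yields a security group rule that excludes the range the workers actually occupy.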
Version-Release number of selected component (if applicable):
Reproduced on a 4.14.20 ROSA HCP cluster, but the version should not matter
How reproducible:
100%
Steps to Reproduce:
1. Create a VPC with at least one secondary CIDR block
2. Install a ROSA HCP cluster providing the secondary CIDR block as the machine CIDR range and selecting the appropriate subnets within the secondary CIDR range
Actual results:
* Observe that the default security group contains inbound security group rules allowing traffic from the VPC's primary CIDR block (not a CIDR range containing the cluster's worker nodes)
* As a result, the EC2 instances (worker nodes) fail to reach the ignition-server
Expected results:
The EC2 instances are able to reach the ignition-server and HCP pods
Additional info:
This bug seems like it could be fixed by using the machine CIDR range for the security group rules instead of the VPC CIDR range. Alternatively, we could duplicate the rules for every secondary CIDR block, but the default AWS quota is 60 inbound rules per security group, so that is another failure condition to keep in mind if we go that route.
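A back-of-envelope sketch of the two options and the quota concern (helper names and the 6-rules-per-CIDR figure are illustrative assumptions, not CPO behavior):

```python
DEFAULT_SG_INBOUND_RULE_QUOTA = 60  # default AWS quota: inbound rules per security group

def rule_cidrs_machine_network(machine_cidrs):
    """Option 1: derive worker SG rules from the machine CIDR range(s) only."""
    return list(machine_cidrs)

def rule_cidrs_per_vpc_block(associated_cidrs):
    """Option 2: duplicate rules for the primary and every secondary CIDR block."""
    return list(associated_cidrs)

def fits_quota(cidrs, rules_per_cidr, quota=DEFAULT_SG_INBOUND_RULE_QUOTA):
    """Option 2 multiplies the rule count by the number of CIDR blocks, so it
    can exceed the default quota on VPCs with many secondary blocks."""
    return len(cidrs) * rules_per_cidr <= quota

# Assuming (hypothetically) 6 inbound rules per CIDR: two blocks fit easily,
# but eleven blocks would need 66 rules and blow past the default quota of 60.
assert fits_quota(["10.0.0.0/24", "10.1.0.0/24"], rules_per_cidr=6)
assert not fits_quota([f"10.{i}.0.0/24" for i in range(11)], rules_per_cidr=6)
```

Option 1 keeps the rule count constant regardless of how many secondary blocks the VPC has, which is why it avoids the quota failure mode entirely.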
aws ec2 describe-vpcs output for a VPC with secondary CIDR blocks:

❯ aws ec2 describe-vpcs --region us-east-2 --vpc-id vpc-069a93c6654464f03
{
    "Vpcs": [
        {
            "CidrBlock": "10.0.0.0/24",
            "DhcpOptionsId": "dopt-0d1f92b25d3efea4f",
            "State": "available",
            "VpcId": "vpc-069a93c6654464f03",
            "OwnerId": "429297027867",
            "InstanceTenancy": "default",
            "CidrBlockAssociationSet": [
                {
                    "AssociationId": "vpc-cidr-assoc-0abbc75ac8154b645",
                    "CidrBlock": "10.0.0.0/24",
                    "CidrBlockState": {
                        "State": "associated"
                    }
                },
                {
                    "AssociationId": "vpc-cidr-assoc-098fbccc85aa24acf",
                    "CidrBlock": "10.1.0.0/24",
                    "CidrBlockState": {
                        "State": "associated"
                    }
                }
            ],
            "IsDefault": false,
            "Tags": [
                {
                    "Key": "Name",
                    "Value": "test"
                }
            ]
        }
    ]
}
- blocks: OCPBUGS-35056 AWS - CPO can use incorrect CIDR range on the default worker security group (Closed)
- is cloned by: OCPBUGS-35056 AWS - CPO can use incorrect CIDR range on the default worker security group (Closed)
- links to: RHEA-2024:3718 OpenShift Container Platform 4.17.z bug fix update