Bug
Resolution: Done
Critical
Description of problem:
This issue lists the problems we encountered when provisioning a ROSA HCP cluster via CAPI.
1. The AWS account aws-acm-dev11 does not have sufficient permissions to provision ROSA HCP clusters via CAPI. Typical errors we encountered:
"kind": "Error", "id": "400", "href": "/api/clusters_mgmt/v1/errors/400", "code": "CLUSTERS-MGMT-400", "reason": "BillingMarketplaceAccount 999569342541 not linked to organization 1H1PQMDtwzAUsjPxgoWRjhSpNGD at the aws marketplace", "operation_id": "84fe6b7a-180c-4bc1-ac9b-ff54301e6bf0"
2024/02/02 21:35:17 { "kind": "Error", "id": "400", "href": "/api/clusters_mgmt/v1/errors/400", "code": "CLUSTERS-MGMT-400", "reason": "Failed to assume role with ARN 'arn:aws:iam::999569342541:role/melserng-HCP-ROSA-Installer-Role': failed to assume role: operation error STS: AssumeRole, https response error StatusCode: 403, RequestID: 31b71bd1-12be-484d-8227-e0abbf8694d4, api error AccessDenied: User: arn:aws:sts::644306948063:assumed-role/RH-Managed-OpenShift-Installer/OCM is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::999569342541:role/melserng-HCP-ROSA-Installer-Role", "details": [ { "Error_Key": "AWSAssumeRole" } ], "operation_id": "1745ebaa-5c24-4771-ae7a-1f8f907959c4" }
2. We suggest provisioning ROSA HCP clusters under the OCM staging environment:
% ocm login --url https://api.stage.openshift.com --token='<token>'
3. When `clusterctl init --infrastructure aws` is used to turn the kind cluster into the management cluster, the resulting capa-controller-manager pod only supports provisioning standard AWS clusters. ROSA mode is not enabled by default.
4. The role YAML in the ROSA operator manifest contains a typo: rosamachinenepools. As a result, the ROSA controller failed to list rosamachinepools resources.
5. The networkARN is truncated in AWS because of a length limit. The fix is only a temporary workaround; we still have to make sure the networkARN name actually exists in AWS.
For issues 4 and 5, see this PR for reference:
https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/4742/files
6. The initial ROSA manifest template is generated with the command below. It also creates some dependency paths used in image building, so it has to be run before building the new image:
% RELEASE_TAG="e2e" make release-manifests
7. The new envsubst tool (not GNU envsubst) needs to be built and deployed locally.
It extends the environment variable substitution capability.
8. When running rosa create account-roles and operator-roles, the role prefix has to be unique. Otherwise the creation of the account-roles can fail.
9. After the ROSA cluster is deleted, we also need to:
- manually delete all the dependent resources in AWS (VPC, subnets, ...)
- manually release the created Elastic IPs. These are not released automatically along with the VPC deletion.
For more details, please refer to the doc where all of the above issues and the related workarounds are discussed:
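For the manual Elastic IP cleanup in item 9, a minimal sketch using the AWS CLI could look like the following. It assumes that every unassociated address in the given region is a leftover from the deleted cluster, which may not hold in a shared account. The function name, the DRY_RUN variable, and the region argument are illustrative and not part of the original workaround.

```shell
#!/usr/bin/env bash
# Sketch: release Elastic IPs left behind after a ROSA HCP cluster is deleted.
# Assumption: any address without an association is safe to release.
# Runs in dry-run mode unless DRY_RUN=0 is set.

release_unassociated_eips() {
  local region="$1"
  local dry_run="${DRY_RUN:-1}"
  local ids id

  # Collect allocation IDs of addresses that are not attached to anything.
  ids=$(aws ec2 describe-addresses --region "$region" \
    --query 'Addresses[?AssociationId==`null`].AllocationId' \
    --output text) || return 1

  for id in $ids; do
    if [ "$dry_run" = "1" ]; then
      echo "would release $id"
    else
      aws ec2 release-address --region "$region" --allocation-id "$id" \
        && echo "released $id"
    fi
  done
}
```

Calling `release_unassociated_eips <region>` first prints what would be released. Setting `DRY_RUN=0` performs the release once the list has been checked.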
https://docs.google.com/document/d/18jtqQy2CEYzmxJoBr2dcnZBtHY62hL8juhO_XIUsA64/edit
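For item 3, the upstream CAPA documentation describes ROSA support as a feature gate that is switched on through environment variables before running `clusterctl init`. The sketch below assumes that the EXP_ROSA and EXP_MACHINE_POOL variables apply to the CAPA version in use, so check them against that release. The wrapper function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: initialize the management cluster with the ROSA feature gate enabled.
# Assumption: the installed CAPA release reads EXP_ROSA and EXP_MACHINE_POOL.

init_capa_with_rosa() {
  # Feature gates are read by clusterctl when it renders the CAPA manifests.
  export EXP_ROSA=true
  export EXP_MACHINE_POOL=true
  clusterctl init --infrastructure aws
}
```

If the provider was already initialized without these variables, the capa-controller-manager deployment would most likely need to be updated or reinstalled for the gate to take effect.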
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
- ...
Actual results:
Expected results:
Additional info:
- relates to: HOSTEDCP-1271 [Upstream MVP-1] ROSA implementation for Cluster API (Closed)