-
Bug
-
Resolution: Unresolved
-
Undefined
-
None
-
4.17.z, 4.18
-
None
Description of problem:
In integration, creating a ROSA HostedCluster with a shared VPC results in a VPC endpoint that is not available.
Version-Release number of selected component (if applicable):
4.17.3
How reproducible:
Sometimes (currently every time in integration, but could be due to timing)
Steps to Reproduce:
1. Create a HostedCluster with shared VPC
2. Wait for HostedCluster to come up
Actual results:
VPC endpoint never gets created due to errors like:
{"level":"error","ts":"2024-11-18T20:37:51Z","msg":"Reconciler error","controller":"awsendpointservice","controllerGroup":"hypershift.openshift.io","controllerKind":"AWSEndpointService","AWSEndpointService":{"name":"private-router","namespace":"ocm-int-2f4labdgi2grpumbq5ufdsfv7nv9ro4g-cse2etests-gdb"},"namespace":"ocm-int-2f4labdgi2grpumbq5ufdsfv7nv9ro4g-cse2etests-gdb","name":"private-router","reconcileID":"bc5d8a6c-c9ad-4fc8-8ead-6b6c161db097","error":"failed to create vpc endpoint: UnauthorizedOperation","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:324\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:261\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:222"}
Expected results:
VPC endpoint gets created
Additional info:
Deleting the control plane operator pod gets things working again. The theory is that if the control plane operator pod is delayed in obtaining a web identity token, the AWS client does not assume the role that was passed to it. Currently the client is created only once, at startup; we should create it on every reconcile instead.
- blocks
-
OCPBUGS-45184 Shared VPC: AWS client fails to assume role when token creation is delayed
- ON_QA
- is cloned by
-
OCPBUGS-45184 Shared VPC: AWS client fails to assume role when token creation is delayed
- ON_QA
- relates to
-
OCPSTRAT-1588 Shared-VPC for Hypershift
- In Progress
- links to
-
RHEA-2024:6122 OpenShift Container Platform 4.18.z bug fix update