-
Bug
-
Resolution: Cannot Reproduce
-
Undefined
-
None
-
4.12
-
No
-
False
-
Description of problem:
The ARO SRE team is hitting an issue during Azure Red Hat OpenShift 4.12.25 installation, which uses the ARO-Installer wrapper (https://github.com/openshift/ARO-Installer/tree/release-4.12). The cluster eventually comes up and works fine. However, the ARO installer then performs additional configuration (for example, installing the ARO Operator in the cluster), and while those steps are running it fails with the error below:
~~~~
Post "/namespaces/openshift-config/secrets": dial tcp api-int-xxxx-xxx:6443: i/o timeout
~~~~
Tracker JIRA for this issue in the ARO project: https://issues.redhat.com/browse/ARO-4306
Version-Release number of selected component (if applicable):
4.12.25
How reproducible:
Sometimes, not guaranteed
Steps to Reproduce:
1. 2. 3.
Actual results:
Post "/namespaces/openshift-config/secrets": dial tcp api-int-xxxx-xxx:6443: i/o timeout
Expected results:
api-int should respond to the request without timing out.
Additional info:
- The service trying to access api-int is not an in-cluster pod. It is our ARO Resource Provider (https://github.com/Azure/ARO-RP), the ARO cluster provider managed by ARO SREs, which talks to api-int via a `Private Link Service` configured in the cluster resource group in Azure.
- This service creates and provisions resources in the Azure resource group and installs the ARO OpenShift cluster in Azure.
- The cluster gets installed properly in the backend, and we can hack our way to a kubeconfig and log in. However, during some Day-2 tasks this service fails to reach api-int because of the timeout mentioned above.
- In the kube-apiserver logs we saw some API readyz check failure events around the bootstrap VM removal stage of the cluster installation.
- These check failures also show up in some successful installations; in the failed ones, however, they persist for a few more minutes.