Bug
Resolution: Cannot Reproduce
4.13, 4.14
Quality / Stability / Reliability
Important
Rejected
Description of problem:
Workers are stuck in the Provisioned phase during IPI bare-metal installation.
Version-Release number of selected component (if applicable):
Cluster versions 4.14 (4.14.0-0.nightly-2023-04-03-211601) and 4.13 (4.13.0-0.nightly-2023-04-15-102029)
How reproducible:
Always
Steps to Reproduce:
1. Install a bare-metal IPI cluster.

Expected: cluster installation is successful.
Actual: workers get stuck in Provisioned status and CSRs are pending.

Pending CSRs on 4.13:

[miyadav@miyadav 413]$ oc get csr
NAME        AGE     SIGNERNAME                                    REQUESTOR                                                                   REQUESTEDDURATION   CONDITION
csr-6jxmh   22m     kubernetes.io/kubelet-serving                 system:node:localhost.localdomain                                           <none>              Pending
csr-8rmsw   7m10s   kubernetes.io/kubelet-serving                 system:node:localhost.localdomain                                           <none>              Pending
csr-9gq74   40m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>              Approved,Issued
csr-bjl89   22m     kubernetes.io/kubelet-serving                 system:node:localhost.localdomain                                           <none>              Pending
csr-clv7n   56m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>              Approved,Issued
csr-dkrfg   26m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>              Approved,Issued
csr-dmfrn   42m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>              Approved,Issued
csr-dwnql   25m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>              Approved,Issued
csr-hlrl6   57m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>              Approved,Issued
csr-sbvgt   7m11s   kubernetes.io/kubelet-serving                 system:node:localhost.localdomain                                           <none>              Pending
[miyadav@miyadav 413]$ oc get machines
No resources found in openshift-cluster-machine-approver namespace.
[miyadav@miyadav 413]$ oc get machines -n openshift-machine-api
NAME                                  PHASE         TYPE   REGION   ZONE   AGE
miyadav-1704v1-5cqtm-master-0         Running                              129m
miyadav-1704v1-5cqtm-master-1         Running                              129m
miyadav-1704v1-5cqtm-master-2         Running                              129m
miyadav-1704v1-5cqtm-worker-0-b7sq4   Running                              101m
miyadav-1704v1-5cqtm-worker-0-ll2dt   Provisioned                          101m
[miyadav@miyadav 413]$

4.13 cluster machine approver logs:

I0417 04:55:09.622487 1 csr_check.go:211] Could not use Machine for serving cert authorization: DNS name 'localhost.localdomain' not in machine names: openshift-qe-063.arm.eng.rdu2.redhat.com openshift-qe-063.arm.eng.rdu2.redhat.com
I0417 04:55:09.624908 1 controller.go:233] csr-bjl89: CSR not authorized
I0417 05:00:36.616963 1 controller.go:121] Reconciling CSR: csr-6jxmh
I0417 05:00:36.631545 1 csr_check.go:163] csr-6jxmh: CSR does not appear to be client csr
I0417 05:00:36.633987 1 csr_check.go:551] retrieving serving cert from localhost.localdomain (10.1.235.69:10250)
I0417 05:00:36.636198 1 csr_check.go:188] Failed to retrieve current serving cert: remote error: tls: internal error
I0417 05:00:36.636220 1 csr_check.go:208] Falling back to machine-api authorization for localhost.localdomain
E0417 05:00:36.636231 1 csr_check.go:398] csr-6jxmh: DNS name 'localhost.localdomain' not in machine names: openshift-qe-063.arm.eng.rdu2.redhat.com openshift-qe-063.arm.eng.rdu2.redhat.com
I0417 05:00:36.636242 1 csr_check.go:211] Could not use Machine for serving cert authorization: DNS name 'localhost.localdomain' not in machine names: openshift-qe-063.arm.eng.rdu2.redhat.com openshift-qe-063.arm.eng.rdu2.redhat.com
I0417 05:00:36.639079 1 controller.go:233] csr-6jxmh: CSR not authorized
I0417 05:00:37.305958 1 controller.go:121] Reconciling CSR: csr-bjl89
I0417 05:00:37.321368 1 csr_check.go:163] csr-bjl89: CSR does not appear to be client csr
I0417 05:00:37.339352 1 csr_check.go:551] retrieving serving cert from localhost.localdomain (10.1.235.69:10250)
I0417 05:00:37.340990 1 csr_check.go:188] Failed to retrieve current serving cert: remote error: tls: internal error
I0417 05:00:37.341007 1 csr_check.go:208] Falling back to machine-api authorization for localhost.localdomain
E0417 05:00:37.341017 1 csr_check.go:398] csr-bjl89: DNS name 'localhost.localdomain' not in machine names: openshift-qe-063.arm.eng.rdu2.redhat.com openshift-qe-063.arm.eng.rdu2.redhat.com
I0417 05:00:37.341025 1 csr_check.go:211] Could not use Machine for serving cert authorization: DNS name 'localhost.localdomain' not in machine names: openshift-qe-063.arm.eng.rdu2.redhat.com openshift-qe-063.arm.eng.rdu2.redhat.com

Additional info:
4.14 must-gather - https://drive.google.com/file/d/1H5-HhD8CDF7f5DXi02GAebpH8c-bwkOQ/view?usp=sharing
4.13 must-gather - https://drive.google.com/file/d/1Kc2tGtnw0DJCSDvVq-VUhj4U33lnEHMj/view?usp=sharing
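The machine approver log messages suggest why the kubelet-serving CSRs stay Pending: the node requests a certificate for 'localhost.localdomain', while the Machine only lists openshift-qe-063.arm.eng.rdu2.redhat.com. The following is a minimal sketch of that kind of name check, written only to illustrate the log output. It is not the cluster-machine-approver implementation; the function name `unmatched_dns_names` is hypothetical, and the case-insensitive comparison is an assumption.

```python
def unmatched_dns_names(csr_dns_names, machine_names):
    """Return the CSR DNS names that do not appear among the Machine's names.

    Hypothetical helper for illustration; comparing case-insensitively is an
    assumption, not confirmed behavior of the machine approver.
    """
    known = {name.lower() for name in machine_names}
    return [name for name in csr_dns_names if name.lower() not in known]


# Values copied from the log lines in this report.
csr_dns_names = ["localhost.localdomain"]
machine_names = [
    "openshift-qe-063.arm.eng.rdu2.redhat.com",
    "openshift-qe-063.arm.eng.rdu2.redhat.com",
]

unmatched = unmatched_dns_names(csr_dns_names, machine_names)
if unmatched:
    # Corresponds to the "CSR not authorized" outcome seen in the log.
    print("not authorized, unmatched DNS names:", unmatched)
else:
    print("all DNS names match the Machine")
```

Under this reading, the CSRs would only pass the check if the node reported a hostname matching one of the Machine's names rather than 'localhost.localdomain'.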
Actual results:
Install failed; all workers in Provisioned status.
Expected results:
Install successful; all workers Running.
Additional info:
Job info in case needed - https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/196608/console
Profile - versioned-installer-rdu-ipi-bm