-
Bug
-
Resolution: Done
-
Major
-
4.18
-
Quality / Stability / Reliability
-
False
-
-
3
-
Important
-
WINC - Sprint 268
-
1
Description of problem:
Windows Server 2019 nodes on Azure have an SSH connectivity issue, which may be the root cause of the test failures. When a Windows MachineSet is scaled up, the new instance is never configured because an SSH connection to it cannot be established.
Version-Release number of selected component (if applicable):
Cloud Provider: Azure
OS: Windows Server 2019
OpenShift Version: 4.18.0-0.nightly-2025-02-19-132725
WMCO Version: 10.18.0
How reproducible:
100%
Steps to Reproduce:
1. Run the following command to scale the Windows MachineSet:
   oc scale machineset windows --replicas=3 -n openshift-machine-api
2. Observe that the new Windows node fails to be configured.
Actual results:
The Windows instance fails to set up due to SSH connectivity issues.
Error log:
"error":"unable to configure instance windows-kfp26: failed to create new nodeconfig: error instantiating Windows instance from VM: unable to setup VM 10.0.128.7 sshConnectivity: error instantiating SSH client: unable to connect to Windows VM 10.0.128.7: timed out waiting for the condition"
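To narrow down whether the timeout above is a network-level problem (port 22 unreachable) rather than an SSH authentication problem, a plain TCP probe can help. The following is a minimal sketch, not WMCO's actual connection logic: the function name `wait_for_tcp` and its parameters are hypothetical, and it assumes it is run from a host or pod that can route to the VM's private address.

```python
import socket
import time


def wait_for_tcp(host: str, port: int = 22, timeout: float = 60.0,
                 interval: float = 5.0, connect_timeout: float = 3.0) -> bool:
    """Poll host:port until a TCP connection succeeds or `timeout` elapses.

    Hypothetical helper for triage only. A True result means the port
    accepts TCP connections; it does not prove that SSH login works.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError on refusal, timeout, or no route
            with socket.create_connection((host, port), timeout=connect_timeout):
                return True
        except OSError:
            pass
        # Give up if another wait would run past the overall deadline
        if time.monotonic() + interval > deadline:
            return False
        time.sleep(interval)
```

For example, `wait_for_tcp("10.0.128.7", port=22, timeout=60.0)` uses the VM address from the error log. If it returns `False`, the failure is likely below the SSH layer (for instance, sshd not listening or traffic being blocked); if it returns `True`, the investigation can move to SSH authentication and key setup.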
Expected results:
The Windows node should be configured successfully after scaling. SSH connectivity should be established without timeout issues.
Additional info:
This issue blocks the scaling of Windows nodes, which could impact test automation and production workloads. It occurs on Azure with both Windows Server 2019 and 2022, but not on Nutanix or AWS-Proxy.
WMCO and must-gather logs:
https://drive.google.com/drive/folders/15bx4p0annHEJqvwqCmVPn9tiwALPtdVP