Developer story
As a WMCO developer, I want to deploy AzureCloudNodeManager as a DaemonSet so that it runs more efficiently, is easier to manage, and better aligns with Kubernetes-native practices such as RBAC.
Description
AzureCloudNodeManager currently runs as a Windows service, which limits its scalability, deployment flexibility, and observability. Running it as a DaemonSet instead places it on every Windows node, improves resource utilization, and makes monitoring and maintenance consistent with other components.
Engineering Details
Refactor the AzureCloudNodeManager configuration in WMCO to support containerized deployment as part of the bundle.
Create a DaemonSet configuration for AzureCloudNodeManager.
Configure role-based access control (RBAC) permissions for the application within the Kubernetes cluster.
Implement environment-specific configurations using ConfigMaps and Secrets.
Ensure proper logging and monitoring setup using existing observability tools (e.g., Prometheus, Fluentd).
Test the deployment in a development Kubernetes cluster to ensure proper functionality and node-wide coverage.
Document deployment, configuration, and maintenance processes.
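The DaemonSet and RBAC items above could be sketched roughly as follows. This is a minimal illustration, not the WMCO implementation: it uses only the Go standard library and builds the manifests as plain maps so it runs on its own, whereas real operator code would more likely use the typed Kubernetes API packages. The object name, namespace, and image are hypothetical placeholders. The `os=Windows:NoSchedule` toleration, the container command and arguments, and the RBAC rules are assumptions modeled on the upstream example manifest linked under References and should be checked against it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// obj is shorthand for an untyped JSON/YAML object.
type obj = map[string]any

// Hypothetical placeholders; none of these values come from WMCO.
const (
	dsName      = "azure-cloud-node-manager-windows"
	dsNamespace = "example-namespace"
	dsImage     = "registry.example.invalid/azure-cloud-node-manager:placeholder"
)

// buildDaemonSet returns an apps/v1 DaemonSet that targets Windows nodes only.
func buildDaemonSet(name, namespace, image string) obj {
	labels := obj{"k8s-app": name}
	return obj{
		"apiVersion": "apps/v1",
		"kind":       "DaemonSet",
		"metadata":   obj{"name": name, "namespace": namespace, "labels": labels},
		"spec": obj{
			"selector": obj{"matchLabels": labels},
			"template": obj{
				"metadata": obj{"labels": labels},
				"spec": obj{
					"serviceAccountName": name,
					// Schedule on Windows nodes only.
					"nodeSelector": obj{"kubernetes.io/os": "windows"},
					// Assumption: Windows nodes carry the os=Windows:NoSchedule taint.
					"tolerations": []obj{
						{"key": "os", "value": "Windows", "effect": "NoSchedule"},
					},
					"containers": []obj{{
						"name":  "cloud-node-manager",
						"image": image,
						// Assumption: command and args follow the upstream example.
						"command": []string{"cloud-node-manager.exe"},
						"args":    []string{"--node-name=$(NODE_NAME)"},
						// Expose the node name to the container via the downward API.
						"env": []obj{{
							"name": "NODE_NAME",
							"valueFrom": obj{
								"fieldRef": obj{"fieldPath": "spec.nodeName"},
							},
						}},
					}},
				},
			},
		},
	}
}

// buildClusterRole returns the node permissions the manager is assumed to need.
func buildClusterRole(name string) obj {
	return obj{
		"apiVersion": "rbac.authorization.k8s.io/v1",
		"kind":       "ClusterRole",
		"metadata":   obj{"name": name},
		"rules": []obj{
			{
				"apiGroups": []string{""},
				"resources": []string{"nodes"},
				"verbs":     []string{"get", "list", "watch", "update", "patch"},
			},
			{
				"apiGroups": []string{""},
				"resources": []string{"nodes/status"},
				"verbs":     []string{"patch"},
			},
		},
	}
}

func main() {
	manifests := []obj{
		buildClusterRole(dsName),
		buildDaemonSet(dsName, dsNamespace, dsImage),
	}
	for _, m := range manifests {
		out, err := json.MarshalIndent(m, "", "  ")
		if err != nil {
			panic(err)
		}
		fmt.Println(string(out))
	}
}
```

The program prints both objects as JSON, which can be reviewed by hand or applied to a development cluster. A ServiceAccount and ClusterRoleBinding would also be needed to bind the role to the DaemonSet's pods; they are omitted here for brevity.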
References
- https://github.com/openshift/windows-machine-config-operator/pull/2636
- https://github.com/kubernetes-sigs/cloud-provider-azure
- https://github.com/kubernetes-sigs/cloud-provider-azure/blob/6bc7669f7e57a41b973d5b74b05fdac10b7ba23d/examples/out-of-tree/cloud-node-manager.yaml#L103
Acceptance Criteria
AzureCloudNodeManager runs as a DaemonSet across all Windows nodes in the cluster.
The existing Node Cloud Manager e2e test suite passes.
Existing Windows node functionality remains unchanged and behaves as expected.
Logging and monitoring are integrated, as verified by the e2e test suite.
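The first criterion, coverage of all Windows nodes, could be checked along these lines. This is a self-contained sketch under stated assumptions: the `node` and `pod` types are hypothetical stand-ins for data that a real e2e test would read from the cluster API, and the function only compares the two lists.

```go
package main

import (
	"fmt"
	"sort"
)

// node is a hypothetical stand-in for a cluster node; OS mirrors the
// value of the kubernetes.io/os label.
type node struct {
	Name string
	OS   string
}

// pod is a hypothetical stand-in for a DaemonSet pod.
type pod struct {
	NodeName string
	Ready    bool
}

// uncoveredWindowsNodes returns the sorted names of Windows nodes that
// have no ready pod from the DaemonSet scheduled on them.
func uncoveredWindowsNodes(nodes []node, pods []pod) []string {
	ready := make(map[string]bool)
	for _, p := range pods {
		if p.Ready {
			ready[p.NodeName] = true
		}
	}
	var missing []string
	for _, n := range nodes {
		if n.OS == "windows" && !ready[n.Name] {
			missing = append(missing, n.Name)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	nodes := []node{
		{Name: "win-1", OS: "windows"},
		{Name: "win-2", OS: "windows"},
		{Name: "linux-1", OS: "linux"},
	}
	pods := []pod{
		{NodeName: "win-1", Ready: true},
		{NodeName: "win-2", Ready: false},
	}
	fmt.Println(uncoveredWindowsNodes(nodes, pods)) // prints [win-2]
}
```

An empty result means every Windows node has a ready pod. Linux nodes are ignored, since the DaemonSet is not expected to schedule there.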
Is caused by
- OCPBUGS-36671 [Windows VMs] Machine providerID should be consistent with node providerID (Verified)