-
Bug
-
Resolution: Done-Errata
-
Normal
-
4.14
This is the 4.14 backport of STOR-1281. Original description below:
Currently, the Topology Feature is enabled by default by the openstack-cinder-csi-driver-operator. As seen in OCPBUGS-4697, this is problematic in environments where there is a mismatch between Nova and Cinder AZs, such as in DCN environments where there may be multiple nova AZs but only a single cinder AZ. On initial read, the [BlockStorage] ignore-volume-az would appear to offer a way out, but as I noted in OCPBUGS-4697 and upstream, this doesn't actually do what you'd think it does.
We should allow the user to configure this functionality via an operator-level configurable. We may wish to go one step further and also attempt to auto-detect the correct value by inspecting the available Nova and Cinder AZs. The latter step would require OpenStack API access from the operator, but both services do provide non-admin APIs to retrieve this information.
We also explored other options:
- Make the AZ reported by the CSI node configurable by the user/operator, like the Manila CSI driver appears to do via a --nodeaz flag. This would require upstream changes and wouldn't actually address the scenario where Nova and Cinder AZs are distinct sets: an AZ that was valid for Cinder would allow us to create the volume in Cinder, but it would still be applied to the resulting PV as a nodeAffinity field unless you also set ignore-volume-az. This would conflict with the Nova-based AZs assigned to Nodes and prevent scheduling of the PV to any Node.
- Change the meaning of [BlockStorage] ignore-volume-az to also avoid sending AZ information to Cinder, causing Cinder to fallback to [DEFAULT] default_availability_zone. This would also require upstream changes and would effectively give users two ways to disable the Topology feature. This would increase confusion and result in a worse UX.
- links to
-
RHBA-2024:3881 OpenShift Container Platform 4.14.z bug fix update