  1. OpenShift Container Platform (OCP) Strategy
  2. OCPSTRAT-2240

Adding topology-awareness to Cinder CSI Driver

    • Product / Portfolio Work
    • 0% To Do, 100% In Progress, 0% Done

      Goal

      The Cinder CSI driver reports the VOLUME_ACCESSIBILITY_CONSTRAINTS plugin capability, meaning it supports Topology-Aware Volume Provisioning, as described in the k8s CSI docs.

      Since OpenStack does not provide a mechanism to map compute nodes to block storage AZs, the Cinder CSI driver treats the compute AZ as a block storage AZ. This assumes that the operator has used the same naming convention across their deployment: if there are three compute AZs, az-0, az-1, and az-2, then there will always be at least three block storage AZs with the same names and the same semantic meaning (e.g. az-N implies a particular rack, room, or data center for both the compute and block storage services). This is a reasonable position and one the Nova project endorses, but it isn't always true. Where a deployment does not follow this convention and has divergent compute and block storage AZs, the Cinder CSI driver can end up requesting volumes in block storage AZs that don't exist.
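
      To make the failure mode concrete, here is a minimal sketch in Go of how divergent AZ sets could be detected. The helper name missingVolumeAZs and the example AZ names are hypothetical and are not taken from the driver or the operator; in a real deployment the two lists would come from the compute and block storage APIs.

```go
package main

import (
	"fmt"
	"sort"
)

// missingVolumeAZs is a hypothetical helper: it returns the compute AZ names
// that have no block storage AZ with the same name. A non-empty result means
// the driver could request a volume in an AZ that Cinder does not know about.
func missingVolumeAZs(computeAZs, volumeAZs []string) []string {
	known := make(map[string]struct{}, len(volumeAZs))
	for _, az := range volumeAZs {
		known[az] = struct{}{}
	}

	var missing []string
	for _, az := range computeAZs {
		if _, ok := known[az]; !ok {
			missing = append(missing, az)
		}
	}
	sort.Strings(missing) // stable output regardless of input order
	return missing
}

func main() {
	// Example values only: az-2 exists for compute but not for block storage.
	compute := []string{"az-0", "az-1", "az-2"}
	volume := []string{"az-0", "az-1", "nova"}

	fmt.Println(missingVolumeAZs(compute, volume)) // prints [az-2]
}
```

      Extra block storage AZs (such as "nova" above) are harmless under this check; only compute AZs without a same-named counterpart are reported.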

      The way we have worked around this to date is to selectively enable or disable the topology feature flag passed to the external provisioner sidecar container, as deployed and managed by the Cinder CSI Driver Operator. This feature flag is being removed in a future release (when?), which means we can't rely on this workaround long-term. We should therefore port the logic for determining whether to enable the topology feature from the Cinder CSI Driver Operator to the Cinder CSI Driver itself. Once this is done, we should remove the logic from the Operator, since it will no longer be needed and will eventually not be supported.
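
      As a rough illustration of what the ported logic might look like inside the driver, the Go sketch below decides whether topology should be enabled and, if not, reports no topology segments for the node. This is an assumption about the shape of the logic, not the operator's actual implementation: the function names shouldEnableTopology and nodeTopology are hypothetical, and the topology key shown is assumed for illustration.

```go
package main

import "fmt"

// topologyKey is the zone key assumed here for illustration only.
const topologyKey = "topology.cinder.csi.openstack.org/zone"

// shouldEnableTopology is a hypothetical decision function: topology is only
// enabled when every compute AZ has a block storage AZ with the same name.
// An empty compute AZ list trivially satisfies the check.
func shouldEnableTopology(computeAZs, volumeAZs []string) bool {
	volumeSet := make(map[string]bool, len(volumeAZs))
	for _, az := range volumeAZs {
		volumeSet[az] = true
	}
	for _, az := range computeAZs {
		if !volumeSet[az] {
			return false
		}
	}
	return true
}

// nodeTopology is a hypothetical helper returning the topology segments a node
// would advertise. When topology is disabled it returns nil, so no AZ
// constraint is attached to provisioning requests.
func nodeTopology(enabled bool, nodeAZ string) map[string]string {
	if !enabled {
		return nil
	}
	return map[string]string{topologyKey: nodeAZ}
}

func main() {
	// Matching AZ sets: topology stays enabled.
	ok := shouldEnableTopology([]string{"az-0", "az-1"}, []string{"az-0", "az-1", "az-2"})
	fmt.Println(ok, nodeTopology(ok, "az-0"))

	// Divergent AZ sets: topology is turned off and no segments are reported.
	ok = shouldEnableTopology([]string{"az-0", "nova"}, []string{"az-0"})
	fmt.Println(ok, nodeTopology(ok, "nova"))
}
```

      Keeping the decision in one function inside the driver would mean the sidecar's feature flag no longer needs to be toggled per deployment.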

      This epic tracks the above work.

      Why is this important?

      If we don't do this, we will lose the ability to disable the topology feature in environments where it is not supported (due to mismatched compute and block storage AZ sets). This will affect a number of customers.

              grosenbe-redhat.com Gil Rosenberg
              Eric Rich