Red Hat OpenStack Services on OpenShift
OSPRH-6046

DHSS=True with CephFS/NFS: per-tenant NFS servers


    • Type: Epic
    • Resolution: Unresolved
    • Priority: Major
    • Component: openstack-manila
    • Epic Name: DHSS=True with CephFS/NFS - per-tenant NFS servers
    • Status: To Do
    • Progress: 100% To Do, 0% In Progress, 0% Done

      OpenStack Manila can provide strong data-path and network-path multi-tenant isolation guarantees with the "driver_handles_share_servers=True" (DHSS=True) driver mode. In this mode, individual OpenStack projects ("tenants") can provision a dedicated NAS service for their shares and can expect to have a dedicated network segment for their NAS traffic.

      Ceph supports the deployment and management of clustered NFS (nfs-ganesha) gateways. This capability can be integrated with Manila to orchestrate NFS services on demand, like other DHSS=True drivers (e.g., Dell EMC Unity, NetApp, the Generic driver). The feature's requirements and use cases are documented in a GitHub issue [3].

      The Ceph NFS service typically involves a cluster of NFS servers along with an ingress service. We expect only a small number of NFS clusters per Ceph cluster, since each is resource intensive. This could change in the future as NFS becomes a more popular choice within Ceph.
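      As a sketch, such a clustered NFS service with an ingress VIP can be deployed on a cephadm-managed cluster with the `ceph nfs cluster create` command; the cluster name, placement, and virtual IP below are illustrative examples, not values from this RFE:

```shell
# Deploy a 2-daemon NFS cluster named "mynfs" on host1/host2, fronted by
# an ingress (haproxy/keepalived) service at a virtual IP.
# All names and addresses here are illustrative.
ceph nfs cluster create mynfs "2 host1 host2" --ingress --virtual-ip 192.0.2.100/24
```

      When this RFE is implemented, the CephFS driver would issue equivalent orchestration calls per share server, rather than an operator running this by hand.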

       

      [1] https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-server-management.html

      [2] https://www.youtube.com/watch?v=eBcmHMKZG9A 

      [3] https://github.com/nfs-ganesha/nfs-ganesha/issues/1087 

       

      What are the use cases this RFE is solving?

      https://github.com/nfs-ganesha/nfs-ganesha/issues/1087 

      Manila users want strong network-path isolation guarantees alongside the data-path isolation that Manila's DHSS=False CephFS drivers currently provide. This RFE concerns only the NFS driver mode; isolating the network path for native CephFS users is out of scope.

       

      High Level view on how the feature works

      Manila users create a share network by mapping their own network (this could be their private tenant network, or a shared-yet-private provider network provided by their administrators). When they use this share network to create a share, Manila creates a Ceph-NFS cluster dedicated to the user by allocating the VIP of the ceph-nfs service from the tenant’s share network.
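      The VIP allocation step can be illustrated with a small, self-contained sketch. In the real driver, the VIP would come from a Neutron port allocated on the user's share network; the function below is hypothetical and stdlib-only, and only shows the idea of picking a free address from the tenant subnet:

```python
import ipaddress

def pick_nfs_vip(subnet_cidr, allocated):
    """Pick the first free host address in the tenant subnet to use as
    the ceph-nfs ingress VIP. Illustrative only -- in Manila, Neutron
    performs this allocation when a port is created on the share network."""
    subnet = ipaddress.ip_network(subnet_cidr)
    taken = {ipaddress.ip_address(a) for a in allocated}
    for host in subnet.hosts():
        if host not in taken:
            return str(host)
    raise RuntimeError("subnet exhausted: no free address for the VIP")

print(pick_nfs_vip("192.0.2.0/24", ["192.0.2.1", "192.0.2.2"]))
# -> 192.0.2.3
```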

       

      When shares are “exported” (i.e., when manila access rules are created), the export configuration is created on the appropriate share server (ceph-nfs service). 
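      Under the hood, each access rule translates to an NFS-Ganesha export on the tenant's ceph-nfs service. A minimal illustrative export block follows; all identifiers are placeholders, and in a cephadm deployment these exports are managed via `ceph nfs export create` rather than edited by hand:

```
EXPORT {
    Export_Id = 100;
    Path = "/volumes/_nogroup/<share-id>";    # CephFS path backing the share
    Pseudo = "/volumes/_nogroup/<share-id>";  # NFS pseudo-root path
    Access_Type = RW;
    Squash = None;
    FSAL {
        Name = CEPH;                          # CephFS FSAL
        Filesystem = "cephfs";
    }
    CLIENT {
        Clients = 192.0.2.10;                 # from the manila access rule
        Access_Type = RW;
    }
}
```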

       

      Is this feature driver dependent or driver related?

      Yes. DHSS=True driver mode is supported by a handful of vendor drivers today; this RFE concerns adding support to this driver mode in the CephFS driver.

       

      Are there any known limitations? (e.g multi attach + encryption)

      There are scale limitations. CPU and memory resources on the Ceph cluster are finite, so only a limited number of ceph-nfs services can co-exist on a Ceph cluster without overwhelming it. Each ceph-nfs service is composed of a number of individual NFS-Ganesha servers, each with a significant memory and CPU footprint. Newer Ceph releases optimize this footprint; however, customers must be warned not to treat it as a limitless resource. We may also want RHCS to ascertain a theoretical limit on the number of ceph-nfs services that can be provisioned, based on available CPU, memory, and disk.

       

      Is a CLI change required, does the openstack cmd support it?

      DHSS=True is well supported in Manila; there is no new API being added, and so no CLI change is expected. 

       

      Does this RFE impact / need to be included into the control plane podification?

      No; the manila-operator already supports DHSS=True configurations. We don’t anticipate the operator needs to do anything differently when the CephFS driver supports this mode.

       

      Does this RFE benefit/impact DCN?

      For DCN, each edge site ceph cluster can have its own ceph nfs cluster. It’s already possible to deploy this manually. However, when this RFE is implemented, DCN users can benefit from deploying multiple NFS clusters via manila.

       

      Does this RFE benefit/impact shift on stack?

      Shift on Stack doesn’t currently support DHSS=True configuration. When support is added, the same benefits can be realized by Shift on Stack customers.

       

      Can this feature be turned on or used in an existing environment?

      Yes; however, this is a new feature that will require driver reconfiguration. When the feature is available, the driver mode has to be explicitly enabled and share types have to be created to take advantage of this feature.
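      As a sketch, enabling the mode would involve a backend section along these lines in manila.conf (the section name and values are illustrative; the cephfs_* option names follow the existing CephFS driver), plus a share type that requests DHSS=True:

```
[cephfsnfs]
share_driver = manila.share.drivers.cephfs.driver.CephFSDriver
driver_handles_share_servers = True
share_backend_name = cephfsnfs
cephfs_protocol_helper_type = NFS
cephfs_conf_path = /etc/ceph/ceph.conf
cephfs_auth_id = manila
```

      A matching share type could then be created with, e.g., `openstack share type create cephfsnfs-type True` (the type name here is an example).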

       

      Does this feature affect another DFG or product?

      No

       

      Does this feature depend on another RFE?

      No

       

      How will the feature affect Upgrades?

      No impact

       

      How will the feature affect performance or scaling?

      Yes. The Ceph-NFS service’s scale and performance depend on the resources allocated to it within the Ceph cluster. We should refer to the perf/scale tests conducted with RHCS to determine what we can document for Manila customers.

       

      What are the test cases for this RFE?

      DHSS=True tests exist within manila-tempest-plugin; we would enable them to test the functionality. The tests can be configured to limit the number of share servers that get created, if necessary to avoid overwhelming our test clusters.
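      As an illustration, manila-tempest-plugin's DHSS=True coverage is toggled through tempest.conf; an assumed minimal fragment (option names should be verified against the plugin) would look like:

```
[share]
multitenancy_enabled = True
create_networks_when_multitenancy_enabled = True
```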

       

      Are there CI implications?

      Currently, Manila’s DHSS=True configuration is not tested in the CI because we don’t have the flexibility to plumb in the NetApp in the lab to enable this sort of testing. So, any tests are done manually. We might have more flexibility to test CephFS in this configuration though, since the ceph storage system would be deployed alongside the OpenStack cluster.

       

      Does it have documentation impact and require early planning with the doc team?

      Yes, we would need to document how administrators can enable this driver mode in their deployment and how users can use it. There is some user documentation in the Persistent Storage Guide with respect to DHSS=True; we might need to audit it and see if we have gaps.

       

      Are there known packaging challenges?

      No impact on packages

       

      Are there any security considerations?

      DHSS=True will enhance security when users consume CephFS in multi-tenant environments, by giving each tenant a dedicated NFS service and a dedicated network segment for its NFS traffic.

       

      How much upstream resistance might there be to this feature?

      A lot of interest has been expressed upstream for this feature.

       

      Will this feature require new or different support skills?

      Yes; GSS will need to know how to troubleshoot issues with ceph-nfs configuration.

       

      Will this be required for knowledge transfer to GSS?

      Yes

       

      Will this feature impact existing partners or certification programs?

      No impact on partners

       

      API Deprecation/Compatibility?

      None

       

      Are any GUI impact/changes required (Horizon)?

      None

      Assignee: Unassigned
      Reporter: Ashley Rodriguez (ashrodri@redhat.com)
      Squad: rhos-dfg-storage-squad-manila