Epic
Resolution: Unresolved
CephFS Async Mirroring
CephFS has supported asynchronous snapshot mirroring [1][2] since RHCS 5 (Ceph Pacific). The feature is provided by the `cephfs-mirror` daemon, and one daemon can be pre-configured per Ceph cluster. The daemon can be configured to replicate between filesystems on the same cluster, or to a peered Ceph cluster. Enabling it would give OpenStack Manila customers disaster recovery protection.
The primary objective of this Epic is to ensure that Manila's CephFS interactions are unaffected when mirroring is enabled. We also need to identify what must be done to recover Manila shares after a mirror has been "promoted".
As to whether mirroring can be enabled/orchestrated with Manila itself:
OpenStack Manila supports share replication [3] with a flexible definition of asynchronous data mirroring in two types: "dr", for disaster recovery, where the destination cannot be mounted but is continuously updated; and "readable", where the destination can be mounted. Manila intentionally does not control the RTO/RPO of the replication, to keep this flexible. There is, however, an expectation of share backends that:
- They support setting up a per-share replica
- They support replicating snapshots
- They support "promoting" a replica which reverses the direction of replication
While the third expectation cannot be met with CephFS replication today, there may be value in providing a mechanism to set up per-share replication and to allow creation of snapshots that are then automatically replicated.
An alternative is to build a CephFS mirroring strategy that is totally transparent to OpenStack Manila; i.e., the base CephFS filesystem can be configured to be mirrored, which encompasses all subvolumes. This can be done today outside of Manila, but Manila consumers have no visibility of it whatsoever. We believe the Manila CephFS driver should at least be able to detect the presence of mirroring and tag the storage pool/filesystem as being replicated, which would allow OpenStack administrators to direct provisioning to it on demand (via a driver-specific, scoped share type extra spec).
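As a rough illustration of the detection idea, the sketch below parses mirror daemon status and tags a pool entry accordingly. It assumes input shaped like the JSON printed by `ceph fs snapshot mirror daemon status` (a list of daemons, each with a `filesystems` list carrying `name` and `peers`); the capability key `cephfs_mirrored`, the function names, and the sample data are all hypothetical and are not part of Manila or the CephFS driver.

```python
import json

# Hypothetical capability key; not an existing Manila or driver option.
MIRRORED_CAPABILITY = "cephfs_mirrored"


def mirrored_filesystems(daemon_status_json: str) -> set:
    """Return names of filesystems that have at least one mirror peer.

    Assumes the input resembles the output of
    `ceph fs snapshot mirror daemon status`.
    """
    mirrored = set()
    for daemon in json.loads(daemon_status_json):
        for fs in daemon.get("filesystems", []):
            if fs.get("peers"):
                mirrored.add(fs["name"])
    return mirrored


def pool_stats(fs_name: str, daemon_status_json: str) -> dict:
    """Build a simplified pool entry tagged with the mirroring state."""
    return {
        "pool_name": fs_name,
        MIRRORED_CAPABILITY: fs_name in mirrored_filesystems(daemon_status_json),
    }


# Made-up sample data for illustration only.
SAMPLE_STATUS = json.dumps([
    {
        "daemon_id": 1,
        "filesystems": [
            {"filesystem_id": 1, "name": "cephfs_a", "directory_count": 1,
             "peers": [{"uuid": "example-uuid", "remote": {"fs_name": "backup_fs"}}]},
            {"filesystem_id": 2, "name": "cephfs_b", "directory_count": 0,
             "peers": []},
        ],
    }
])

if __name__ == "__main__":
    print(pool_stats("cephfs_a", SAMPLE_STATUS))
    print(pool_stats("cephfs_b", SAMPLE_STATUS))
```

An administrator could then match a scoped share type extra spec against such a reported capability; the exact spec name would be a design decision for the driver.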
[2] https://docs.ceph.com/en/latest/dev/cephfs-mirroring/
[3] https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html
What are the use cases this RFE is solving?
Ensure that all Manila operations remain possible (verified with regression testing) when mirroring is enabled. If any operations are affected, we would report and fix the resulting bugs.
High Level view on how the feature works
Is this feature driver dependent or driver related?
CephFS related
Can this feature be turned on or used in an existing environment?
The mirror daemon can be turned on in an existing deployment.
How will the feature affect performance or scaling?
Not on Manila's side; the impact is on Ceph's side. Mirrored subvolumes would be slower, but no Manila feature would mitigate or affect this.
What are the test cases for this RFE?
Test through scenario tests.
Are there CI implications?
This could be tested by enabling mirroring and running the regression suite.
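A CI job might enable mirroring before the regression run with a helper along these lines. This is only a sketch: it builds the `ceph mgr module enable mirroring`, `ceph fs snapshot mirror enable`, and `ceph fs snapshot mirror add` commands and prints them in dry-run mode. Deploying the `cephfs-mirror` daemon and bootstrapping a peer are deliberately left out, and the function names, filesystem name, and directory path are hypothetical.

```python
import subprocess
from typing import List, Sequence


def mirroring_setup_commands(fs_name: str, paths: Sequence[str]) -> List[List[str]]:
    """Return the ceph CLI invocations that enable snapshot mirroring.

    Does not cover cephfs-mirror daemon deployment or peer bootstrap.
    """
    cmds = [
        ["ceph", "mgr", "module", "enable", "mirroring"],
        ["ceph", "fs", "snapshot", "mirror", "enable", fs_name],
    ]
    # Register each directory whose snapshots should be mirrored.
    cmds += [["ceph", "fs", "snapshot", "mirror", "add", fs_name, p] for p in paths]
    return cmds


def run_all(commands: Sequence[Sequence[str]], dry_run: bool = True) -> None:
    """Print the commands, or execute them when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(list(argv), check=True)


if __name__ == "__main__":
    # Hypothetical filesystem name and directory path.
    run_all(mirroring_setup_commands("cephfs", ["/volumes/_nogroup/example-share"]))
```

Keeping command construction separate from execution lets the same helper be unit-tested without a Ceph cluster and reused by the job that then starts the regression suite.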
Does it have documentation impact and require early planning with the doc team?
TBD