-
Epic
-
Resolution: Unresolved
-
Undefined
-
None
-
None
-
None
-
HW assisted DM integrity
-
-
rhel-sst-platform-storage
-
False
-
Description
New NVMe drives are emerging that support non-power-of-2 (NPo2) sector sizes, usually 512+8, 4k+8, or 4k+64 bytes (data plus per-sector metadata). We can leverage this capability to improve the performance of dm-integrity substantially (> 2x).
dm-integrity is a device-mapper target that lets us store a small amount of information about each sector, usually a checksum or CRC. If the checksum computed on read matches the stored one, we know the data has not changed since it was written. A mismatch means the data has changed, perhaps due to tampering or bit rot. In effect, we are checking the "integrity" of the data, hence the name.
We can leverage this target by putting it under RAID. If a sector is read and its checksum does not match, an error is returned. This error triggers RAID to re-read the data from a redundant source and attempt a rewrite to the original bad sector. This often corrects the error (especially if it was due to bit rot) and is a form of self-healing. The extra data required to store the checksum for this use case is small - 8 bytes will do.
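The self-healing flow described above can be sketched as a toy model. This is an illustration only, not how dm-integrity or MD RAID is implemented: the `IntegrityDisk` and `Mirror` classes, the in-memory storage, and the use of CRC32 are all assumptions made for the example.

```python
import zlib


class IntegrityError(Exception):
    """Raised when a sector's stored checksum does not match its data."""


class IntegrityDisk:
    """Hypothetical device that keeps a checksum beside each sector."""

    def __init__(self):
        self.sectors = {}  # sector number -> (data, stored checksum)

    def write(self, n, data):
        self.sectors[n] = (data, zlib.crc32(data))

    def read(self, n):
        data, stored = self.sectors[n]
        if zlib.crc32(data) != stored:
            raise IntegrityError(f"checksum mismatch in sector {n}")
        return data

    def corrupt(self, n):
        """Simulate bit rot: flip one bit without updating the checksum."""
        data, stored = self.sectors[n]
        self.sectors[n] = (bytes([data[0] ^ 0x01]) + data[1:], stored)


class Mirror:
    """Hypothetical RAID1-style layer on top of integrity-checked legs."""

    def __init__(self, legs):
        self.legs = legs

    def write(self, n, data):
        for leg in self.legs:
            leg.write(n, data)

    def read(self, n):
        failed = []
        for leg in self.legs:
            try:
                data = leg.read(n)
            except IntegrityError:
                failed.append(leg)  # remember the bad leg, try the next one
                continue
            for bad in failed:
                bad.write(n, data)  # rewrite the bad sector from the good copy
            return data
        raise IntegrityError(f"no good copy of sector {n}")
```

Reading through `Mirror` after `corrupt()` has damaged one leg returns the good data and repairs the damaged leg as a side effect, so a later direct read of that leg succeeds.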
We can also put dm-integrity under dm-crypt to enable authenticated encryption (ensuring authorship, i.e. that the data was written by someone holding the key). This use case requires a stronger, cryptographic tag rather than a simple CRC, and therefore a bit more space per sector - 64 bytes will do.
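To illustrate why the authenticated case needs more per-sector space than a CRC, here is a minimal sketch. It assumes HMAC-SHA-256 as the keyed tag, which is purely an illustrative choice: this issue does not name an algorithm, and the function names and the 64-byte `METADATA_BYTES` budget are taken from the description or invented for the example.

```python
import hashlib
import hmac

METADATA_BYTES = 64  # per-sector metadata budget from the 4k+64 format


def make_tag(key, sector_number, data):
    """Keyed tag over the sector number and data (hypothetical layout)."""
    msg = sector_number.to_bytes(8, "little") + data
    return hmac.new(key, msg, hashlib.sha256).digest()


def verify(key, sector_number, data, tag):
    """Return True only if the tag was produced with this key for this sector."""
    return hmac.compare_digest(make_tag(key, sector_number, data), tag)
```

Unlike a plain CRC, the tag cannot be recomputed by someone who modifies the sector without the key. Binding the sector number into the tag also means a valid sector copied to a different location fails verification.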
Until this new NPo2 hardware became available, dm-integrity stored sector checksums on disk separately from the associated sectors. To update both atomically, a journal (or a bitmap) was required. The result is a lot of write amplification: a single sector write from an application requires writing to the journal, writing the data, and separately writing the checksum. NPo2 HW allows us to do this in one step.
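The write amplification above can be made concrete with a simplified count. This sketch assumes each step listed in the description costs exactly one device write per application sector write. Real journaling batches and coalesces writes, so the ratio is illustrative and not a performance prediction. All names are hypothetical.

```python
# Steps per application sector write, as listed in the description.
JOURNALED_STEPS = ("journal", "data", "checksum")
INLINE_STEPS = ("data+checksum",)  # NPo2: checksum travels with the sector


def device_writes(app_sector_writes, steps):
    """Device writes under the one-write-per-step assumption."""
    return app_sector_writes * len(steps)


def formatted_sector_size(data_bytes, metadata_bytes):
    """On-media sector size for an NPo2 format such as 4k+64."""
    return data_bytes + metadata_bytes


if __name__ == "__main__":
    n = 1000
    print(device_writes(n, JOURNALED_STEPS), device_writes(n, INLINE_STEPS))
    for data, meta in ((512, 8), (4096, 8), (4096, 64)):
        print(f"{data}+{meta} -> {formatted_sector_size(data, meta)} bytes")
```

Under this model, 1000 application writes become 3000 device writes with the journal and 1000 with inline metadata.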
What SSTs and Layered Product teams should review this?
sst_logical_storage