  1. RHEL
  2. RHEL-26307

Modification of acceptance criteria for successful pool-level metadata rewrite

    • Type: Story
    • Resolution: Unresolved
    • Priority: Normal
    • None
    • None
    • stratisd
    • None
    • sst_logical_storage
    • ssg_platform_storage
    • 5
    • False

      Goal

      At present:

      • When writing pool-level metadata, stratisd does not write to all block devices, but to a maximum of ten randomly chosen devices. If there are fewer than ten, it will write to all devices.
      • It cannot write to a metadata region whose reserved space is too small for the current metadata.
      • It considers the write a success if metadata was written to at least one device.
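
The current behavior described above can be sketched roughly as follows. This is an illustration only, not stratisd code: the function names, the `(name, region_size)` tuples, and the choice to filter out undersized regions before sampling are all assumptions, and the success rule is read here as "one completed write is enough".

```python
import random

# Cap on the number of devices written per metadata update, as stated above.
MAX_WRITES = 10


def choose_write_targets(devices, metadata_size, rng=random):
    """Pick the devices to write to (hypothetical helper).

    devices: list of (name, region_size) tuples; region_size is the space
    reserved on that device for pool-level metadata.
    """
    # Assumption: devices whose region is too small are skipped up front.
    candidates = [name for name, region in devices if region >= metadata_size]
    if len(candidates) <= MAX_WRITES:
        # Fewer than the cap: write to every usable device.
        return candidates
    # Otherwise pick MAX_WRITES distinct devices at random.
    return rng.sample(candidates, MAX_WRITES)


def write_succeeded(num_written):
    """Current success rule as read here: one completed write is enough."""
    return num_written >= 1
```

With this rule, a pool with 25 usable devices can report success after updating a single device, which is the gap the RAID requirements below address.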

      For RAID support we need:

      • A more stringent lower bound on the number of devices that must be written to for the metadata write to be considered a success. Because of the nature of RAID-10, a fraction of the devices can be missing while it is still possible to set up the RAID-10 device in a degraded state. The difficulty is that, for example, in a 2-replica RAID device, if half the devices are missing so that exactly one replica of each device remains AND the missing devices hold the newest metadata, THEN the RAID device can be set up using entirely old metadata, which may contain incorrect information about other parts of the pool configuration. Thus, to guarantee that the new metadata is present and read, the number of devices that the metadata must be written to is 1/2 * (# of devices in the array) + 1, assuming the array is not degraded and has two replicas. Generalizing to the # of replicas in the array, either 2 or 3, the required minimum is (# of replicas - 1) / (# of replicas) * (# of devices in array) + 1. However, if the RAID device is sufficiently degraded we cannot satisfy that minimum, and the best that we can do is "all".
      • For a very big RAID-10 array, the maximum number written, currently ten, could be less than the minimum number required. Our calculations should take that into account.
      • To verify that we reserve metadata regions correctly when claiming new Stratis devices, i.e., that we allow sufficient space for the pool-level metadata.
      • Note that, when setting up a pool, stratisd may discover some devices that have failed, so that they cannot be immediately included in the RAID array, but whose Stratis pool-level metadata is still readable and writable. In that case, I believe that it would be appropriate to update the pool-level metadata on those devices. These devices would then be included among the devices written to, and would be counted when determining whether enough devices had been written to.
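
A minimal sketch of the minimum-write calculation described in this list, under stated assumptions. The function and parameter names are hypothetical, the fraction is rounded down before adding one (giving the smallest whole number strictly greater than the fraction), and `writable_devices` is assumed to count every device whose pool-level metadata can be written, including failed devices that are not currently part of the array.

```python
def required_metadata_writes(array_devices, replicas, writable_devices):
    """Minimum number of devices that must receive the new metadata.

    Implements (replicas - 1) / replicas * array_devices + 1 with integer
    arithmetic, capped at "all" writable devices when the array is too
    degraded to reach that threshold.
    """
    if replicas not in (2, 3):
        raise ValueError("replicas must be 2 or 3")
    if array_devices < 0 or writable_devices < 0:
        raise ValueError("device counts must be non-negative")
    # Assumption: round the fraction down, then add one.
    threshold = (replicas - 1) * array_devices // replicas + 1
    # If fewer devices are writable than the threshold, the best we can do is all.
    return min(threshold, writable_devices)


def write_target_count(required, writable_devices, max_written=10):
    """Number of devices to attempt, letting the minimum override the cap of ten."""
    return min(writable_devices, max(required, max_written))
```

For example, a healthy 2-replica array of 30 devices needs 16 successful writes, which exceeds the current cap of ten, so the number of devices attempted has to grow with the required minimum.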

      Acceptance Criteria

      A list of verification conditions, successful functional tests, or expected outcomes in order to declare this story/task successfully completed.

      • Verify correct calculation of the minimum # of devices that the pool-level metadata must be written to.
      • Verify that the space reserved for Stratis pool-level metadata is always adequate on newly claimed devices.

            amulhern@redhat.com the Mulhern
            amulhern@redhat.com the Mulhern
            stratis-team stratis-team
            Filip Suba Filip Suba