ModeShape / MODE-1053

Use a more scalable structure for storing version histories

    Details

    • Type: Feature Request
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 2.3.0.Final
    • Fix Version/s: 2.4.0.Final, 2.2.1.GA
    • Component/s: JCR
    • Labels:
      None

      Description

      Currently, the version history for each 'mix:versionable' node is stored directly under '/jcr:system/jcr:versionStorage', with each version history named by the UUID of its versionable node. This flat structure does not scale well to large numbers of versionable nodes (and thus large numbers of version history nodes).

      The specification only states that the version history nodes should appear in the '/jcr:system/jcr:versionStorage' area, but does not require that they be immediately under the '/jcr:system/jcr:versionStorage' node.

      As an alternative, ModeShape should be able to store the version histories in a structure that scales to larger numbers of versionable nodes. One oft-mentioned approach is to organize the version history nodes in a multi-tier structure that is still based on the UUID: use the first 2 characters for the first level, the second 2 characters for the second level, the third 2 characters for the third level, the fourth 2 characters for the fourth level, and the remaining characters of the UUID as the node name. The version histories of nodes whose UUIDs begin with the same characters will then be stored closer together.
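
      The mapping from a UUID to its storage path can be sketched as follows. This is a minimal illustration, not ModeShape's actual implementation: the class and method names are hypothetical, and it assumes that "the remaining characters" are kept verbatim, including the hyphen that follows the first eight hexadecimal digits in the standard UUID string form.

```java
import java.util.UUID;

/** Hypothetical helper for illustration; not ModeShape's actual implementation. */
final class VersionHistoryPaths {
    static final String VERSION_STORAGE = "/jcr:system/jcr:versionStorage";

    /** Flat layout: the history node sits directly under the version storage node. */
    static String flatPath(UUID uuid) {
        return VERSION_STORAGE + "/" + uuid;
    }

    /** Hierarchical layout: four 2-character levels, then the rest of the UUID. */
    static String hierarchicalPath(UUID uuid) {
        // The string form is 36 characters; the first 8 are hexadecimal digits.
        String s = uuid.toString();
        StringBuilder path = new StringBuilder(VERSION_STORAGE);
        for (int i = 0; i < 8; i += 2) {
            path.append('/').append(s, i, i + 2);
        }
        // Assumption: the remainder (starting with the first hyphen) is used as-is.
        return path.append('/').append(s.substring(8)).toString();
    }
}
```

      Under these assumptions, the UUID 12345678-9abc-def0-1234-56789abcdef0 maps to '/jcr:system/jcr:versionStorage/12/34/56/78/-9abc-def0-1234-56789abcdef0' in the hierarchical layout.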

      The benefits of this approach are:
      1) each of the first four levels uses 2 characters and so can contain at most 256 children (16 * 16), since the string form of the UUID consists of hexadecimal digits.
      2) despite each level having at most 256 children, this structure can hold 256^5 (over 1 trillion) nodes
      3) the first characters of UUIDs generally vary widely from one UUID to the next, meaning the version history nodes will be well distributed.
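
      The fan-out arithmetic behind points 1 and 2 can be checked with a few lines of code. The class below is a hypothetical helper written only for this check; it raises the per-level fan-out of 256 to a given number of levels using exact integer arithmetic.

```java
/** Hypothetical helper for checking the fan-out arithmetic; not part of ModeShape. */
final class VersionStorageCapacity {
    /** Distinct values of a 2-character hexadecimal segment: 16 * 16 = 256. */
    static final long FANOUT = 16L * 16L;

    /** Returns FANOUT raised to the given number of levels, failing on overflow. */
    static long capacity(int levels) {
        long total = 1;
        for (int i = 0; i < levels; i++) {
            total = Math.multiplyExact(total, FANOUT);
        }
        return total;
    }
}
```

      Four 2-character levels give capacity(4) = 4,294,967,296 distinct parent paths, and capacity(5) = 1,099,511,627,776, or roughly 1.1 trillion.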

      ModeShape should be configurable to use either format. Because the flat structure is already in use, it should be possible to switch the configuration of an existing repository to the hierarchical structure without modifying the content. (In effect, an existing repository created using the flat structure might contain version histories organized in both the flat and hierarchical structures.)
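
      One way a repository holding both layouts might resolve a version history is sketched below. This is an assumption about how such a lookup could work, not a description of ModeShape's code: the class, its methods, and the 'useHierarchical' flag are hypothetical, and a plain set of paths stands in for the repository's existing nodes.

```java
import java.util.Optional;
import java.util.Set;
import java.util.UUID;

/** Hypothetical resolver; the Set of paths stands in for the repository content. */
final class VersionHistoryLocator {
    static final String ROOT = "/jcr:system/jcr:versionStorage";

    static String flatPath(UUID uuid) {
        return ROOT + "/" + uuid;
    }

    static String hierarchicalPath(UUID uuid) {
        String s = uuid.toString();
        return ROOT + "/" + s.substring(0, 2) + "/" + s.substring(2, 4) + "/"
                + s.substring(4, 6) + "/" + s.substring(6, 8) + "/" + s.substring(8);
    }

    /** Finds an existing history, checking the hierarchical location before the flat one. */
    static Optional<String> locate(Set<String> existingPaths, UUID uuid) {
        String hierarchical = hierarchicalPath(uuid);
        if (existingPaths.contains(hierarchical)) {
            return Optional.of(hierarchical);
        }
        String flat = flatPath(uuid);
        if (existingPaths.contains(flat)) {
            return Optional.of(flat);
        }
        return Optional.empty();
    }

    /** Chooses where a new history is created; existing histories are never moved. */
    static String pathForNewHistory(UUID uuid, boolean useHierarchical) {
        return useHierarchical ? hierarchicalPath(uuid) : flatPath(uuid);
    }
}
```

      Because only the location of new histories depends on the configuration, histories created under the flat layout stay where they are and remain reachable after the switch.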

      The default structure should be hierarchical, as that is far more scalable.

              People

              • Assignee: rhauch Randall Hauch
              • Reporter: rhauch Randall Hauch
