ModeShape / MODE-1988

Provide configuration option to enable background child optimization algorithm

Details

    Description

      Currently, ModeShape persists its content in the Infinispan cache as a single JSON-like document per node; that document contains the node's properties and all of its child references. As the number of child references grows, materializing, changing, and persisting that document becomes more expensive.

      ModeShape 3 was designed to allow lots of children within a single parent, and most of the infrastructure is in place to make this work. All children are accessed via the ChildReferences interface, and there are several implementations that are picked based purely upon how the child references are stored:

      • When stored in a single document (in the repository cache or in a non-paging connector), the ImmutableChildReferences.Medium implementation is used
      • When stored in multiple "blocks", the ImmutableChildReferences.Segmented implementation is used
      • When stored in a pageable connector, the ImmutableChildReferences.Segmented implementation is used
      • When the parent node is federated (that is, it has both internal and external children), the ImmutableChildReferences.FederatedChildReferences implementation is used
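
      The selection rule in the list above can be sketched as follows. This is a minimal illustration, not ModeShape's actual code: the `Storage` class, the `Kind` enum, and the `select` method are hypothetical names, and checking the federated case first is an assumption the description does not state.

```java
/** Hypothetical sketch of the selection rule; not ModeShape's actual code. */
public class ChildReferencesSelection {

    /** Stand-ins for the three implementations named in the description. */
    enum Kind { MEDIUM, SEGMENTED, FEDERATED }

    /** Hypothetical summary of how one parent's child references are stored. */
    static final class Storage {
        final boolean federated;         // parent has internal and external children
        final boolean pageableConnector; // children come from a pageable connector
        final int blockCount;            // number of blocks holding the references

        Storage(boolean federated, boolean pageableConnector, int blockCount) {
            this.federated = federated;
            this.pageableConnector = pageableConnector;
            this.blockCount = blockCount;
        }
    }

    /** Picks an implementation kind purely from how the references are stored. */
    static Kind select(Storage storage) {
        if (storage.federated) {
            return Kind.FEDERATED; // assumption: federation is checked first
        }
        if (storage.pageableConnector || storage.blockCount > 1) {
            return Kind.SEGMENTED; // multiple blocks or a pageable connector
        }
        return Kind.MEDIUM;        // everything lives in a single document
    }
}
```

      For example, a non-federated parent stored in one document maps to `MEDIUM`, while the same parent split across three blocks maps to `SEGMENTED`.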

      At this time, ModeShape always stores internal nodes as documents with all child references (thus, never in blocks). However, we do have an optimization algorithm that was designed to operate in a background thread, looking for node documents with "too many" child references, and to break up that single document into multiple blocks.
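
      To illustrate what breaking a single document into blocks means, here is a minimal sketch that partitions a list of child keys into blocks of at most a target size. It does not reproduce the existing optimization method; the `ChildBlockSplitter` class, the `split` method, and the use of plain strings as child keys are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: partitions child keys into fixed-size blocks. */
public class ChildBlockSplitter {

    /**
     * Splits the child keys into consecutive blocks that each hold at most
     * targetBlockSize entries, preserving the original child order.
     */
    static List<List<String>> split(List<String> childKeys, int targetBlockSize) {
        if (targetBlockSize < 1) {
            throw new IllegalArgumentException("targetBlockSize must be at least 1");
        }
        List<List<String>> blocks = new ArrayList<>();
        for (int start = 0; start < childKeys.size(); start += targetBlockSize) {
            int end = Math.min(start + targetBlockSize, childKeys.size());
            // Copy the slice so each block is independent of the source list.
            blocks.add(new ArrayList<>(childKeys.subList(start, end)));
        }
        return blocks;
    }
}
```

      With five children and a target size of two, this yields three blocks holding two, two, and one child keys, in the original order.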

      This request is to enable this background optimization logic:

      1. Add an optional field to the repository configuration to control this optimization, perhaps by exposing the target size and how often the algorithm should run (both should have good defaults). This should be DISABLED by default until the algorithm is well proven.
      2. Add a background (scheduled) thread per the configuration that walks through all persisted documents (perhaps via Map/Reduce or distributed execution) and calls the optimization method.
      3. Possibly change the DocumentStore.optimizeChildrenBlocks(...) method to work better with how step 2 walks through the persisted documents.
      4. Add more tests for the DocumentStore.optimizeChildrenBlocks(...) method.
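
      Steps 1 and 2 could look roughly like the sketch below, which uses the standard `ScheduledExecutorService`. Everything ModeShape-specific here is an assumption: the `Config` class, its field names, and its default values are hypothetical, and the `Runnable` merely stands in for whatever walks the persisted documents and calls the optimization method.

```java
import java.util.Optional;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a configurable, disabled-by-default background task. */
public class ChildOptimizationScheduler {

    /** Hypothetical configuration; names and default values are illustrative only. */
    static final class Config {
        final boolean enabled;
        final int targetBlockSize;
        final long intervalMillis;

        Config(boolean enabled, int targetBlockSize, long intervalMillis) {
            if (targetBlockSize < 1 || intervalMillis < 1) {
                throw new IllegalArgumentException("size and interval must be positive");
            }
            this.enabled = enabled;
            this.targetBlockSize = targetBlockSize;
            this.intervalMillis = intervalMillis;
        }

        /** Disabled by default, as the request asks; the numbers are placeholders. */
        static Config defaults() {
            return new Config(false, 1000, TimeUnit.HOURS.toMillis(24));
        }
    }

    /**
     * Schedules the optimization pass at a fixed delay when enabled.
     * Returns an empty Optional when the optimization is disabled.
     */
    static Optional<ScheduledFuture<?>> schedule(Config config,
                                                 ScheduledExecutorService executor,
                                                 Runnable optimizationPass) {
        if (!config.enabled) {
            return Optional.empty();
        }
        ScheduledFuture<?> future = executor.scheduleWithFixedDelay(
                optimizationPass, config.intervalMillis, config.intervalMillis,
                TimeUnit.MILLISECONDS);
        return Optional.of(future);
    }
}
```

      A fixed delay rather than a fixed rate is used so that a slow pass over many documents never overlaps with the next one; that choice is also an assumption, not something the request specifies.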

            People

              rhauch Randall Hauch (Inactive)