  1. Red Hat Data Grid
  2. JDG-1185

Broadcast query support for improved scalability

Details

    • Feature Request
    • Resolution: Done
    • Major
    • JDG 7.2 ER1
    • None
    • Querying
    • None

    Description

      JDG currently works with two types of indexes:

      1) a single cluster-wide index, with indexes stored in caches
      2) replicated indexes on each node, with either RAM or filesystem indexes on each node

      Both strategies have scalability issues, since the index must be present on the node that runs the query. Furthermore, 2) only works for REPL caches, leaving 1) as the only supported strategy for DIST caches.

      Strategy 1) has two issues: only one node in the cluster is responsible for all the indexing, and the (global) index is expected to be accessible on the node where the query runs. If the query requires reading an index segment that is not local, the query engine fetches it in order to run the query, causing high latency due to the number of RPCs and the amount of data transferred.

      The broadcast query feature [1] allows each node to index its own data during writes and, at query time, sends the query to every node. An extra step is required to combine the results from all nodes. This is ideal for DIST caches with large indexes, since the only data transferred is the query itself and the results.

      [1] http://infinispan.org/docs/stable/user_guide/user_guide.html#query.clustered-query-api
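
The broadcast flow described above (index locally on write, send the query to every node, merge the partial results) can be illustrated with a small simulation. This is only a sketch and does not use the Infinispan or JDG API: the `Node` and `Cluster` classes, the byte-sum key routing, and the term-frequency scoring are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster member that holds an index of its own entries only."""
    name: str
    index: dict = field(default_factory=dict)  # term -> {key: term frequency}

    def put(self, key, text):
        # Index at write time, on the node that owns the entry.
        for term in text.lower().split():
            postings = self.index.setdefault(term, {})
            postings[key] = postings.get(key, 0) + 1

    def local_query(self, term, limit):
        # Search the local index only; return at most `limit` (key, score) hits.
        postings = self.index.get(term.lower(), {})
        hits = sorted(postings.items(), key=lambda h: (-h[1], h[0]))
        return hits[:limit]


class Cluster:
    """Hypothetical DIST-style cluster: each key is owned by exactly one node."""

    def __init__(self, node_names):
        self.nodes = [Node(n) for n in node_names]

    def owner(self, key):
        # Assumption: a deterministic byte sum stands in for consistent hashing.
        return self.nodes[sum(key.encode()) % len(self.nodes)]

    def put(self, key, text):
        self.owner(key).put(key, text)

    def broadcast_query(self, term, limit):
        # Scatter: every node runs the same query against its local index.
        partials = [node.local_query(term, limit) for node in self.nodes]
        # Gather: the extra step that combines the per-node results.
        merged = [hit for partial in partials for hit in partial]
        merged.sort(key=lambda h: (-h[1], h[0]))
        return merged[:limit]


cluster = Cluster(["A", "B", "C"])
cluster.put("k1", "broadcast query query")
cluster.put("k2", "index query")
cluster.put("k3", "local index")
cluster.put("k4", "query query query")
top_hits = cluster.broadcast_query("query", limit=2)
```

In this sketch only the query term and at most `limit` hits per node cross node boundaries, which mirrors the point made in the description: no index segments are transferred, at the cost of a merge step on the querying node.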

            People

              gfernand@redhat.com Gustavo Fernandes (Inactive)