Managed Service - Streams / MGDSTRM-9195

Reduce return to service time following abnormal broker shutdown

    • False
    • None
    • False
    • No
    • To Do
    • MGDSRVS-48 - Be able to sustain an external paying customer in production
    • 0% To Do, 0% In Progress, 100% Done

      WHAT

      If a Kafka broker shuts down abnormally for any reason (for instance, a node or storage failure), it may need to enter a log recovery state on its next startup in order to repair its log files. While brokers are recovering, the instance is degraded or even offline, depending on how many of its brokers need recovery.

      Recovery can be a time-consuming process, especially for Kafka brokers holding large amounts of data.

      RHOSAK currently uses Kafka's default configuration, which performs log recovery with a single thread. To reduce the return-to-service time, the number of recovery threads should be increased.
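
      For reference, the Kafka broker setting that controls this is `num.recovery.threads.per.data.dir`, which sets the number of threads per data directory used for log recovery at startup and for flushing at shutdown. A minimal sketch of the override in broker properties form is given here. The value 4 is only a placeholder assumption pending the investigation described under HOW, and the way RHOSAK injects broker configuration is not shown.

```properties
# Threads per data directory used for log recovery at startup
# (also used for log flushing at shutdown).
# NOTE: 4 is an illustrative placeholder, not a measured recommendation.
num.recovery.threads.per.data.dir=4
```

      Because the setting applies per data directory, the total number of recovery threads on a broker is this value multiplied by the number of configured log directories.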

      WHY

      Reduce time taken to return an instance to full service.

      HOW

      Investigate the best number of recovery threads and verify the improvement it makes to recovery time. See the spike task.

      Update the service to use the chosen number of threads.

       

       

            lukchen@redhat.com Luke Chen
            Kafka Integrations
