Type: Epic
Resolution: Done
Priority: Critical
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
- kafka-integrations-apac-refinement-done
- kafka-integrations-europe-refinement-done

Epic Name:
Reduce return to service time following abnormal broker shutdown
Blocked:
False
Blocked Reason:
None
Ready:
False
Discussed with Team:
No
Epic Status:
To Do
Feature Link:
MGDSRVS-48 - Be able to sustain an external paying customer in production
Hierarchy Progress Bar:

0% To Do, 0% In Progress, 100% Done
[QE] How to address?:
---
[QE] Why QE missed?:
---

SFDC Cases Links:
SFDC Cases Counter:
SFDC Cases Open:

WHAT

If a Kafka broker shuts down abnormally for any reason (for instance, a node or storage failure), it may need to enter a log recovery state on its next startup in order to repair its log files. While brokers are in this state, the instance is degraded or even offline, depending on how many of the instance's brokers need recovery.

Recovery can be a time-consuming process, especially for Kafka brokers holding large amounts of data.

RHOSAK uses Kafka's default configuration, which performs log recovery with a single thread. To reduce the return to service time, the number of recovery threads should be increased.

WHY

Reduce time taken to return an instance to full service.

HOW

Investigate the best number of recovery threads and verify the improvement in recovery time that it delivers. See the spike task.

Update the service to use the chosen number of threads.
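
As a rough sketch of what this update might look like: the ticket does not name the setting, but the description matches Kafka's `num.recovery.threads.per.data.dir` broker property, which defaults to 1 and controls the number of threads per data directory used for log recovery at startup (and for flushing at shutdown). Assuming that is the property in question, the change would amount to something like the fragment below. The value 4 is only a placeholder; the real value should come from the spike task.

```properties
# Assumption: this is the setting the epic refers to; the ticket does not name it.
# Kafka default is 1 (one recovery thread per data directory).
# The value 4 is illustrative only, pending the results of the spike.
num.recovery.threads.per.data.dir=4
```

Because the setting applies per data directory, the total number of recovery threads on a broker scales with the number of configured log directories, which is worth keeping in mind when sizing it against the CPU and disk throughput available to the broker.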

relates to

MGDSTRM-9154 Expose log recovery metrics on support dashboards

Backlog

Assignee: Luke Chen

Reporter: Luke Chen

Team: Kafka Integrations

Votes: 0

Watchers: 3

Created: 2022/07/15 4:27 PM

Updated: 2022/10/25 12:14 PM

Resolved: 2022/10/25 12:14 PM
