Type: Task
Resolution: Done
Priority: Undefined
Fix Version/s: ACM 2.9.0
Affects Version/s: ACM 2.9.0
Component/s: Documentation, Observability
Labels:

Blocked: False
Blocked Reason: None
Ready: False
Intelligence Requested:
Market:

Regression: No

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

Create an informative issue (See each section, incomplete templates/issues won't be triaged)

Using the current documentation as a model, please complete the issue template.

Note: Doc team updates the current version and the two previous versions (n-2). For earlier versions, we will address only high-priority, customer-reported issues for releases in support.

Prerequisite: Start with what we have

Always look at the current documentation to describe the change that is needed. Use the source or portal link for Step 4:

- Use the Customer Portal: https://access.redhat.com/documentation/en-us/red_hat_advanced_cluster_management_for_kubernetes

- Use the GitHub link to find the staged docs in the repository: https://github.com/stolostron/rhacm-docs

Describe the changes in the doc and link to your dev story

Provide info for the following steps:

1. - [x] Mandatory: Add the required version to the Fix version/s field.

2. - [x] Mandatory: Choose the type of documentation change.

- [x] New topic in an existing section or new section:

I believe this should go in the release notes for ACM Observability 2.9 at https://github.com/stolostron/rhacm-docs/blob/2.9_stage/release_notes/whats_new.adoc#observability.

- [ ] Update to an existing topic

3. - [x] Mandatory for GA content:

- [x] Add steps and/or other important conceptual information here:

The Thanos Compactor is an important part of the ACM Observability product and is deployed by the Multicluster Observability Operator (MCO). It keeps queries performing well by enforcing the retention configuration and compacting the data in storage. For the MCO to provide a good query experience, the Thanos Compactor must be healthy.

To help customers identify when the Thanos Compactor has issues, the MCO now includes four default alerts that monitor its health, each with an assigned severity:

ACMThanosCompactHalted, critical: fires if the Compactor is halted.

ACMThanosCompactHighCompactionFailures, warning: fires if the compaction failure rate exceeds 5%.

ACMThanosCompactBucketHighOperationFailures, warning: fires if the bucket operation failure rate exceeds 5%.

ACMThanosCompactHasNotRun, warning: fires if the Compactor has not uploaded anything in the last 24 hours.

More details about these rules can be found upstream at https://github.com/stolostron/multicluster-observability-operator/blob/main/operators/multiclusterobservability/manifests/base/alertmanager/prometheusrule.yaml.
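
As a rough illustration of the four conditions listed above, the following Python sketch evaluates them against a snapshot of Compactor state. It is not how the MCO implements the alerts; the actual rules are PromQL expressions in the prometheusrule.yaml file linked above. The `CompactorSnapshot` type, its field names, and the `firing_alerts` function are hypothetical names introduced for illustration. Only the alert names, severities, and thresholds come from this issue.

```python
from dataclasses import dataclass

# Thresholds taken from the alert descriptions in this issue.
FAILURE_RATE_THRESHOLD = 0.05      # 5%
MAX_HOURS_WITHOUT_UPLOAD = 24.0    # 24 hours


@dataclass
class CompactorSnapshot:
    """Hypothetical point-in-time view of Compactor state (illustrative only)."""
    halted: bool
    compactions_total: float
    compactions_failed: float
    bucket_ops_total: float
    bucket_ops_failed: float
    hours_since_last_upload: float


def _failure_rate(failed: float, total: float) -> float:
    # Treat "no operations observed" as a zero failure rate to avoid dividing by zero.
    return failed / total if total > 0 else 0.0


def firing_alerts(snapshot: CompactorSnapshot) -> list[tuple[str, str]]:
    """Return (alert name, severity) pairs whose condition holds for the snapshot."""
    alerts = []
    if snapshot.halted:
        alerts.append(("ACMThanosCompactHalted", "critical"))
    if _failure_rate(snapshot.compactions_failed,
                     snapshot.compactions_total) > FAILURE_RATE_THRESHOLD:
        alerts.append(("ACMThanosCompactHighCompactionFailures", "warning"))
    if _failure_rate(snapshot.bucket_ops_failed,
                     snapshot.bucket_ops_total) > FAILURE_RATE_THRESHOLD:
        alerts.append(("ACMThanosCompactBucketHighOperationFailures", "warning"))
    if snapshot.hours_since_last_upload > MAX_HOURS_WITHOUT_UPLOAD:
        alerts.append(("ACMThanosCompactHasNotRun", "warning"))
    return alerts
```

For example, a snapshot with 10 failed compactions out of 100 and no upload for 30 hours would match the ACMThanosCompactHighCompactionFailures and ACMThanosCompactHasNotRun conditions, while a halted Compactor would match the critical ACMThanosCompactHalted condition regardless of the other values.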

- [ ] Add Required access level for the user to complete the task here:

- [ ] Add verification at the end of the task, how does the user verify success (a command to run or a result to see?)

- [x] Add link to dev story here: https://issues.redhat.com/browse/ACM-7362

4. - [ ] Mandatory for bugs: What is the diff? Clearly define what the problem is, what the change is, and link to the current documentation:

Is related to: ACM-7321 Detect and alert when ACM compactor becomes unhealthy on the hub (Closed)

Assignee: Mikela Jackson

Reporter: Douglas Camata (Inactive)

QA Contact: Xiang Yin

Votes: 0

Watchers: 3

Created: 2023/11/06 4:47 PM

Updated: 2023/11/13 2:27 PM

Resolved: 2023/11/13 2:27 PM
