Bug
Resolution: Done-Errata
Undefined
odf-4.16
None
Description of problem - Provide a detailed description of the issue encountered, including logs/command-output snippets and screenshots if the issue is observed in the UI:
PrometheusDuplicateTimestamps alerts are generated for the rook-ceph-osd-key-rotation-X pods. These pods have the same toleration defined twice, which causes the kube_pod_tolerations metrics reported for them to be duplicated.
The OCP platform infrastructure and deployment type (AWS, Bare Metal, VMware, etc. Please clarify if it is platform agnostic deployment), (IPI/UPI):
All
The ODF deployment type (Internal, External, Internal-Attached (LSO), Multicluster, DR, Provider, etc):
4.16.4
The version of all relevant components (OCP, ODF, RHCS, ACM whichever is applicable):
4.16.z
Does this issue impact your ability to continue to work with the product?
There is no direct impact, but the PrometheusDuplicateTimestamps alert keeps firing whenever the rook-ceph-osd-key-rotation-X pods complete.
Is there any workaround available to the best of your knowledge?
Deleting the completed jobs created by the rook-ceph-osd-key-rotation-X cronjobs fixes the problem.
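As an illustrative sketch of the workaround (exact job names vary per cluster; <job-name> is a placeholder):

    # List the completed key-rotation jobs in the ODF namespace
    oc get jobs -n openshift-storage | grep rook-ceph-osd-key-rotation

    # Delete a completed job; replace <job-name> with a job from the list above
    oc delete job <job-name> -n openshift-storage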
Can this issue be reproduced? If so, please provide the hit rate
100%
Can this issue be reproduced from the UI?
If this is a regression, please provide more details to justify this:
Steps to Reproduce:
1. Install and setup RHODF
2. Create a StorageSystem, enable encryption using Vault, and apply taints on the storage nodes as well.
3. Wait for the cronjobs rook-ceph-osd-key-rotation-X to be created.
4. Trigger jobs from these cronjobs so that new pods are spun up (see the sketch after these steps).
5. After a few minutes, the PrometheusDuplicateTimestamps alert starts firing.
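A minimal sketch for step 4, assuming the default openshift-storage namespace and a cronjob named rook-ceph-osd-key-rotation-0 (the numeric suffix varies per OSD):

    # List the key-rotation cronjobs created by ODF
    oc get cronjobs -n openshift-storage | grep rook-ceph-osd-key-rotation

    # Manually trigger a job from one of the cronjobs so a new pod is spun up
    oc create job test-key-rotation --from=cronjob/rook-ceph-osd-key-rotation-0 -n openshift-storage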
The exact date and time when the issue was observed, including timezone details:
Actual results:
The PrometheusDuplicateTimestamps alert is raised when pods are created by the rook-ceph-osd-key-rotation-X cronjobs. This happens because duplicate tolerations for the storage node are present in the associated cronjobs:
tolerations:
- effect: NoSchedule
  key: node.ocs.openshift.io/storage
  operator: Equal
  value: "true"
- effect: NoSchedule
  key: node.ocs.openshift.io/storage
  operator: Equal
  value: "true"
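The duplicate can be confirmed directly on a completed key-rotation pod; the pod name below is only an example taken from the Prometheus log further down:

    # Show the tolerations carried by a completed key-rotation pod
    oc get pod rook-ceph-osd-key-rotation-21-28903680-49xxl -n openshift-storage \
      -o jsonpath='{.spec.tolerations}'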
Expected results:
When the rook-ceph-osd-key-rotation-X cronjobs complete, they should not trigger the above alert, and the associated pods should carry the storage-node toleration only once:
tolerations:
- effect: NoSchedule
  key: node.ocs.openshift.io/storage
  operator: Equal
  value: "true"
Logs collected and log location:
Below are the logs seen in the prometheus-k8s pods:
2024-12-27T18:39:01.605719007Z ts=2024-12-27T18:39:01.605Z caller=scrape.go:1777 level=debug component="scrape manager" scrape_pool=serviceMonitor/openshift-monitoring/kube-state-metrics/0 target=https://<pod-ip>:8443/metrics msg="Duplicate sample for timestamp" series="kube_pod_tolerations{namespace=\"openshift-storage\",pod=\"rook-ceph-osd-key-rotation-21-28903680-49xxl\",uid=\"3a3c000a-1c8d-432c-a126-4f9291190902\",key=\"node.ocs.openshift.io/storage\",operator=\"Equal\",value=\"true\",effect=\"NoSchedule\"}"
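A sketch of how these debug lines can be pulled from Prometheus, assuming the usual prometheus-k8s-0 pod and prometheus container in the openshift-monitoring namespace:

    # Search the Prometheus container logs for the duplicate-sample messages
    oc logs prometheus-k8s-0 -c prometheus -n openshift-monitoring | grep "Duplicate sample for timestamp"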
Additional info:
Links to: RHBA-2024:138027 Red Hat OpenShift Data Foundation 4.18 security, enhancement & bug fix update