Type: Bug
Resolution: Duplicate
Priority: Undefined
Fix Version/s: None
Affects Version/s: openshift-4.11
Component/s: None
Labels:
None

Blocked:
False
Blocked Reason:
None
Ready:
False
Docs QE Status:
NEW
QE Status:
NEW

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

Description of problem
======================

The total inode capacity reported by filesystems with dynamic inode
allocation (such as CephFS, XFS, or Btrfs) is not well defined, and each of
these filesystems behaves slightly differently. These values therefore can't
be interpreted in the same way as for "traditional" filesystems with static
inode allocation (such as ext4).

Because the KubePersistentVolumeInodesFillingUp alert doesn't distinguish
between the two cases, it can fire for PVCs backed by filesystems with
dynamic inode allocation, causing a false alarm.
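
To compare what different filesystems report, the inode counters of a mounted
volume can be read with statvfs. The following is a minimal Python sketch, not
part of the alert or of kubelet; the helper names inode_free_ratio and
report are hypothetical and exist only for illustration.

```python
import os
from typing import Optional


def inode_free_ratio(total: int, free: int) -> Optional[float]:
    """Return free/total inodes, or None when the total is not positive.

    A filesystem that reports a total of 0 gives no meaningful ratio.
    """
    if total <= 0:
        return None
    return free / total


def report(path: str) -> Optional[float]:
    """Read the inode counters of the filesystem that contains `path`."""
    st = os.statvfs(path)  # f_files: total inodes, f_ffree: free inodes
    return inode_free_ratio(st.f_files, st.f_ffree)
```

Calling report() on a mount point of each filesystem type shows how the
reported ratio differs between static and dynamic inode allocation.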

Version-Release number of selected component
============================================

OCP 4.11.0

How reproducible
================

100%

Steps to Reproduce
==================

1. Install OCP
2. Reconfigure the OpenShift Container Platform registry to use an RWX CephFS
volume provided by ODF
3. Use the cluster for a while
4. Check firing alerts

Actual results
==============

Alert KubePersistentVolumeInodesFillingUp is firing with the following
message:

The PersistentVolume claimed by registry-cephfs-rwx-pvc in Namespace
openshift-image-registry only has 0% free inodes.

In this particular case, there will be 2 such alerts, as there are 2 replicas
of the registry.

Expected results
================

Alert KubePersistentVolumeInodesFillingUp does not fire when an RWX CephFS
volume is used to provide persistent storage for an OCP component.

Additional info
===============

The definition of the alert looks like this:

    (
      kubelet_volume_stats_inodes_free{job="kubelet",metrics_path="/metrics",namespace=~"(openshift-.*|kube-.*|default)"}
      /
      kubelet_volume_stats_inodes{job="kubelet",metrics_path="/metrics",namespace=~"(openshift-.*|kube-.*|default)"}
    ) < 0.03
    and
    kubelet_volume_stats_inodes_used{job="kubelet",metrics_path="/metrics",namespace=~"(openshift-.*|kube-.*|default)"} > 0
    unless on (namespace, persistentvolumeclaim)
    kube_persistentvolumeclaim_access_mode{access_mode="ReadOnlyMany",namespace=~"(openshift-.*|kube-.*|default)"} == 1
    unless on (namespace, persistentvolumeclaim)
    kube_persistentvolumeclaim_labels{label_alerts_k8s_io_kube_persistent_volume_filling_up="disabled",namespace=~"(openshift-.*|kube-.*|default)"} == 1

So it looks like there was some attempt to prevent this from happening, but
without a reliable way to track which filesystem is used and whether the
inode values should be taken seriously for a given volume, the alert can't
avoid false alarms.
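
The alert condition can be restated as a small Python sketch to show why a
volume that reports no free inodes always matches. Only the 0.03 threshold,
the "used > 0" check, and the two exclusions come from the expression above;
the function name, its parameters, and the sample numbers are hypothetical.

```python
from typing import Iterable, Optional


def inodes_alert_matches(
    inodes_total: int,
    inodes_free: int,
    inodes_used: int,
    access_modes: Iterable[str] = (),
    filling_up_label: Optional[str] = None,
) -> bool:
    """Mirror the alert expression for a single PVC (illustrative only)."""
    if inodes_total <= 0:
        # Division by zero yields NaN or Inf in PromQL, which is never < 0.03.
        return False
    low_free_ratio = (inodes_free / inodes_total) < 0.03
    in_use = inodes_used > 0
    # First "unless": PVCs with the ReadOnlyMany access mode are excluded.
    read_only_many = "ReadOnlyMany" in access_modes
    # Second "unless": PVCs whose opt-out label is set to "disabled" are excluded.
    opted_out = filling_up_label == "disabled"
    return low_free_ratio and in_use and not read_only_many and not opted_out


# Assumed sample: a volume reporting 0 free inodes, as in the 0% message above.
print(inodes_alert_matches(1000, 0, 1000, access_modes=("ReadWriteMany",)))
```

Nothing in the condition looks at the filesystem type, so a ReadWriteMany
volume with zero reported free inodes matches unless it is explicitly opted
out by the label.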

account is impacted by

OCPBUGS-17685 Alert KubePersistentVolumeInodesFillingUp

links to

openshift/runbooks#48: *: Add runbook for KubePersistentVolumeInodesFillingUp

Assignee: Simon Pasquier
Reporter: Martin Bukatovič
Votes: 0
Watchers: 6
Created: 2022/09/13 2:02 PM
Updated: 2025/09/13 3:33 PM
Resolved: 2022/10/26 12:48 PM
