Type: Bug
Resolution: Done-Errata
Priority: Major
Fix Version/s: None
Affects Version/s: 4.13, 4.12, 4.11, 4.10, 4.14, 4.15, 4.16
Component/s: Storage
Labels:
None

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None
Story Points:
None
Severity:
Important
Regression:
No

Target Backport Versions:

4.13.z, 4.12.z, 4.14.z, 4.15.z
Target Version:

4.16.0
Release Blocker:
Approved
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Test Coverage:

?

PX Priority Data:
PX Impact Score:

Release Note Status:
Done
Release Note Type:
Bug Fix
Release Note Text:

* Previously, CPU limits for the {aws-first} Elastic File Store (EFS) Container Storage Interface (CSI) driver container could cause performance degradation of volumes managed by the {aws-short} EFS CSI Driver Operator. With this release, the CPU limits from the {aws-short} EFS CSI driver container are removed to help prevent potential performance degradation. (link:https://issues.redhat.com/browse/OCPBUGS-28551[*OCPBUGS-28551*])

Escape Reason:
Escape Impact:
Corrective Measures:
SDLC stage when should've been found:

Description of problem:

When an EFS-based volume is mounted by the driver (the csi-driver container in the aws-efs-csi-driver-node daemonset), a new stunnel process is also launched. This process encrypts the I/O traffic of the NFS filesystem and can be CPU-intensive under load, so it is throttled by the CPU limit (100m) configured on the csi-driver container: https://github.com/openshift/aws-efs-csi-driver-operator/blob/release-4.16/assets/node.yaml#L81-L83

This CPU throttling severely degrades the performance of all volumes managed by the operator.
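
To make the 100m figure concrete: under the kernel's CFS bandwidth control, a CPU limit is enforced as a quota per scheduling period (100 ms by default), so 100m allows roughly 10 ms of CPU time per period, shared by every process in the container, stunnel included. The Python sketch below is not part of the original report. It converts a millicore limit into that quota and computes the share of throttled periods from the text of a cgroup v2 cpu.stat file. The helper names and the sample cpu.stat values are hypothetical.

```python
# Illustrative sketch; helper names and the sample numbers are hypothetical.

CFS_PERIOD_US = 100_000  # default CFS enforcement period (100 ms)


def quota_us(millicores: int, period_us: int = CFS_PERIOD_US) -> int:
    """CPU time in microseconds that a container may use per CFS period."""
    return millicores * period_us // 1000


def parse_cpu_stat(text: str) -> dict[str, int]:
    """Parse the 'key value' lines of a cgroup v2 cpu.stat file."""
    stats: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(stats: dict[str, int]) -> float:
    """Fraction of enforcement periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_throttled", 0) / periods if periods else 0.0


if __name__ == "__main__":
    # Made-up sample: throttled in 540 of 600 periods.
    sample = "usage_usec 6000000\nnr_periods 600\nnr_throttled 540\nthrottled_usec 48000000\n"
    print(quota_us(100))  # microseconds of CPU per 100 ms period for a 100m limit
    print(throttled_ratio(parse_cpu_stat(sample)))
```

A high throttled ratio for the csi-driver container while fio is running would be consistent with the behaviour described above.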

How reproducible:

Create a pod with an EFS PVC attached and run a simple performance test on the volume, for example:

fio --ioengine=libaio --iodepth=4 --runtime=60 --bs=1MiB --time_based=1 --filename=file --rw=read --size=2GiB --name=readjob --direct=1

Repeat the test after removing the CPU limits from the csi-driver container of the aws-efs-csi-driver-node daemonset. This can be done by setting the ClusterCSIDriver/efs.csi.aws.com resource to the Unmanaged state.
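
The limit-removal step can be sketched as follows. This is not taken from the original report: it assumes the `oc` CLI, the `spec.managementState` field of ClusterCSIDriver, that the daemonset lives in the `openshift-cluster-csi-drivers` namespace, and that csi-driver is the first container in the pod template (index 0). Verify each assumption against your cluster before use. The sketch only builds and prints the commands; it does not run them.

```python
import json
import shlex

# Assumptions (not from the report): namespace and container index.
NAMESPACE = "openshift-cluster-csi-drivers"
DAEMONSET = "aws-efs-csi-driver-node"
CONTAINER_INDEX = 0  # assumed position of the csi-driver container


def unmanaged_cmd() -> list[str]:
    """Merge patch that sets the ClusterCSIDriver to the Unmanaged state."""
    patch = {"spec": {"managementState": "Unmanaged"}}
    return ["oc", "patch", "clustercsidriver", "efs.csi.aws.com",
            "--type=merge", "-p", json.dumps(patch)]


def remove_cpu_limit_cmd() -> list[str]:
    """JSON patch that removes only the CPU limit of the assumed container."""
    path = f"/spec/template/spec/containers/{CONTAINER_INDEX}/resources/limits/cpu"
    patch = [{"op": "remove", "path": path}]
    return ["oc", "-n", NAMESPACE, "patch", "daemonset", DAEMONSET,
            "--type=json", "-p", json.dumps(patch)]


if __name__ == "__main__":
    # Print shell-quoted commands for review instead of executing them.
    for cmd in (unmanaged_cmd(), remove_cpu_limit_cmd()):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere and lets the container index be checked first.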

Results using the default configuration:

sh-5.2$ fio --ioengine=libaio --iodepth=4 --runtime=60 --bs=1MiB --time_based=1 --filename=file --rw=read --size=2GiB --name=readjob --direct=1
readjob: (g=0): rw=read, bs=(R) 977KiB-977KiB, (W) 977KiB-977KiB, (T) 977KiB-977KiB, ioengine=libaio, iodepth=4
<truncated>
READ: bw=95.2MiB/s (99.9MB/s), 95.2MiB/s-95.2MiB/s (99.9MB/s-99.9MB/s), io=5717MiB (5995MB), run=60031-60031msec

Results after removing CPU limits:

sh-5.2$ fio --ioengine=libaio --iodepth=4 --runtime=60 --bs=1MiB --time_based=1 --filename=file --rw=read --size=2GiB --name=readjob --direct=1
readjob: (g=0): rw=read, bs=(R) 977KiB-977KiB, (W) 977KiB-977KiB, (T) 977KiB-977KiB, ioengine=libaio, iodepth=4
<truncated>
READ: bw=507MiB/s (532MB/s), 507MiB/s-507MiB/s (532MB/s-532MB/s), io=29.7GiB (31.9GB), run=60006-60006msec
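
The two READ summary lines can be compared directly. The Python sketch below extracts the bandwidth from each line and computes the ratio, which comes out at roughly 5.3x. The `read_bandwidth_mib` helper is hypothetical, and the regular expression assumes the bandwidth is reported in MiB/s, as it is in both runs here.

```python
import re

# Summary lines copied from the two fio runs in this report.
WITH_LIMITS = ("READ: bw=95.2MiB/s (99.9MB/s), 95.2MiB/s-95.2MiB/s "
               "(99.9MB/s-99.9MB/s), io=5717MiB (5995MB), run=60031-60031msec")
WITHOUT_LIMITS = ("READ: bw=507MiB/s (532MB/s), 507MiB/s-507MiB/s "
                  "(532MB/s-532MB/s), io=29.7GiB (31.9GB), run=60006-60006msec")


def read_bandwidth_mib(summary: str) -> float:
    """Return the aggregate bandwidth in MiB/s from a fio summary line."""
    match = re.search(r"bw=([0-9.]+)MiB/s", summary)
    if match is None:
        raise ValueError("no MiB/s bandwidth found in summary line")
    return float(match.group(1))


speedup = read_bandwidth_mib(WITHOUT_LIMITS) / read_bandwidth_mib(WITH_LIMITS)
print(f"{speedup:.1f}x")  # prints 5.3x
```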

blocks

OCPBUGS-28645 EFS CSI performance degradation due to CPU limits

Closed

is cloned by

OCPBUGS-28645 EFS CSI performance degradation due to CPU limits

Closed

is triggering

STOR-2354 Corrective Measure for OCPBUGS-28551: EFS CSI performance degradation due to CPU limits

Closed

links to

openshift/aws-efs-csi-driver-operator#117: OCPBUGS-28551: Remove CPU limits from the driver container

RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update

Assignee: Tomas Smetana

Reporter: Raul Sevilla Canavate

Need Info From: None

Contributors: None

QA Contact: Rohit Patil

Doc Contact: None

Votes: 1

Watchers: 9

Created: 2024/01/29 12:44 PM

Updated: 2025/09/12 11:25 PM

Resolved: 2024/06/27 11:36 AM
