Description of problem:
The aws-efs-csi-driver-operator intermittently loses connectivity to the underlying EFS file system. Based on the upstream GitHub threads, this appears to be related to a memory leak in stunnel that causes the process to die. The issue appears to be addressed in efs-utils v1.34.2, which uses stunnel v5.58. The relevant threads are linked in the additional info section below.
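For triage on an affected node: each EFS mount in TLS mode is served through a local stunnel proxy on 127.0.0.1, so a TLS mount with no live stunnel process behind it matches this failure mode. Below is a minimal sketch of such a check, assuming node access and the default TLS mount setup; the script itself is hypothetical and not part of the driver:

```python
#!/usr/bin/env python3
"""Node-side triage sketch: flag EFS TLS mounts whose stunnel proxy
has died. Illustrative only; assumes default efs-utils TLS mounts."""
import subprocess


def efs_tls_mounts():
    # NFS mounts whose server is 127.0.0.1 are stunnel-proxied EFS mounts
    mounts = []
    with open("/proc/mounts") as f:
        for line in f:
            device, target, fstype = line.split()[:3]
            if fstype.startswith("nfs") and device.startswith("127.0.0.1"):
                mounts.append(target)
    return mounts


def stunnel_pids():
    # PIDs of running stunnel processes, via ps
    out = subprocess.run(["ps", "-C", "stunnel", "-o", "pid="],
                         capture_output=True, text=True)
    return [int(p) for p in out.stdout.split()]


if __name__ == "__main__":
    mounts, pids = efs_tls_mounts(), stunnel_pids()
    print(f"{len(mounts)} EFS TLS mount(s), {len(pids)} stunnel process(es)")
    if mounts and not pids:
        print("WARNING: TLS mounts exist but no stunnel is running")
```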
Version-Release number of selected component (if applicable):
Cluster Version: 4.10.34
Operator Version: 4.10.0-202211041323
stunnel Version: stunnel-5.56-5.el8_3.x86_64
How reproducible:
Sporadic; the disconnect occurs intermittently and is not reliably reproducible.
Actual results:
The mount fails with the kernel error message `nfs: server 127.0.0.1 not responding, still trying` (127.0.0.1 is the local stunnel proxy), occasionally causing the pod to lose access to the storage.
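To gauge how often a node is hitting this, the kernel message can be counted from the node's log. A minimal sketch, assuming `dmesg` is readable on the node (`journalctl -k` carries the same messages); the helper is illustrative only:

```python
#!/usr/bin/env python3
"""Triage sketch: count 'server 127.0.0.1 not responding' kernel
messages on a node to see how often the stunnel proxy is stalling."""
import subprocess

PATTERN = "server 127.0.0.1 not responding"

log = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
hits = [line for line in log.splitlines() if PATTERN in line]
print(f"{len(hits)} matching kernel message(s)")
for line in hits[-5:]:  # show the most recent few occurrences
    print(line)
```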
Expected results:
NFS mount stays connected and filesystem is accessible in pods.
Additional info:
Upstream aws-efs-csi-driver issue: https://github.com/kubernetes-sigs/aws-efs-csi-driver/issues/616
Upstream AWS efs-utils issue: https://github.com/aws/efs-utils/issues/99#issuecomment-1326960406
The UBI8 image appears to be based on aws-efs-utils v1.34.1 and ships stunnel v5.56: https://github.com/openshift/aws-efs-utils/blob/release-4.10/amazon-efs-utils.spec#L34
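To confirm whether a given driver image predates the fix, the packaged efs-utils version can be compared against v1.34.2, the first release reported upstream to include the stunnel fix. A minimal sketch, assuming the package is named amazon-efs-utils and queryable with rpm (as in the UBI8 base); the script is illustrative:

```python
#!/usr/bin/env python3
"""Sketch: check whether the installed amazon-efs-utils predates
v1.34.2, the release reported upstream to carry the stunnel fix."""
import subprocess

FIXED = (1, 34, 2)

out = subprocess.run(
    ["rpm", "-q", "--qf", "%{VERSION}", "amazon-efs-utils"],
    capture_output=True, text=True)
if out.returncode != 0:
    raise SystemExit("amazon-efs-utils is not installed in this image")
version = out.stdout.strip()
installed = tuple(int(x) for x in version.split("."))
verdict = "includes" if installed >= FIXED else "predates"
print(f"amazon-efs-utils {version} {verdict} the reported fix (v1.34.2)")
```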
- clones: OCPBUGS-7814 [release-4.11] [AWS EFS] NFS mount disconnects and becomes unavailable. (Closed)
- depends on: OCPBUGS-7814 [release-4.11] [AWS EFS] NFS mount disconnects and becomes unavailable. (Closed)