Description of problem:
On OpenShift Container Platform, the etcd Pod is showing messages like the following:
2023-06-19T09:10:30.817918145Z {"level":"warn","ts":"2023-06-19T09:10:30.817Z","caller":"fileutil/purge.go:72","msg":"failed to lock file","path":"/var/lib/etcd/member/wal/000000000000bc4b-00000000183620a4.wal","error":"fileutil: file already locked"}
This is described in KCS https://access.redhat.com/solutions/7000327
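When triaging etcd Pod logs, it can help to separate this warning from other warnings. The following is a minimal sketch, assuming each log line is a container-runtime timestamp followed by one JSON record, as in the message quoted above. The helper name `is_wal_lock_warning` is hypothetical and is not part of etcd or OpenShift tooling.

```python
import json

# Field values taken from the log line quoted in this report.
WAL_LOCK_MSG = "failed to lock file"
WAL_LOCK_ERR = "fileutil: file already locked"


def is_wal_lock_warning(line: str) -> bool:
    """Return True if a Pod log line is the WAL purge lock warning.

    Hypothetical helper for log triage; assumes the JSON record starts
    at the first '{' on the line (after the runtime timestamp prefix).
    """
    start = line.find("{")
    if start == -1:
        return False
    try:
        record = json.loads(line[start:])
    except json.JSONDecodeError:
        # Not a well-formed JSON record, so not the warning we look for.
        return False
    if not isinstance(record, dict):
        return False
    return (
        record.get("level") == "warn"
        and record.get("msg") == WAL_LOCK_MSG
        and record.get("error") == WAL_LOCK_ERR
    )


sample = (
    '2023-06-19T09:10:30.817918145Z {"level":"warn",'
    '"ts":"2023-06-19T09:10:30.817Z","caller":"fileutil/purge.go:72",'
    '"msg":"failed to lock file",'
    '"path":"/var/lib/etcd/member/wal/000000000000bc4b-00000000183620a4.wal",'
    '"error":"fileutil: file already locked"}'
)
print(is_wal_lock_warning(sample))
```

Matching on the `msg` and `error` fields rather than on the `path` keeps the check independent of the specific WAL file name.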
Version-Release number of selected component (if applicable):
Any currently supported version (> 4.10) running etcd 3.5.x
How reproducible:
always
Steps to Reproduce:
The warning appears after etcd has been running for a while.
This has been discussed in https://github.com/etcd-io/etcd/issues/15360
The error message is harmless; it merely indicates that some WAL files have not been included in a snapshot yet.
The warning was caused by a change to upstream default values: https://github.com/etcd-io/etcd/issues/13889
This was fixed upstream in https://github.com/etcd-io/etcd/pull/15408/files, but the fix was never backported to etcd 3.5.
To silence the warning and stop confusing users, we should also supply that argument explicitly when starting etcd in: https://github.com/openshift/cluster-etcd-operator/blob/master/bindata/etcd/pod.yaml#L170-L187
That way, we are not surprised by future changes to the upstream default values.
- blocks: OCPBUGS-16804 [4.13] silence irrelevant "failed to lock file fileutil: file already locked" warnings (Closed)
- is cloned by: OCPBUGS-16804 [4.13] silence irrelevant "failed to lock file fileutil: file already locked" warnings (Closed)
- links to: RHSA-2023:5006 OpenShift Container Platform 4.14.z security update