In a CI run of etcd-operator-e2e, I found the following panic in the operator logs:
E0125 11:04:58.158222 1 health.go:135] health check for member (ip-10-0-85-12.us-west-2.compute.internal) failed: err(context deadline exceeded)
panic: send on closed channel

goroutine 15608 [running]:
github.com/openshift/cluster-etcd-operator/pkg/etcdcli.getMemberHealth.func1()
	github.com/openshift/cluster-etcd-operator/pkg/etcdcli/health.go:58 +0xd2
created by github.com/openshift/cluster-etcd-operator/pkg/etcdcli.getMemberHealth
	github.com/openshift/cluster-etcd-operator/pkg/etcdcli/health.go:54 +0x2a5
Unfortunately the log file is incomplete. The operator recovered by restarting, but we should fix the panic nonetheless.
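The truncated log doesn't show the full picture, but the trace pattern is telling: a goroutine created at health.go:54 panics while sending at health.go:58, i.e. a per-member health probe is still trying to deliver its result after the receiving side has already closed the result channel (presumably when the context deadline expired). Below is a minimal, self-contained sketch of that suspected race; all names are hypothetical and simplified, not the actual operator code:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// healthCheck is a stand-in for the operator's per-member health result.
type healthCheck struct {
	member  string
	healthy bool
}

func getMemberHealth(ctx context.Context, members []string) []healthCheck {
	ch := make(chan healthCheck)
	for _, m := range members {
		go func(member string) {
			time.Sleep(2 * time.Second)                       // simulated slow probe (etcd client call)
			ch <- healthCheck{member: member, healthy: false} // panics if ch was closed meanwhile
		}(m)
	}

	var results []healthCheck
	for range members {
		select {
		case hc := <-ch:
			results = append(results, hc)
		case <-ctx.Done():
			// BUG: the receiver closes the channel while probes are in flight,
			// so a late sender hits "send on closed channel".
			close(ch)
			return results
		}
	}
	return results
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	fmt.Println(getMemberHealth(ctx, []string{"member-a", "member-b"}))
	time.Sleep(3 * time.Second) // keep the process alive so the late send fires
}
```

Running this reliably panics once the probes outlive the one-second deadline.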
Job run for reference:
https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_cluster-etcd-operator/1186/pull-ci-openshift-cluster-etcd-operator-master-e2e-operator/1750466468031500288
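If that diagnosis is right, one possible shape of a fix (a sketch under the same assumptions as above, not the actual patch; add "sync" to the imports) is to buffer the channel for every sender and close it only from the sender side once all probes have reported:

```go
func getMemberHealthSafe(ctx context.Context, members []string) []healthCheck {
	// Buffered to len(members): every probe can complete its single send
	// even after the receiver has given up, so nothing blocks or panics.
	ch := make(chan healthCheck, len(members))
	var wg sync.WaitGroup
	for _, m := range members {
		wg.Add(1)
		go func(member string) {
			defer wg.Done()
			time.Sleep(2 * time.Second) // simulated slow probe
			ch <- healthCheck{member: member, healthy: false}
		}(m)
	}
	// Only the sender side closes, and only after all sends have happened.
	go func() {
		wg.Wait()
		close(ch)
	}()

	var results []healthCheck
	for {
		select {
		case hc, ok := <-ch:
			if !ok {
				return results // all probes reported
			}
			results = append(results, hc)
		case <-ctx.Done():
			return results // abandon stragglers; their buffered sends still succeed
		}
	}
}
```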
- blocks: OCPBUGS-28628 [4.15] Panic: send on closed channel (Closed)
- is cloned by: OCPBUGS-28628 [4.15] Panic: send on closed channel (Closed)
- is related to: OCPBUGS-36301 [4.17] Should run health checks in parallel to avoid spurious Available=False EtcdMembers_NoQuorum claims (Closed)
- links to: RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update