-
Bug
-
Resolution: Done
-
Normal
-
None
-
4.18.0
-
Important
-
None
-
Rejected
-
False
-
[sig-arch] events should not repeat pathologically for ns/openshift-machine-api
The machine-api pod does not appear to respond to the `/healthz` requests from the kubelet, which causes an increase in probe error events. The pod itself appears to be up, and a preliminary look at Loki shows that the `/healthz` endpoint does come up, but the controller loses leader election in between and then starts the health probe again.
(read from bottom up)
I1016 19:51:31.418815 1 server.go:191] "Starting webhook server" logger="controller-runtime.webhook"
I1016 19:51:31.418764 1 server.go:247] "Serving metrics server" logger="controller-runtime.metrics" bindAddress=":8082" secure=false
I1016 19:51:31.418703 1 server.go:83] "starting server" name="health probe" addr="[::]:9441"
I1016 19:51:31.418650 1 server.go:208] "Starting metrics server" logger="controller-runtime.metrics"
2024/10/16 19:51:31 Starting the Cmd.
...
2024/10/16 19:50:44 leader election lost
I1016 19:50:44.406280 1 leaderelection.go:297] failed to renew lease openshift-machine-api/cluster-api-provider-machineset-leader: timed out waiting for the condition
error E1016 19:50:44.406230 1 leaderelection.go:436] error retrieving resource lock openshift-machine-api/cluster-api-provider-machineset-leader: Get "https://172.30.0.1:443/apis/coordination.k8s.io/v1/namespaces/openshift-machine-api/leases/cluster-api-provider-machineset-leader": context deadline exceeded
error E1016 19:50:37.430054 1 leaderelection.go:429] Failed to update lock optimitically: rpc error: code = DeadlineExceeded desc = context deadline exceeded, falling back to slow path
error E1016 19:50:04.423920 1 leaderelection.go:436] error retrieving resource lock openshift-machine-api/cluster-api-provider-machineset-leader: the server was unable to return a response in the time allotted, but may still be processing the request (get leases.coordination.k8s.io cluster-api-provider-machineset-leader)
error E1016 19:49:04.422237 1 leaderelection.go:429] Failed to update lock optimitically: rpc error: code = DeadlineExceeded desc = context deadline exceeded, falling back to slow path
....
I1016 19:46:21.358989 1 server.go:83] "starting server" name="health probe" addr="[::]:9441"
I1016 19:46:21.358891 1 server.go:247] "Serving metrics server" logger="controller-runtime.metrics" bindAddress=":8082" secure=false
I1016 19:46:21.358682 1 server.go:208] "Starting metrics server" logger="controller-runtime.metrics"
2024/10/16 19:46:21 Starting the Cmd.
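To see how long the health probe was unavailable after each lost lease, the excerpt above can be put in chronological order and the gap measured. The following is only a rough sketch for triage, not part of any OpenShift tooling: the function names (`parse_time`, `probe_gaps`) are hypothetical, the three sample lines are copied from the excerpt, and the year is an assumption taken from the `2024/10/16` lines, because the klog header carries only month and day.

```python
import re
from datetime import datetime

# klog header: severity letter + MMDD, then HH:MM:SS.microseconds.
# An optional leading "error " level label (as in the pasted excerpt) is tolerated.
KLOG = re.compile(r"^(?:error\s+)?[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6}) ")
# Go standard logger prefix: "2024/10/16 19:50:44 message"
STDLOG = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) ")


def parse_time(line, year=2024):
    """Return the timestamp of a log line, or None if it has no known prefix.

    The year is an assumption: klog lines do not record it.
    """
    m = KLOG.match(line)
    if m:
        return datetime.strptime(f"{year}{m.group(1)} {m.group(2)}",
                                 "%Y%m%d %H:%M:%S.%f")
    m = STDLOG.match(line)
    if m:
        return datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
    return None


def probe_gaps(lines):
    """Pair each 'leader election lost' with the next health probe start.

    Returns a list of (lost_at, restarted_at, gap_in_seconds) tuples.
    """
    timed = [(parse_time(l), l) for l in lines]
    events = sorted((e for e in timed if e[0] is not None), key=lambda e: e[0])
    gaps, lost_at = [], None
    for ts, line in events:
        if "leader election lost" in line:
            lost_at = ts
        elif lost_at is not None and 'name="health probe"' in line:
            gaps.append((lost_at, ts, (ts - lost_at).total_seconds()))
            lost_at = None
    return gaps


# Three lines copied from the excerpt, newest first as in the report.
SAMPLE = [
    'I1016 19:51:31.418703 1 server.go:83] "starting server" name="health probe" addr="[::]:9441"',
    "2024/10/16 19:50:44 leader election lost",
    'I1016 19:46:21.358989 1 server.go:83] "starting server" name="health probe" addr="[::]:9441"',
]

if __name__ == "__main__":
    for lost, restarted, seconds in probe_gaps(SAMPLE):
        print(f"lease lost {lost}, probe restarted {restarted}, gap {seconds:.1f}s")
```

On the sample lines this reports one gap of roughly 47 seconds between the lost lease and the next health probe start. The Go standard logger lines have only one-second resolution, so the result is approximate.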
- links to
-
RHEA-2024:6122 OpenShift Container Platform 4.18.z bug fix update