-
Bug
-
Resolution: Not a Bug
-
Undefined
-
None
-
4.16.z
-
None
-
Quality / Stability / Reliability
-
False
-
-
None
-
Critical
Description of problem:
The upgrade from 4.15.36 to 4.16.16 on a three-node compact cluster (with FIPS enabled) is stuck.
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.15.36   True        True          16h     Unable to apply 4.16.16: some cluster operators are not available
- lastTransitionTime: "2024-10-22T14:48:43Z"
  message: Cluster operators authentication, openshift-apiserver are not available
  reason: ClusterOperatorsNotAvailable
  status: "True"
  type: Failing
- lastTransitionTime: "2024-10-22T14:19:07Z"
  message: 'Unable to apply 4.16.16: some cluster operators are not available'
  reason: ClusterOperatorsNotAvailable
  status: "True"
  type: Progressing
$ oc get co authentication openshift-apiserver
NAME                  VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication        4.16.16   False       False         False      16h
openshift-apiserver   4.16.16   False       False         False      16h
$ oc get co |grep "4.15"
dns              4.15.36   True   False   False   21h
machine-config   4.15.36   True   False   False   273d
network          4.15.36   True   False   False   1y
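As a rough aid for repeating this check, the filter below lists the operators whose AVAILABLE column is False. It is only a sketch: it assumes the default `oc get co` column layout shown above (AVAILABLE as the third field), and `unavailable_operators` is a hypothetical helper name, not part of `oc`.

```shell
# Hypothetical helper: read `oc get co` output on stdin and print the names
# of operators whose third column (AVAILABLE, assuming the default layout)
# is exactly "False". The header row is skipped implicitly because its
# third field is the literal string "AVAILABLE".
unavailable_operators() {
  awk '$3 == "False" { print $1 }'
}

# Usage against a live cluster (assumes `oc` is already logged in):
#   oc get co | unavailable_operators
```

On the output above this would print the two operators that block the upgrade, authentication and openshift-apiserver.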
The apiserver operator reports the following conditions:
conditions:
- lastTransitionTime: "2023-02-10T18:17:54Z"
  message: All is well
  reason: AsExpected
  status: "False"
  type: Degraded
- lastTransitionTime: "2024-10-22T18:16:17Z"
  message: All is well
  reason: AsExpected
  status: "False"
  type: Progressing
- lastTransitionTime: "2024-10-22T14:47:44Z"
  message: 'APIServicesAvailable: PreconditionNotReady'
  reason: APIServices_PreconditionNotReady
  status: "False"
  type: Available
- lastTransitionTime: "2023-02-10T18:15:23Z"
  message: All is well
  reason: AsExpected
  status: "True"
  type: Upgradeable
- lastTransitionTime: "2024-10-22T14:47:41Z"
  reason: NoData
  status: Unknown
  type: EvaluationConditionsDetected
kube-apiserver, openshift-apiserver and authentication are reporting the following errors:
2024-10-23T09:39:14.208529772+02:00 W1023 07:39:14.208441 1 logging.go:59] [core] [Channel #50082 SubChannel #50085] grpc: addrConn.createTransport failed to connect to {Addr: "172.31.36.3:2379", ServerName: "172.31.36.3:2379", }. Err: connection error: desc = "error reading server preface: read tcp 10.149.0.155:39190->172.31.36.3:2379: use of closed network connection"
2024-10-23T09:42:17.705910825+02:00 W1023 07:42:17.705817 1 logging.go:59] [core] [Channel #50251 SubChannel #50252] grpc: addrConn.createTransport failed to connect to {Addr: "172.31.36.1:2379", ServerName: "172.31.36.1:2379", }. Err: connection error: desc = "transport: authentication handshake failed: context canceled"
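To see whether the failures cluster on a particular etcd member, the warnings can be tallied per endpoint. The function below is a minimal sketch that assumes the log lines carry `Addr: "<ip>:2379"` exactly as in the excerpts above; `etcd_failures_by_endpoint` is a hypothetical helper name.

```shell
# Hypothetical helper: read pod logs on stdin and print "<count> <endpoint>"
# for each etcd address named in a grpc createTransport failure,
# most frequent first.
etcd_failures_by_endpoint() {
  grep 'addrConn.createTransport failed to connect' \
    | sed -n 's/.*Addr: "\([^"]*\)".*/\1/p' \
    | sort | uniq -c | sort -rn \
    | awk '{ print $1, $2 }'   # normalize the padding added by uniq -c
}

# Usage (namespace and pod name are placeholders to fill in):
#   oc logs -n <namespace> <pod> | etcd_failures_by_endpoint
```

An even spread across all three members would point away from a single faulty etcd node, while a skew toward one address would suggest looking at that member first.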
However, testing connectivity from those pods to the etcd endpoints with the mounted certificate works.
We also checked etcd performance, which is not ideal but does not appear to be related.
Version-Release number of selected component (if applicable):
4.16.16
Expected results:
openshift-apiserver and openshift-authentication become available and the update progresses.