OpenShift Bugs / OCPBUGS-43796

Update stuck from 4.15.36 to 4.16.16 on a compact cluster with FIPS enabled.


    • Bug
    • Resolution: Unresolved
    • 4.16.z
    • apiserver-auth
    • Critical

      Description of problem:

          Upgrade on a three-node compact cluster (with FIPS enabled) is stuck.
      
      $ oc get clusterversion
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.15.36   True        True          16h     Unable to apply 4.16.16: some cluster operators are not available
          - lastTransitionTime: "2024-10-22T14:48:43Z"
            message: Cluster operators authentication, openshift-apiserver are not available
            reason: ClusterOperatorsNotAvailable
            status: "True"
            type: Failing
          - lastTransitionTime: "2024-10-22T14:19:07Z"
            message: 'Unable to apply 4.16.16: some cluster operators are not available'
            reason: ClusterOperatorsNotAvailable
            status: "True"
            type: Progressing
      
      $ oc get co authentication openshift-apiserver
      NAME                  VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
      authentication        4.16.16   False       False         False      16h
      openshift-apiserver   4.16.16   False       False         False      16h
      
      $ oc get co |grep "4.15"
      dns                                        4.15.36   True        False         False      21h
      machine-config                             4.15.36   True        False         False      273d
      network                                    4.15.36   True        False         False      1y
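
      The two listings above can be combined into a single filter. The following is a minimal sketch, not part of the original report: `co_not_ready` is a hypothetical helper name, and it assumes the default `oc get co` column order (NAME, VERSION, AVAILABLE, ...) with a non-empty VERSION column.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report).
# Reads `oc get co` table output on stdin and prints one line per operator
# that is either not Available or not yet at the target version.
# Assumes columns: NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
co_not_ready() {
  target="$1"
  awk -v target="$target" '
    NR == 1       { next }                              # skip the header row
    $3 != "True"  { print $1 ": not available"; next }  # AVAILABLE column
    $2 != target  { print $1 ": still at " $2 }         # VERSION column
  '
}
```

      Usage, assuming a logged-in `oc` session: `oc get co | co_not_ready 4.16.16`. On the cluster described here this would be expected to list authentication and openshift-apiserver as not available, and dns, machine-config and network as still at 4.15.36.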
      
      The apiserver operator reports the following conditions:
        conditions:
        - lastTransitionTime: "2023-02-10T18:17:54Z"
          message: All is well
          reason: AsExpected
          status: "False"
          type: Degraded
        - lastTransitionTime: "2024-10-22T18:16:17Z"
          message: All is well
          reason: AsExpected
          status: "False"
          type: Progressing
        - lastTransitionTime: "2024-10-22T14:47:44Z"
          message: 'APIServicesAvailable: PreconditionNotReady'
          reason: APIServices_PreconditionNotReady
          status: "False"
          type: Available
        - lastTransitionTime: "2023-02-10T18:15:23Z"
          message: All is well
          reason: AsExpected
          status: "True"
          type: Upgradeable
        - lastTransitionTime: "2024-10-22T14:47:41Z"
          reason: NoData
          status: Unknown
          type: EvaluationConditionsDetected
      
      
      kube-apiserver, openshift-apiserver and authentication are reporting the following errors:
      2024-10-23T09:39:14.208529772+02:00 W1023 07:39:14.208441       1 logging.go:59] [core] [Channel #50082 SubChannel #50085] grpc: addrConn.createTransport failed to connect to {Addr: "172.31.36.3:2379", ServerName: "172.31.36.3:2379", }. Err: connection error: desc = "error reading server preface: read tcp 10.149.0.155:39190->172.31.36.3:2379: use of closed network connection"
      2024-10-23T09:42:17.705910825+02:00 W1023 07:42:17.705817       1 logging.go:59] [core] [Channel #50251 SubChannel #50252] grpc: addrConn.createTransport failed to connect to {Addr: "172.31.36.1:2379", ServerName: "172.31.36.1:2379", }. Err: connection error: desc = "transport: authentication handshake failed: context canceled"
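
      To see whether the failures are concentrated on one etcd member, warnings like the ones above can be tallied per endpoint. This is a rough sketch under stated assumptions: `etcd_connect_failures` is a hypothetical helper name, and it only matches log lines in the `failed to connect to {Addr: "..."` format shown above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report).
# Reads apiserver logs on stdin and prints "<endpoint> <count>" for every
# etcd address found in gRPC "failed to connect" warnings, most frequent first.
etcd_connect_failures() {
  grep -o 'failed to connect to {Addr: "[^"]*"' |   # keep only the address part
    sed 's/.*Addr: "\([^"]*\)"/\1/' |               # strip everything but host:port
    sort | uniq -c | sort -rn |                     # count occurrences per endpoint
    awk '{ print $2, $1 }'                          # normalize to "endpoint count"
}
```

      Usage, with `<pod>` as a placeholder for an actual pod name: `oc logs -n openshift-apiserver <pod> | etcd_connect_failures`. A roughly even spread across all three members would point away from a single faulty etcd node.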
      
      However, connectivity tests from those pods to the etcd endpoints using the mounted certificate succeed.
      
      etcd performance was also checked; it is not the best, but it does not appear to be related.

      Version-Release number of selected component (if applicable):

          4.16.16

      Expected results:

          openshift-apiserver and openshift-authentication become available and the update progresses.

              Unassigned
              rhn-support-lperezbe Luis Perez Besa
              Xingxing Xia