OpenShift Bugs / OCPBUGS-10512

ConsoleNotificationSyncDegraded: Delete "https://172.30.0.1:443/apis/console.openshift.io/v1/consolenotifications/cluster-upgrade": net/http: TLS handshake timeout

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • 4.12.0
    • Management Console
    • None
    • No
    • Proposed
    • False
    Description

      Cluster version: 4.12.0

      After running a longevity test on a ZTP SNO cluster under moderate load for 30 days, we saw many containers on the cluster restart at random, and the Kubernetes API was temporarily unresponsive.
      See https://issues.redhat.com/browse/OCPBUGS-10510 for more details.

      At some point the node was rebooted in an attempt to recover the cluster.

      After the reboot, the console ClusterOperator (CO) does not come up.

      Running describe on the console CO showed:

      Name:         console
      Namespace:    
      Labels:       <none>
      Annotations:  capability.openshift.io/name: Console
                    include.release.openshift.io/ibm-cloud-managed: true
                    include.release.openshift.io/self-managed-high-availability: true
                    include.release.openshift.io/single-node-developer: true
      API Version:  config.openshift.io/v1
      Kind:         ClusterOperator
      Metadata:
        Creation Timestamp:  2023-02-14T23:44:31Z
        Generation:          1
        Managed Fields:
          API Version:  config.openshift.io/v1
          Fields Type:  FieldsV1
          fieldsV1:
            f:metadata:
              f:annotations:
                .:
                f:capability.openshift.io/name:
                f:include.release.openshift.io/ibm-cloud-managed:
                f:include.release.openshift.io/self-managed-high-availability:
                f:include.release.openshift.io/single-node-developer:
              f:ownerReferences:
                .:
                k:{"uid":"0297348c-5756-4997-bfa9-ea68024b6351"}:
            f:spec:
          Manager:      cluster-version-operator
          Operation:    Update
          Time:         2023-02-14T23:44:31Z
          API Version:  config.openshift.io/v1
          Fields Type:  FieldsV1
          fieldsV1:
            f:status:
              .:
              f:extension:
          Manager:      cluster-version-operator
          Operation:    Update
          Subresource:  status
          Time:         2023-02-14T23:44:31Z
          API Version:  config.openshift.io/v1
          Fields Type:  FieldsV1
          fieldsV1:
            f:status:
              f:conditions:
              f:relatedObjects:
              f:versions:
          Manager:      console
          Operation:    Update
          Subresource:  status
          Time:         2023-03-17T21:51:26Z
        Owner References:
          API Version:     config.openshift.io/v1
          Kind:            ClusterVersion
          Name:            version
          UID:             0297348c-5756-4997-bfa9-ea68024b6351
        Resource Version:  12940156
        UID:               91f86953-433d-4d98-a0c5-17fc7fe40522
      Spec:
      Status:
        Conditions:
          Last Transition Time:  2023-03-17T20:33:40Z
          Message:               ConsoleNotificationSyncDegraded: Delete "https://172.30.0.1:443/apis/console.openshift.io/v1/consolenotifications/cluster-upgrade": net/http: TLS handshake timeout
      RouteHealthDegraded: console route is not admitted
          Reason:                ConsoleNotificationSync_FailedDelete::RouteHealth_RouteNotAdmitted
          Status:                True
          Type:                  Degraded
          Last Transition Time:  2023-03-17T21:26:05Z
          Message:               All is well
          Reason:                AsExpected
          Status:                False
          Type:                  Progressing
          Last Transition Time:  2023-03-17T21:51:26Z
          Message:               RouteHealthAvailable: console route is not admitted
          Reason:                RouteHealth_RouteNotAdmitted
          Status:                False
          Type:                  Available
          Last Transition Time:  2023-02-15T00:15:58Z
          Message:               All is well
          Reason:                AsExpected
          Status:                True
          Type:                  Upgradeable
        Extension:               <nil>
        Related Objects:
          Group:      operator.openshift.io
          Name:       cluster
          Resource:   consoles
          Group:      config.openshift.io
          Name:       cluster
          Resource:   consoles
          Group:      config.openshift.io
          Name:       cluster
          Resource:   infrastructures
          Group:      config.openshift.io
          Name:       cluster
          Resource:   proxies
          Group:      config.openshift.io
          Name:       cluster
          Resource:   oauths
          Group:      oauth.openshift.io
          Name:       console
          Resource:   oauthclients
          Group:      
          Name:       openshift-console-operator
          Resource:   namespaces
          Group:      
          Name:       openshift-console
          Resource:   namespaces
          Group:      
          Name:       console-public
          Namespace:  openshift-config-managed
          Resource:   configmaps
        Versions:
          Name:     operator
          Version:  4.12.0
      Events:       <none>
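
The Degraded condition above reports two failures at once: its Reason joins them with "::" and its Message carries one line per failing controller. The following is a minimal sketch for splitting such a Reason into its parts. The "::" separator and the "<Controller>_<Cause>" naming are inferred only from the output above, and `split_reason` is a hypothetical helper, not part of any OpenShift tooling.

```python
def split_reason(reason: str) -> list[tuple[str, str]]:
    """Split a combined ClusterOperator Reason into (controller, cause) pairs.

    Assumes entries are joined by "::" and that each entry has the form
    "<Controller>_<Cause>", as seen in the describe output above.
    """
    pairs = []
    for entry in reason.split("::"):
        # partition() splits on the first "_" only, so causes that contain
        # underscores are kept intact.
        controller, _, cause = entry.partition("_")
        pairs.append((controller, cause))
    return pairs


if __name__ == "__main__":
    combined = "ConsoleNotificationSync_FailedDelete::RouteHealth_RouteNotAdmitted"
    for controller, cause in split_reason(combined):
        print(f"{controller}: {cause}")
```

Read this way, the operator is reporting two separate problems: a failed delete of the cluster-upgrade ConsoleNotification (the TLS handshake timeout against the API) and a console route that has not been admitted.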
      

      The console pod logs showed the following:

      oc logs -n openshift-console                                  console-67f8b7674f-hxh8r 
      W0317 21:41:21.763514       1 main.go:227] Flag inactivity-timeout is set to less then 300 seconds and will be ignored!
      I0317 21:41:21.763558       1 main.go:346] cookies are secure!
      E0317 21:41:22.317033       1 auth.go:232] error contacting auth provider (retrying in 10s): Get "https://kubernetes.default.svc/.well-known/oauth-authorization-server": dial tcp: lookup kubernetes.default.svc on 172.30.0.10:53: read udp 10.128.0.193:59214->172.30.0.10:53: read: connection refused
      E0317 21:41:37.319330       1 auth.go:232] error contacting auth provider (retrying in 10s): Get "https://kubernetes.default.svc/.well-known/oauth-authorization-server": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
      E0317 21:41:48.013176       1 auth.go:232] error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.qe2.kni.lab.eng.bos.redhat.com/oauth/token failed: Head "https://oauth-openshift.apps.qe2.kni.lab.eng.bos.redhat.com": dial tcp 10.19.134.5:443: connect: connection refused
      E0317 21:41:58.158333       1 auth.go:232] error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.qe2.kni.lab.eng.bos.redhat.com/oauth/token failed: Head "https://oauth-openshift.apps.qe2.kni.lab.eng.bos.redhat.com": dial tcp 10.19.134.5:443: connect: connection refused
      E0317 21:42:10.079348       1 auth.go:232] error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.qe2.kni.lab.eng.bos.redhat.com/oauth/token failed: Head "https://oauth-openshift.apps.qe2.kni.lab.eng.bos.redhat.com": dial tcp 10.19.134.5:443: connect: connection refused
      E0317 21:42:21.232435       1 auth.go:232] error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.qe2.kni.lab.eng.bos.redhat.com/oauth/token failed: Head "https://oauth-openshift.apps.qe2.kni.lab.eng.bos.redhat.com": dial tcp 10.19.134.5:443: connect: connection refused
      E0317 21:42:32.731721       1 auth.go:232] error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.qe2.kni.lab.eng.bos.redhat.com/oauth/token failed: Head "https://oauth-openshift.apps.qe2.kni.lab.eng.bos.redhat.com": dial tcp 10.19.134.5:443: connect: connection refused
      I0317 21:42:43.018052       1 main.go:796] Binding to [::]:8443...
      I0317 21:42:43.018152       1 main.go:798] using TLS
      2023/03/17 21:49:28 Failed to dial backend: 'dial tcp 172.30.0.1:443: connect: connection refused'
      2023/03/17 21:49:28 http: proxy error: dial tcp 172.30.0.1:443: connect: connection refused
      2023/03/17 21:49:28 Failed to dial backend: 'dial tcp 172.30.0.1:443: connect: connection refused'
      2023/03/17 21:49:31 Failed to dial backend: 'dial tcp 172.30.0.1:443: connect: connection refused'
      2023/03/17 21:49:31 Failed to dial backend: 'dial tcp 172.30.0.1:443: connect: connection refused'
      2023/03/17 21:49:33 Failed to dial backend: 'dial tcp 172.30.0.1:443: connect: connection refused'
      2023/03/17 21:49:33 Failed to dial backend: 'dial tcp 172.30.0.1:443: connect: connection refused'
      2023/03/17 21:49:37 Failed to dial backend: 'dial tcp 172.30.0.1:443: connect: connection refused'
      2023/03/17 21:49:38 Failed to dial backend: 'dial tcp 172.30.0.1:443: connect: connection refused'
      2023/03/17 21:49:49 http: TLS handshake error from 10.128.0.2:36436: EOF
      2023/03/17 21:49:49 http: TLS handshake error from 10.128.0.2:36442: read tcp 10.128.0.193:8443->10.128.0.2:36442: read: connection reset by peer
      2023/03/17 21:50:11 http: proxy error: context canceled
      2023/03/17 21:50:11 http: proxy error: context canceled
      2023/03/17 21:50:11 http: proxy error: context canceled
      2023/03/17 21:50:11 http: proxy error: context canceled
      2023/03/17 21:50:32 http: proxy error: context canceled
      2023/03/17 21:50:42 http: proxy error: context canceled
      2023/03/17 21:50:42 http: proxy error: context canceled
      2023/03/17 21:50:42 http: proxy error: context canceled
      2023/03/17 21:51:19 http: TLS handshake error from 10.128.0.2:46060: EOF
      2023/03/17 21:51:19 http: TLS handshake error from 10.128.0.2:46074: EOF
      2023/03/17 21:51:59 http: TLS handshake error from 10.128.0.2:35182: EOF
      2023/03/17 21:52:00 http: TLS handshake error from 10.128.0.2:35194: EOF
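
The log above moves through several distinct failure modes (DNS lookup refused, request timeout, OAuth route refused, API service refused, TLS handshake errors). As a rough triage aid, the sketch below counts lines per failure mode using plain substring matching. The category names and the matching rules are assumptions chosen for this log only. The addresses 172.30.0.1:443 and :53 are taken from the lines above and would differ on another cluster.

```python
from collections import Counter

# (label, substrings that must all be present). Order matters: the first
# matching rule wins, so the DNS rule is checked before the generic
# "connection refused" rules.
RULES = [
    ("dns_lookup_failed", ("lookup ", ":53")),
    ("oauth_route_refused", ("oauth-openshift", "connection refused")),
    ("apiserver_refused", ("172.30.0.1:443", "connection refused")),
    ("request_timeout", ("context deadline exceeded",)),
    ("tls_handshake_error", ("TLS handshake error",)),
    ("proxy_canceled", ("proxy error: context canceled",)),
]


def classify(line: str) -> str:
    """Return the label of the first rule whose substrings all occur in line."""
    for label, needles in RULES:
        if all(needle in line for needle in needles):
            return label
    return "other"


def summarize(lines) -> Counter:
    """Count non-empty log lines per failure category."""
    return Counter(classify(line) for line in lines if line.strip())


if __name__ == "__main__":
    import sys

    # Example: oc logs -n openshift-console <pod> | python triage.py
    for label, count in summarize(sys.stdin).most_common():
        print(f"{count:5d}  {label}")
```

Grouping the lines this way makes the sequence easier to see: the pod first cannot resolve or reach the API, then cannot reach the OAuth route, and only after it binds to port 8443 do the refused connections to 172.30.0.1:443 and the inbound TLS handshake errors begin.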
      

          People

            jhadvig@redhat.com Jakub Hadvig
            achuzhoy@redhat.com Alexander Chuzhoy
            YaDan Pei