  1. OpenShift Bugs
  2. OCPBUGS-18950

Degraded etcd on agent assisted-installer installation: bootstrap etcd is not removed properly


Details

    • Critical
    • No
    • Sprint 244, Sprint 246
    • 2
    • Rejected
    • False

    Description

      Description of problem:

      The etcd operator is in a degraded state because one of the masters cannot connect.
      The master that fails to connect was previously the bootstrap node and was pivoted to a master as part of the assisted-installer installation.
      
      ~~~ 
      Message: 660321f5c05c23f2, started, etcd-bootstrap, https://192.168.15.11:2380, https://192.168.15.11:2379, false
      66b18d00d45644a1, started, control-plane3, https://192.168.15.13:2380, https://192.168.15.13:2379, false
      a913f5c640e7c99b, started, control-plane2, https://192.168.15.12:2380, https://192.168.15.12:2379, false
      attempt 0
      member={name="etcd-bootstrap", peerURLs=[https://192.168.15.11:2380}, clientURLs=https://192.168.15.11:2379
      member={name="control-plane3", peerURLs=[https://192.168.15.13:2380}, clientURLs=https://192.168.15.13:2379
      member={name="control-plane2", peerURLs=[https://192.168.15.12:2380}, clientURLs=https://192.168.15.12:2379
      target={name="etcd-bootstrap", peerURLs=[https://192.168.15.11:2380}, clientURLs=https://192.168.15.11:2379
      member "https://192.168.15.11:2380" dataDir has been destroyed and must be removed from the cluster
      Exit Code: 1
      Started: Mon, 04 Sep 2023 22:36:48 +0000
      Finished: Mon, 04 Sep 2023 22:36:48 +0000
      Ready: False
      Restart Count: 366
      ~~~
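
The output above lists `etcd-bootstrap` as a started member next to the two control-plane members, and the final error says that member must be removed from the cluster. A minimal Python sketch of spotting such a leftover member in pasted output follows. It assumes each member row has six comma-separated fields (ID, status, name, peer URL, client URL, learner flag), as in the message above. The names `parse_member_list` and `stale_bootstrap_members` are hypothetical helpers for illustration, not part of any OpenShift or etcd tooling.

```python
from typing import NamedTuple


class Member(NamedTuple):
    member_id: str
    status: str
    name: str
    peer_url: str
    client_url: str
    is_learner: bool


def parse_member_list(text: str) -> list[Member]:
    """Parse rows of the form 'ID, status, name, peerURL, clientURL, isLearner'."""
    members = []
    for raw in text.splitlines():
        line = raw.strip()
        # The first row in the report carries a 'Message: ' prefix; drop it.
        if line.startswith("Message:"):
            line = line[len("Message:"):].strip()
        fields = [f.strip() for f in line.split(",")]
        # Skip anything that is not a six-field member row (e.g. 'attempt 0').
        if len(fields) != 6 or fields[5] not in ("true", "false"):
            continue
        members.append(Member(*fields[:5], fields[5] == "true"))
    return members


def stale_bootstrap_members(members, bootstrap_name="etcd-bootstrap"):
    """Return members still registered under the bootstrap name (assumed name)."""
    return [m for m in members if m.name == bootstrap_name]


# Excerpt copied from the message in this report.
REPORT = """\
Message: 660321f5c05c23f2, started, etcd-bootstrap, https://192.168.15.11:2380, https://192.168.15.11:2379, false
66b18d00d45644a1, started, control-plane3, https://192.168.15.13:2380, https://192.168.15.13:2379, false
a913f5c640e7c99b, started, control-plane2, https://192.168.15.12:2380, https://192.168.15.12:2379, false
attempt 0
member "https://192.168.15.11:2380" dataDir has been destroyed and must be removed from the cluster
"""

if __name__ == "__main__":
    for m in stale_bootstrap_members(parse_member_list(REPORT)):
        print(f"stale member: {m.member_id} ({m.name}, {m.peer_url})")
```

For the excerpt in this report, the sketch singles out member `660321f5c05c23f2`, whose peer URL matches the one named in the "dataDir has been destroyed" error. It only identifies the member; the actual removal steps are those of the KB article linked under Additional info.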

      Version-Release number of selected component (if applicable): 

      4.13.6 

      How reproducible:

      N/A. Detected randomly on bare-metal and virtual environments.

      Steps to Reproduce:

      Not clear at this moment. 

      Actual results:

      Etcd is degraded, and etcd on the third master to join the cluster cannot start.

      Expected results:

      Cluster in a healthy state without manual interaction. 

      Additional info:

      KB https://access.redhat.com/solutions/6962106 describes the workaround for this issue, but this appears to be a new occurrence of bug report [OCPBUGS-5988](https://issues.redhat.com/browse/OCPBUGS-5988).

            People

              bfournie@redhat.com Robert Fournier
              rhn-support-arolivei Arthur de Oliveira
              ge liu ge liu