Bootstrap kubelet client cert should include system:serviceaccounts group

Details

    • No
    • CLOUD Sprint 234
    • 1
    • Rejected
    • False
    • Setting blocker+ as this is an install time bug with no workaround.
    • Previously, the bootstrap credentials used to request client credentials for control plane nodes did not include the generic group that covers all service accounts (system:serviceaccounts). As a result, the cluster machine approver ignored certificate signing requests (CSRs) created during this phase. In certain conditions, this prevented approval of CSRs during bootstrap and caused the installation to fail. With this release, the bootstrap credential includes the groups that the cluster machine approver expects for a service account. This change allows the machine approver to take over from the bootstrap CSR approver earlier in the cluster lifecycle and should reduce bootstrap failures related to CSR approval. (link:https://issues.redhat.com/browse/OCPBUGS-8349[*OCPBUGS-8349*])
    • Bug Fix
    • Done
    • Customer Escalated
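
The group check described in the release note above can be sketched roughly as follows. This is an illustration only, not the machine approver's actual code: the function names are hypothetical, and the exact set of expected groups (in particular the namespace-scoped group for openshift-machine-config-operator) is an assumption. Only system:serviceaccounts is named in this issue.

```python
# Hypothetical sketch: which groups is a CSR requester missing, compared to
# what an approver might expect for a service account?
# ASSUMPTION: the expected set below is illustrative. Only
# "system:serviceaccounts" is confirmed by this issue's title.
EXPECTED_GROUPS = frozenset({
    "system:serviceaccounts",
    "system:serviceaccounts:openshift-machine-config-operator",  # assumed namespace
    "system:authenticated",
})


def missing_groups(csr_groups, expected=EXPECTED_GROUPS):
    """Return the expected groups that are absent from csr_groups, sorted."""
    return sorted(set(expected) - set(csr_groups))


def would_be_considered(csr_groups, expected=EXPECTED_GROUPS):
    """True if the requester carries every expected group."""
    return not missing_groups(csr_groups, expected)


if __name__ == "__main__":
    # A requester without the generic service accounts group, as described
    # for the pre-fix bootstrap credential.
    before_fix = [
        "system:serviceaccounts:openshift-machine-config-operator",
        "system:authenticated",
    ]
    print(missing_groups(before_fix))
    print(would_be_considered(before_fix))
```

With a requester group list like `before_fix`, the sketch reports the generic group as missing, which mirrors the situation in which the CSR was ignored rather than approved.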

    Description

      Description of problem:

      On a freshly installed cluster, the control-plane-machineset-operator begins rolling a new master node, but the machine remains in a Provisioned state and never joins as a node.
      
      Its status is:
      Drain operation currently blocked by: [{Name:EtcdQuorumOperator Owner:clusteroperator/etcd}]
      
      The cluster is left in this state until an admin manually removes the stuck master node, at which point a new master machine is provisioned and successfully joins the cluster.
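
A minimal sketch of how the stuck state described above might be spotted from Machine objects that have already been exported as dictionaries. The function name, the 30-minute threshold, the machine names, and the use of creationTimestamp as a stand-in for time spent in the phase are all assumptions made for illustration; nothing here comes from the reporter's tooling.

```python
from datetime import datetime, timedelta, timezone


def stuck_provisioned(machines, now, threshold=timedelta(minutes=30)):
    """Return names of machines whose phase is 'Provisioned' and whose
    creation time is older than `threshold` (an assumed proxy for being stuck)."""
    stuck = []
    for machine in machines:
        phase = machine.get("status", {}).get("phase")
        # Kubernetes timestamps end in "Z"; convert to an explicit UTC offset
        # so datetime.fromisoformat can parse them on older Python versions.
        created = datetime.fromisoformat(
            machine["metadata"]["creationTimestamp"].replace("Z", "+00:00")
        )
        if phase == "Provisioned" and now - created > threshold:
            stuck.append(machine["metadata"]["name"])
    return stuck


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the affected cluster.
    now = datetime(2023, 3, 1, 12, 0, tzinfo=timezone.utc)
    machines = [
        {"metadata": {"name": "master-a", "creationTimestamp": "2023-03-01T10:00:00Z"},
         "status": {"phase": "Provisioned"}},
        {"metadata": {"name": "master-b", "creationTimestamp": "2023-03-01T11:50:00Z"},
         "status": {"phase": "Provisioned"}},
    ]
    print(stuck_provisioned(machines, now))
```

A machine flagged this way would still need its pending CSRs inspected to confirm that approval is what is blocking it.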

      Version-Release number of selected component (if applicable):

      4.12.4

      How reproducible:

      Observed at least 4 times over the last week, but it is unclear how to reproduce it.

      Actual results:

      A master node remains in a stuck Provisioned state and requires manual deletion to unstick the control plane machine set process.

      Expected results:

      No manual interaction should be necessary.

      Additional info:

       

            People

              Joel Speed (joelspeed)
              Matt Bargenquast (mbargenq, Inactive)
              Zhaohua Sun
              Jeana Routh