OpenShift Bugs / OCPBUGS-31014

[Azure] Customer upgraded to 4.13.34 from 4.13.30, RBAC issues for system:nodes unable to create serviceaccount/token


Details


    Description

      Description of problem:

          Cluster is unable to start up openshift-image-registry pods and openshift-logging pods, with the following error messages:
      
      {code:none}
      142m        Warning   FailedMount              pod/image-registry-689549b84f-dk885                     MountVolume.SetUp failed for volume "bound-sa-token" : failed to fetch token: serviceaccounts "registry" is forbidden: User "system:node:worker-node-7qh98" cannot create resource "serviceaccounts/token" in API group "" in the namespace "openshift-image-registry": no relationship found between node 'worker-node-7qh98' and this object
      
      142m        Warning   FailedMount              pod/image-registry-689549b84f-dk885                     MountVolume.SetUp failed for volume "kube-api-access-8ksnc" : failed to fetch token: serviceaccounts "registry" is forbidden: User "system:node:worker-node-7qh98" cannot create resource "serviceaccounts/token" in API group "" in the namespace "openshift-image-registry": no relationship found between node 'worker-node-7qh98' and this object
      {code}
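      
      For reference, the failing authorization can be queried directly by impersonating the node identity (a minimal sketch; the node name, service account, and namespace are taken from the events above):
      
      {code:none}
      # Ask the apiserver whether the kubelet's TokenRequest would be allowed.
      # RBAC is only one of the authorizers consulted: the Node authorizer grants
      # serviceaccounts/token to a node only when it can trace a pod on that node
      # to the target service account, which is the "no relationship found" check
      # failing in the events above.
      oc auth can-i create serviceaccounts/registry --subresource=token \
        --as=system:node:worker-node-7qh98 --as-group=system:nodes \
        -n openshift-image-registry
      {code}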

       

      Version-Release number of selected component (if applicable):

          4.13.35

      How reproducible:

          Unknown, first customer to see this issue

      Steps to Reproduce:

      Customer's upgrade path when encountering this bug (a CLI sketch follows the list):
      
          1. Install cluster at 4.12.44
          2. Upgrade to 4.13.30
          3. Upgrade to 4.13.34
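      
      The same path expressed as commands, for completeness (a sketch; the channel name is an assumption, and each upgrade must finish before starting the next):
      
      {code:none}
      # Starting from a cluster installed at 4.12.44
      oc adm upgrade channel stable-4.13
      oc adm upgrade --to=4.13.30
      # wait for the upgrade to complete, then
      oc adm upgrade --to=4.13.34
      {code}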
      
      Actual results:
      
          Image registry pods and logging pods continuously recreate every 5-10 seconds, never becoming healthy
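      
      The churn can be observed live with commands along these lines (illustrative, not from the original report):
      
      {code:none}
      # Pod names change every few seconds as replacements fail their volume mounts
      oc get pods -n openshift-image-registry -w
      # The corresponding FailedMount events
      oc get events -n openshift-image-registry --field-selector reason=FailedMount
      {code}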

      Expected results:

          Image registry and logging pods are able to create and run normally

      Additional info:

          This is the first time SRE is seeing this issue for a customer. Unfortunately, SRE has not identified a workaround. We have:
      
      1. Deleted and recreated the image-registry deployment several times
      2. Updated the node selector for the image-registry pods so they can run on any node; the issue persists on every node in the cluster, even master nodes
      3. Inspected the role bindings of the cluster; system:node has the correct permissions (commands sketched below)
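      
      The inspection from item 3, expressed as commands (a sketch; the grep is illustrative). Note that on clusters using the Node authorizer, the system:node ClusterRoleBinding intentionally has no subjects, so correct RBAC alone does not rule out the per-node authorization graph as the failure point:
      
      {code:none}
      # RBAC rules granted to nodes
      oc get clusterrole system:node -o yaml | grep -A3 "serviceaccounts/token"
      # The binding is expected to have no subjects; kubelets are authorized
      # per-node by the Node authorizer instead
      oc get clusterrolebinding system:node -o yaml
      {code}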

          People

            Assignee: Unassigned
            Reporter: lranjbar@redhat.com (Lisa Rashidi-Ranjbar)