OpenShift Bugs / OCPBUGS-62064

kube-apiserver Failing Due to Orphaned etcd Entries


    • Bug
    • Resolution: Won't Do
    • Major
    • 4.16.z
    • kube-apiserver
    • Quality / Stability / Reliability
    • False

      Description of problem:

      A customer reported that after configuring AAD SSO for the OpenShift console, users were redirected back to the login page in a loop. Logging in via the CLI with a token failed in a similar way, with "error: token provided is invalid or expired". The issue was caused by orphaned entries in the etcd database that remained after the customer deleted resources such as namespaces and configmaps. These orphaned entries prevented the kube-apiserver from starting properly.
      
      What would cause orphaned entries in the etcd database, and is this expected behavior?

      kube-apiserver ClusterOperator:

      NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE    MESSAGE
      kube-apiserver   4.16.37   True        True          False      2y143d   NodeInstallerProgressing: 3 nodes are at revision 642; 0 nodes have achieved new revision 648

      Logs from this kube-apiserver pod show:

      oc logs -n openshift-kube-apiserver kube-apiserver-ip-10-128-40-151 -c kube-apiserver
      
      informer-sync check failed: readyz [-]informer-sync failed: 2 informers not started yet: [*v1.Secret *v1.ConfigMap]
      
      "failed to decrypt data" err="no matching key was found for the provided AES transformer"
      
      failed to list *core.ConfigMap: unable to transform key "/kubernetes.io/configmaps/..."
      
      failed to list *core.Secret: unable to transform key "/kubernetes.io/secrets/..."

      apiserver was unable to write a JSON response: http: Handler timeout
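      The "no matching key" message in the logs above can be read as follows: values encrypted at rest are stored in etcd with a prefix of the form k8s:enc:<provider>:v1:<key name>:, and the read fails when the key name in that prefix is not among the keys the apiserver currently has configured. The sketch below only illustrates that check; it is not the apiserver's code, and the function names and sample key names ("1", "2", "3") are hypothetical.

```python
from typing import NamedTuple, Optional, Set

ENC_PREFIX = b"k8s:enc:"


class Envelope(NamedTuple):
    provider: str  # e.g. "aescbc" or "aesgcm"
    version: str   # e.g. "v1"
    key_name: str  # name of the key that encrypted the value


def parse_envelope(raw: bytes) -> Optional[Envelope]:
    """Return the encryption prefix of a stored value, or None if it has none."""
    if not raw.startswith(ENC_PREFIX):
        return None
    # Split at most 5 times so colons inside the ciphertext are left alone:
    # [b"k8s", b"enc", provider, version, key_name, ciphertext]
    parts = raw.split(b":", 5)
    if len(parts) < 6:
        return None
    provider, version, key_name = (p.decode("ascii", "replace") for p in parts[2:5])
    return Envelope(provider, version, key_name)


def is_decryptable(raw: bytes, configured_keys: Set[str]) -> bool:
    """True if the value needs no key, or its key name is still configured."""
    envelope = parse_envelope(raw)
    if envelope is None:
        return True  # no encryption prefix, so no key is needed
    return envelope.key_name in configured_keys


# Hypothetical stored value written with key "3".
stored = b"k8s:enc:aescbc:v1:3:" + b"\x00\x01ciphertext:with:colons"
print(parse_envelope(stored))
print(is_decryptable(stored, {"1", "2"}))  # key "3" is no longer configured
```

      Under this reading, a single entry written with a key that has since been removed is enough to make a full list of that resource type fail, which matches the informers for *v1.Secret and *v1.ConfigMap never starting.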

       

       

      Version-Release number of selected component (if applicable):

          Seen on 4.16.37

      How reproducible:

          Not reproduced

      Steps to Reproduce:

        1. Encrypt etcd during cluster deployment.
        2. Delete a resource; its entry is not removed from etcd.

      Actual results:

      The resource is deleted, but the corresponding entry is not removed from etcd.
      The kube-apiserver pods are no longer functional due to "no matching key was found for the provided AES transformer".

      Expected results:

      Resource is deleted and the corresponding entry in etcd is removed
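      For triage, "orphaned" can be made concrete as: a configmap or secret key in etcd whose namespace no longer exists. The sketch below assumes a list of etcd keys and a set of live namespace names have already been collected by some other means. The key layout /kubernetes.io/<resource>/<namespace>/<name> follows the paths in the logs; the function name and sample data are hypothetical.

```python
from typing import Iterable, List, Set

NAMESPACED_RESOURCES = ("configmaps", "secrets")


def find_orphaned_keys(
    etcd_keys: Iterable[str],
    live_namespaces: Set[str],
    prefix: str = "/kubernetes.io/",
) -> List[str]:
    """Return configmap/secret keys whose namespace is not in live_namespaces."""
    orphaned = []
    for key in etcd_keys:
        if not key.startswith(prefix):
            continue
        parts = key[len(prefix):].split("/")
        # Expect exactly <resource>/<namespace>/<name>.
        if len(parts) != 3 or parts[0] not in NAMESPACED_RESOURCES:
            continue
        _resource, namespace, _name = parts
        if namespace not in live_namespaces:
            orphaned.append(key)
    return orphaned


# Hypothetical sample data.
keys = [
    "/kubernetes.io/configmaps/team-a/app-config",
    "/kubernetes.io/secrets/deleted-ns/db-password",
    "/kubernetes.io/namespaces/team-a",
]
print(find_orphaned_keys(keys, {"team-a", "default"}))
```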

      Additional info:

      Similar OCP BUG: https://issues.redhat.com/browse/OCPBUGS-38598

              Unassigned
              reedcort Cortney Reed
              Ge Liu