  1. OpenShift Bugs
  2. OCPBUGS-32789

The CSI driver registrar container goes to error state with error `401 Unauthorized, restarting registration container`


      Description of problem:

      The CSI driver registrar container goes to error state with error `401 Unauthorized, restarting registration container`

      Version-Release number of selected component (if applicable):

      OCP 4.14.14 & OCP 4.14.18 BareMetal IPI

      How reproducible:

      100% in customer environment   

      Steps to Reproduce:

      1. Provision an OCP 4.14.15 or 4.14.18 cluster via BareMetal IPI
      
      2. Install the Dell CSI drivers by following this document:
      
      https://dell.github.io/csm-docs/docs/deployment/helm/drivers/installation/powerflex/
      
      3. Update the ingress certificate on the cluster as per the below document
      
      https://docs.openshift.com/container-platform/4.14/security/certificates/replacing-default-ingress-certificate.html#replacing-default-ingress_replacing-default-ingress
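Step 3 can be sketched as a small shell function built from the `oc` commands in the OpenShift documentation linked above. This is only an illustration: the function name `replace_ingress_cert`, its arguments, and the `custom-ca` config map name are assumptions and not details taken from the customer environment.

```shell
#!/usr/bin/env bash
# Sketch of step 3: replace the default ingress certificate.
# Hypothetical helper; all arguments are placeholders for your own files and names.
#   $1 = CA bundle file, $2 = certificate file, $3 = private key file,
#   $4 = name for the new TLS secret
replace_ingress_cert() {
  local ca_bundle="$1" cert="$2" key="$3" secret="$4"

  # Publish the CA bundle and make it trusted cluster-wide.
  oc create configmap custom-ca \
    --from-file=ca-bundle.crt="$ca_bundle" -n openshift-config
  oc patch proxy/cluster --type=merge \
    --patch='{"spec":{"trustedCA":{"name":"custom-ca"}}}'

  # Store the new certificate chain and key as a TLS secret.
  oc create secret tls "$secret" --cert="$cert" --key="$key" -n openshift-ingress

  # Point the default ingress controller at the new secret.
  oc patch ingresscontroller.operator default --type=merge \
    -p "{\"spec\":{\"defaultCertificate\":{\"name\":\"$secret\"}}}" \
    -n openshift-ingress-operator
}
```

A call such as `replace_ingress_cert ca.crt tls.crt tls.key custom-ingress` (file and secret names hypothetical) would then trigger the certificate rollout that precedes the failure.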
      
      
      Workaround when the pods go into CrashLoopBackOff state:
      ======
      Running `oc rollout restart ds/csi-vxflexos-node` or `oc delete pod <csi-vxflexos-node-xxx>` brings the pods back to the Running state.
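To decide when the workaround is needed, the affected pods can be picked out of the pod listing. The helper below is a minimal sketch: the name `crashlooping_pods` is hypothetical, and the `vxflexos` namespace in the usage comments is an assumption that may differ in a given installation.

```shell
# Print the names of pods whose STATUS column is CrashLoopBackOff.
# Reads `oc get pods --no-headers` output on stdin
# (columns: NAME READY STATUS RESTARTS AGE).
crashlooping_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Hypothetical usage (the namespace is an assumption; substitute your own):
#   oc get pods -n vxflexos --no-headers | crashlooping_pods
# If any csi-vxflexos-node pod is listed, apply the workaround:
#   oc rollout restart ds/csi-vxflexos-node -n vxflexos
```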

      Actual results:

      After the ingress certificate update, the csi-vxflexos-node daemonset pods go into CrashLoopBackOff state.

      Expected results:

      Even after the ingress certificate update, the csi-vxflexos-node daemonset pods should always be in the Running state.

      Additional info:

      
      The default log verbosity for the registrar container is 5, which already seems to include debug logs. We changed it to 9, set --log_backtrace_at=main.go:101, and got the logs below.
      
      I0416 05:48:11.153702 1 node_register.go:55] Starting Registration Server at: /registration/csi-vxflexos.dellemc.com-reg.sock
      I0416 05:48:11.154054 1 node_register.go:64] Registration Server started at: /registration/csi-vxflexos.dellemc.com-reg.sock
      I0416 05:48:11.154188 1 node_register.go:88] Skipping HTTP server because endpoint is set to: ""
      I0416 05:48:12.383949 1 main.go:90] Received GetInfo call: &InfoRequest{}
      I0416 05:48:12.415201 1 main.go:101] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:false,Error:RegisterPlugin error -- plugin registration failed with err: rpc error: code = Unknown desc = 401 Unauthorized,}
      goroutine 51 [running]:
      k8s.io/klog/v2/internal/dbg.Stacks(0x0)
          /workspace/vendor/k8s.io/klog/v2/internal/dbg/dbg.go:35 +0x85
      k8s.io/klog/v2.(*loggingT).output(0x109b3e0, 0x0, 0x0, 0xc000488000, 0x1, {0xd414be?, 0x1?}, 0x10?, 0x0)
          /workspace/vendor/k8s.io/klog/v2/klog.go:881 +0x179
      k8s.io/klog/v2.(*loggingT).printfDepth(0x918364?, 0x70?, 0x0, {0x0, 0x0}, 0xae6500?, {0xb34a26, 0x2b}, {0xc0004821b0, 0x1, ...})
          /workspace/vendor/k8s.io/klog/v2/klog.go:750 +0x1dd
      k8s.io/klog/v2.(*loggingT).printf(...)
          /workspace/vendor/k8s.io/klog/v2/klog.go:727
      k8s.io/klog/v2.Infof(...)
          /workspace/vendor/k8s.io/klog/v2/klog.go:1508
      main.registrationServer.NotifyRegistrationStatus({{0xc00003d6e0, 0x18}, {0x7ffe5897efd9, 0x37}, {0x107de50, 0x1, 0x1}}, {0xc0004ca000?, 0xc000490000?}, 0xc0004c2040)
          /workspace/cmd/csi-node-driver-registrar/main.go:101 +0xca
      k8s.io/kubelet/pkg/apis/pluginregistration/v1._Registration_NotifyRegistrationStatus_Handler({0xa47c80?, 0xc000420800}, {0xc0aa68, 0xc0004b2240}, 0xc0004cc000, 0x0)
          /workspace/vendor/k8s.io/kubelet/pkg/apis/pluginregistration/v1/api.pb.go:389 +0x169
      google.golang.org/grpc.(*Server).processUnaryRPC(0xc0001eec00, {0xc0aa68, 0xc0004b21b0}, {0xc0ee90, 0xc00030c1a0}, 0xc0004c0000, 0xc000413110, 0x1084a78, 0x0)
          /workspace/vendor/google.golang.org/grpc/server.go:1372 +0xe03
      google.golang.org/grpc.(*Server).handleStream(0xc0001eec00, {0xc0ee90, 0xc00030c1a0}, 0xc0004c0000)
          /workspace/vendor/google.golang.org/grpc/server.go:1783 +0xfec
      google.golang.org/grpc.(*Server).serveStreams.func2.1()
          /workspace/vendor/google.golang.org/grpc/server.go:1016 +0x59
      created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 68
          /workspace/vendor/google.golang.org/grpc/server.go:1027 +0x115
      E0416 05:48:12.418248 1 main.go:103] Registration process failed with error: RegisterPlugin error -- plugin registration failed with err: rpc error: code = Unknown desc = 401 Unauthorized, restarting registration container.
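The failure signature in the log above can be checked for mechanically, which may help confirm the same issue on other nodes or clusters. This is a minimal sketch: the function name `registration_unauthorized` is hypothetical, and the container name and namespace in the usage comments are assumptions.

```shell
# Succeeds (exit status 0) if the log text on stdin contains the
# plugin registration failure reported in this bug.
registration_unauthorized() {
  grep -q 'plugin registration failed with err: .*401 Unauthorized'
}

# Hypothetical usage (container name and namespace are assumptions):
#   oc logs <csi-vxflexos-node-xxx> -c registrar -n vxflexos \
#     | registration_unauthorized && echo "registrar hit 401 Unauthorized"
```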
      
      
      This should be a general issue rather than one specific to the Dell CSI driver, as the registrar container code comes from `https://github.com/kubernetes-csi/node-driver-registrar/blob/v2.10.0/cmd/csi-node-driver-registrar/main.go`.
      
      
      I understand that we do not support the Dell CSI driver. The customer is Dell itself, and they report that their CSI driver works well with OCP 4.13, but with OCP 4.14 the csi-driver-node pods go into CrashLoopBackOff after the ingress certificates are updated.
      
      The Dell PowerFlex CSI driver version is 2.10.0 which they installed on OCP 4.13 and OCP 4.14.
      
      They also tried Dell PowerFlex CSI driver version 2.9.2 on OCP 4.13 and 4.14 clusters, but the csi-driver-node pods go into CrashLoopBackOff only on the OCP 4.14 cluster after the ingress certificate update.
      
      
      Reverting the ingress certificate change does not bring the csi-driver-node pods back to the Running state.

            fbertina@redhat.com Fabio Bertinatto
            rhn-support-dpateriy Divyam Pateriya
            Wei Duan Wei Duan