OpenShift Bugs / OCPBUGS-29253

vSphere Windows machines are stuck in Provisioned status: "expected 1 secret for SA 'windows-instance-config-daemon', found 2"

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • 4.15
    • Windows Containers
    • Moderate
    • No
    • 3
    • False

    Description

      Description of problem:

          On vSphere, Windows nodes disappear after a Windows MachineSet is scaled down and then back up. The new machines stay in the Provisioned state and never become Ready, and the corresponding nodes do not reappear.

      Version-Release number of selected component (if applicable):

          WMCO: 10.15.0-ae56369
          OCP: 4.15.0-0.nightly-2024-02-07-062935

      How reproducible:

      most likely    

      Steps to Reproduce:

          1. Install the latest WMCO 10.15 build.
          2. Create a Windows MachineSet with 2 machines on vSphere (vCenter).
          3. Wait for the Windows nodes to become Ready.
          4. Scale the Windows MachineSet down to 0.
          5. Scale it back up to 2 machines.
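
The scale-down and scale-up steps can be sketched with `oc` roughly as follows. This is a minimal sketch, not the exact commands used in the report: the MachineSet name `winworker` is a hypothetical placeholder (guessed from the machine name in the log), and the `machine.openshift.io/os-id=Windows` label selector assumes the MachineSet carries the label WMCO expects on Windows machines.

```shell
# Hypothetical MachineSet name; take the real one from
# `oc get machineset -n openshift-machine-api`.
MACHINESET="${MACHINESET:-winworker}"
MACHINE_NAMESPACE="openshift-machine-api"

# Scale the Windows MachineSet to the given replica count ($1).
scale_machineset() {
  oc scale machineset "$MACHINESET" -n "$MACHINE_NAMESPACE" --replicas="$1"
}

# Show the Windows machines and their phase (e.g. Provisioned, Running).
# Assumes the machines carry the machine.openshift.io/os-id=Windows label.
show_windows_machines() {
  oc get machines -n "$MACHINE_NAMESPACE" -l machine.openshift.io/os-id=Windows
}

# Show the Windows nodes that have joined the cluster.
show_windows_nodes() {
  oc get nodes -l kubernetes.io/os=windows
}
```

A possible sequence is `scale_machineset 0`, wait for the machines to be deleted, then `scale_machineset 2`, followed by `show_windows_machines` and `show_windows_nodes` to check whether the machines leave the Provisioned phase and the nodes return.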
          

      Actual results:

          After scaling back up, the Windows nodes do not return. The machines stay stuck in the Provisioned phase, and Windows workloads remain Pending.

      Expected results:

          After scaling back up, the machines should be configured by WMCO and the Windows nodes should rejoin the cluster and become Ready.

      Additional info:

          WMCO log:
      {"level":"info","ts":"2024-02-08T13:07:28Z","logger":"controller.windowsmachine","msg":"processing","windowsmachine":{"name":"winworker-nwhr5","namespace":"openshift-machine-api"},"address":"192.168.221.139"}
      {"level":"error","ts":"2024-02-08T13:07:33Z","msg":"Reconciler error","controller":"machine","controllerGroup":"machine.openshift.io","controllerKind":"Machine","Machine":{"name":"winworker-nwhr5","namespace":"openshift-machine-api"},"namespace":"openshift-machine-api","name":"winworker-nwhr5","reconcileID":"0a012274-2921-436a-9ef9-75761208bdc2","error":"unable to configure instance 423d2593-9f52-2506-2c38-e0decef2b967: expected 1 secret for SA 'windows-instance-config-daemon', found 2","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}
      
      Pod description (workload stuck in Pending):
      
      oc describe pod/win-webserver-7c66c4b657-8v487
      Name:           win-webserver-7c66c4b657-8v487
      Namespace:      winc-test
      Priority:       0
      Node:           <none>
      Labels:         app=win-webserver
                      pod-template-hash=7c66c4b657
      Annotations:    <none>
      Status:         Pending
      IP:
      IPs:            <none>
      Controlled By:  ReplicaSet/win-webserver-7c66c4b657
      Containers:
        win-webserver:
          Image:      mcr.microsoft.com/powershell:lts-nanoserver-ltsc2022
          Port:       <none>
          Host Port:  <none>
          Command:
            pwsh.exe
            -command
            $listener = New-Object System.Net.HttpListener; $listener.Prefixes.Add('http://*:80/'); $listener.Start();Write-Host('Listening at http://*:80/'); while ($listener.IsListening) { $context = $listener.GetContext(); $response = $context.Response; $content='<html><body><H1>Windows Container Web Server</H1></body></html>'; $buffer = [System.Text.Encoding]::UTF8.GetBytes($content); $response.ContentLength64 = $buffer.Length; $response.OutputStream.Write($buffer, 0, $buffer.Length); $response.Close(); };
          Environment:  <none>
          Mounts:
            /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-t6sd6 (ro)
      Conditions:
        Type           Status
        PodScheduled   False
      Volumes:
        kube-api-access-t6sd6:
          Type:                    Projected (a volume that contains injected data from multiple sources)
          TokenExpirationSeconds:  3607
          ConfigMapName:           kube-root-ca.crt
          ConfigMapOptional:       <nil>
          DownwardAPI:             true
          ConfigMapName:           openshift-service-ca.crt
          ConfigMapOptional:       <nil>
      QoS Class:                   BestEffort
      Node-Selectors:              kubernetes.io/os=windows
      Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                   node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
                                   os=Windows
      Events:
        Type     Reason            Age                  From               Message
        ----     ------            ----                 ----               -------
        Warning  FailedScheduling  176m                 default-scheduler  0/7 nodes are available: 2 node(s) didn't match Pod's node affinity/selector, 2 node(s) were unschedulable, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/7 nodes are available: 7 Preemption is not helpful for scheduling..
        Warning  FailedScheduling  171m                 default-scheduler  0/5 nodes are available: 2 node(s) didn't match Pod's node affinity/selector, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling..
        Warning  FailedScheduling  15m (x31 over 165m)  default-scheduler  0/5 nodes are available: 2 node(s) didn't match Pod's node affinity/selector, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling..
      
      oc get serviceaccount -n openshift-windows-machine-config-operator windows-instance-config-daemon -oyaml
      apiVersion: v1
      imagePullSecrets:
      - name: windows-instance-config-daemon-dockercfg-rtwcm
      kind: ServiceAccount
      metadata:
        creationTimestamp: "2024-02-08T09:26:25Z"
        labels:
          olm.managed: "true"
        name: windows-instance-config-daemon
        namespace: openshift-windows-machine-config-operator
        resourceVersion: "56050"
        uid: 5d561b3c-5623-49d9-b4a1-c4df6aac6568
      secrets:
      - name: windows-instance-config-daemon-dockercfg-rtwcm
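
The ServiceAccount object above lists only the dockercfg secret, while the operator error reports two secrets for the SA. One way to look for the candidates is to list the service-account-token Secrets in the WMCO namespace that are annotated for this SA. This is a sketch under an assumption: it is not confirmed here that WMCO selects secrets by the standard `kubernetes.io/service-account.name` annotation, so the result may not match exactly what the operator counts.

```shell
WMCO_NAMESPACE="openshift-windows-machine-config-operator"
SA_NAME="windows-instance-config-daemon"

# Print the names of service-account-token Secrets whose
# kubernetes.io/service-account.name annotation equals $SA_NAME.
# Assumption: this mirrors the set of secrets WMCO considers for the SA.
list_sa_token_secrets() {
  oc get secrets -n "$WMCO_NAMESPACE" \
    --field-selector type=kubernetes.io/service-account-token \
    --no-headers \
    -o 'custom-columns=NAME:.metadata.name,SA:.metadata.annotations.kubernetes\.io/service-account\.name' |
    awk -v sa="$SA_NAME" '$2 == sa { print $1 }'
}

# Print how many such Secrets exist; the error message expects exactly 1.
count_sa_token_secrets() {
  list_sa_token_secrets | wc -l | tr -d '[:space:]'
}
```

If `count_sa_token_secrets` prints a value greater than 1, comparing the creation timestamps of the secrets returned by `list_sa_token_secrets` may help identify a stale or duplicated token Secret.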
      
      
      

          People

            team-winc Team WinC
            rrasouli Aharon Rasouli
            Aharon Rasouli