  1. Multiple Architecture Enablement
  2. MULTIARCH-2667

MachineConfigPool is not getting updated


    • Type: Bug
    • Resolution: Won't Do
    • None
    • 4.11
    • Multi-Arch
    • None
    • Quality / Stability / Reliability

      Note:
      1. If you are dealing with Machines or MachineSet objects, please select "Cloud Compute" as the component under the same product.
      2. If you are dealing with kubelet / kubeletconfigs / container runtime configs, please select "Node" as the component under the same product.

      Description of problem:
      After deploying OCP on a ppc64le environment, I ran:
      [root@rdr-sri-7cfc-tok04-bastion-0 ~]# oc get MachineConfigPool worker
      NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
      worker   rendered-worker-627087f67ace055b92ca4c26488fdeb7   False     True       False      3              2                   2                     0                      18h

      I expected READYMACHINECOUNT to be 3, but it is 2.

      Then I stopped kubelet.service on one of the worker nodes. The count became:

      [root@rdr-sri-7cfc-tok04-bastion-0 ~]# oc get MachineConfigPool worker
      NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
      worker   rendered-worker-627087f67ace055b92ca4c26488fdeb7   False     True       False      3              1                   2                     0                      18h

      READYMACHINECOUNT dropped to 1. When I started the service again, READYMACHINECOUNT returned to 2.
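
The comparison made above (READYMACHINECOUNT against MACHINECOUNT) can be scripted. What follows is only a minimal sketch: the function name mcp_not_ready is hypothetical, and it assumes the column headers are exactly those shown in the output above. It reads `oc get mcp` output on standard input and prints every pool whose ready count is below its machine count.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc or the MCO): reads `oc get mcp`
# output on stdin and reports pools where READYMACHINECOUNT is lower
# than MACHINECOUNT.
mcp_not_ready() {
    awk '
        NR == 1 {
            # Find columns by header name instead of fixed position.
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        # "+ 0" forces a numeric comparison of the two count fields.
        $col["READYMACHINECOUNT"] + 0 < $col["MACHINECOUNT"] + 0 {
            printf "%s: %d of %d machines ready\n",
                $col["NAME"], $col["READYMACHINECOUNT"], $col["MACHINECOUNT"]
        }
    '
}
```

With a logged-in session it would be used as `oc get mcp | mcp_not_ready`. It only restates what the table already shows; it does not explain why a machine is not counted as ready.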

      Version-Release number of MCO (Machine Config Operator) (if applicable): 4.11

      Platform (AWS, VSphere, Metal, etc.): IBM Power

      Are you certain that the root cause of the issue being reported is the MCO (Machine Config Operator)?
      (Y/N/Not sure): Yes, as I am looking at the MachineConfigPool.

      How reproducible:

      Did you catch this issue by running a Jenkins job? If yes, please list:
      1. Jenkins job:

      2. Profile:

      Steps to Reproduce:
      1. As described above.
      2.
      3.

      Actual results:

      Expected results:

      Additional info:

      1. Please consider attaching a must-gather archive (via oc adm must-gather). Please review must-gather contents for sensitive information before attaching any must-gathers to a Bugzilla report. You may also mark the bug private if you wish.

      2. If a must-gather is unavailable, please provide the output of:

      $ oc get co machine-config -o yaml

      $ oc get mcp (and oc describe mcp/${degraded_pool} if pools are degraded)

      $ oc get mc

      $ oc get pod -n openshift-machine-config-operator

      $ oc get node -o wide

      3. If a node is not accessible via API, please provide console/journal/kubelet logs of the problematic node

      4. Are there RHEL nodes on the cluster? If yes, please upload the whole Ansible logs or Jenkins job
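
When a must-gather is unavailable, the command outputs requested in item 2 can be saved in one pass. The following is a sketch under stated assumptions: the function name collect_mco_info, the OC override variable, and the output file names are all hypothetical and chosen only for illustration. The `oc describe mcp/${degraded_pool}` step is left out because it depends on which pool is degraded.

```shell
#!/bin/sh
# Hypothetical helper: runs the commands listed in item 2 and saves each
# output to a file. OC lets the oc binary be replaced (assumption made so
# the function can be dry-run, for example with OC=echo).
collect_mco_info() {
    outdir=${1:-mco-info}   # target directory; the default name is arbitrary
    oc_bin=${OC:-oc}
    mkdir -p "$outdir" || return 1

    # Capture stderr too, so failures are visible in the saved files.
    "$oc_bin" get co machine-config -o yaml > "$outdir/co-machine-config.yaml" 2>&1
    "$oc_bin" get mcp > "$outdir/mcp.txt" 2>&1
    "$oc_bin" get mc > "$outdir/mc.txt" 2>&1
    "$oc_bin" get pod -n openshift-machine-config-operator > "$outdir/mco-pods.txt" 2>&1
    "$oc_bin" get node -o wide > "$outdir/nodes.txt" 2>&1
}
```

Calling `collect_mco_info ./mco-info` writes five files that can be attached to the issue after checking them for sensitive information.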

              jpoulin Jeremy Poulin
              sridharvenkatraju Sridhar Venkat (Inactive)