OCPBUGS-10771

upgrade test failure with "Cluster operator control-plane-machine-set is not available"


    • Bug
    • Resolution: Done
    • Undefined
    • None
    • 4.13, 4.14
    • None
    • Important
    • No
    • Rejected
    • False
    • Previously, after a machine entered the `Running` state, no further changes to the state of its node were checked. The earlier resolution of link:https://issues.redhat.com/browse/OCPBUGS-8424[OCPBUGS-8424] introduced the requirement that a node and its machine both be in the `Ready` state for the control plane machine set replica to be considered ready. As a result, if the control plane machine set missed the stage when the node and machine were ready, its replica could not become ready. This caused the Control Plane Machine Set Operator to become unavailable, blocking upgrades. With this release, when a machine is running but the node is not ready, the node is checked at regular intervals until it becomes ready. This prevents the Control Plane Machine Set Operator from becoming unavailable and blocking upgrades.
      (link:https://issues.redhat.com/browse/OCPBUGS-10771[*OCPBUGS-10771*])
    • Bug Fix
    • Done
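
The release note above says that a running machine whose node is not yet ready is re-checked at regular intervals. A minimal sketch of that decision follows. It is written in Python for illustration and is not the operator's code: the names `ReplicaState`, `ReconcileResult`, `reconcile`, the `recheck_node` switch, and the 30-second interval are all assumptions, since the source states only "at regular intervals".

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical interval; the release note does not give a value.
NODE_RECHECK_SECONDS = 30.0


@dataclass
class ReplicaState:
    machine_phase: str  # for example "Provisioning" or "Running"
    node_ready: bool    # whether the node linked to the machine reports Ready


@dataclass
class ReconcileResult:
    replica_ready: bool
    # Seconds until the replica should be looked at again.
    # None means "wait for the next watched event".
    requeue_after: Optional[float]


def reconcile(state: ReplicaState, recheck_node: bool = True) -> ReconcileResult:
    """Decide replica readiness and whether to schedule another check.

    recheck_node=False models the behavior before the fix, where a
    not-ready node on a running machine was never looked at again.
    """
    if state.machine_phase != "Running":
        # The machine will emit further events, so no timer is needed.
        return ReconcileResult(replica_ready=False, requeue_after=None)
    if state.node_ready:
        return ReconcileResult(replica_ready=True, requeue_after=None)
    # Machine is running but the node is not ready yet.
    if recheck_node:
        return ReconcileResult(replica_ready=False, requeue_after=NODE_RECHECK_SECONDS)
    return ReconcileResult(replica_ready=False, requeue_after=None)
```

With `recheck_node=False`, a replica whose machine is running and whose node is not ready gets no follow-up check, which is the stuck state the bug describes. With the default, the same input yields a requeue interval, so a later call can observe the node once it becomes ready.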

      This is a clone of issue OCPBUGS-10032. The following is the description of the original issue:

      Description of problem:

      The test "operator conditions control-plane-machine-set" fails: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_ovn-kubernetes/1574/pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-upgrade/1634410710559625216
      The control-plane-machine-set operator is Unavailable because it does not reconcile on Node events. If a Node becomes ready later than the Machine that references it, the Node update event does not trigger reconciliation.
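
The ordering dependence described above can be illustrated with a small event replay. This is a simplified model, not the operator's implementation: `observed_ready` and the `reconcile_on_node_events` flag are hypothetical names, and the flag exists only to contrast the two behaviors.

```python
def observed_ready(events, reconcile_on_node_events=False):
    """Replay (kind, value) events and return the readiness the
    controller recorded at its last reconcile.

    kind is "machine" (value: machine is Running) or "node"
    (value: node is Ready). Reconciliation always runs on machine
    events; it runs on node events only when the flag is set.
    """
    machine_running = False
    node_ready = False
    observed = False
    for kind, value in events:
        if kind == "machine":
            machine_running = value
        elif kind == "node":
            node_ready = value
        if kind == "machine" or reconcile_on_node_events:
            observed = machine_running and node_ready
    return observed


# Node becomes ready before the machine reports Running.
node_first = [("node", True), ("machine", True)]
# Machine reports Running before the node becomes ready.
machine_first = [("machine", True), ("node", True)]

print(observed_ready(node_first))     # True
print(observed_ready(machine_first))  # False: the node update is never acted on
print(observed_ready(machine_first, reconcile_on_node_events=True))  # True
```

In this model the replica is stuck as not ready only when the node update arrives after the last machine event, which matches the report that reproducibility depends on event order.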

      Version-Release number of selected component (if applicable):

       

      How reproducible:

      Depends on the order in which Node and Machine events arrive.

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

       

      Expected results:

       

      Additional info:

      operator logs 
      https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_ovn-kubernetes/1574/pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-upgrade/1634410710559625216/artifacts/e2e-aws-ovn-upgrade/gather-extra/artifacts/pods/openshift-machine-api_control-plane-machine-set-operator-5d5848c465-g4q2p_control-plane-machine-set-operator.log
      
      machines 
      https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_ovn-kubernetes/1574/pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-upgrade/1634410710559625216/artifacts/e2e-aws-ovn-upgrade/gather-extra/artifacts/machines.json
      
      nodes 
      https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_ovn-kubernetes/1574/pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-upgrade/1634410710559625216/artifacts/e2e-aws-ovn-upgrade/gather-extra/artifacts/nodes.json

            ddonati@redhat.com Damiano Donati
            openshift-crt-jira-prow OpenShift Prow Bot
            Milind Yadav
            Jeana Routh
