OCPBUGS-29249

CPMS leaves only 2 masters during update

    • Bug
    • Resolution: Done-Errata
    • Critical
    • 4.16.0
    • 4.13, 4.12, 4.14, 4.15, 4.16
    • None
    • No
    • CLOUD Sprint 249
    • 1
    • Rejected
    • False
    • Previously, when a control plane machine was marked as unready and a change was initiated by modifying the control plane machine set, the unready machine was removed prematurely.
      This premature action caused multiple indexes to be replaced simultaneously.
      With this release, the control plane machine set no longer deletes a machine when only a single machine exists within the index.
      This change prevents premature roll-out of changes and prevents more than one index from being replaced at a time.
      (link:https://issues.redhat.com/browse/OCPBUGS-29249[*OCPBUGS-29249*])
    • Bug Fix
    • Done
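
The rule stated in the release note above can be sketched roughly as follows. This is an illustration only, not the control plane machine set operator's actual code: the `Machine` type, the `may_delete` helper, and the `master-1-new` name are hypothetical, and the sketch models nothing beyond the stated condition that a machine is not deleted while it is the only machine in its index.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    index: int   # control plane index, e.g. 1 for master-1
    ready: bool


def may_delete(machine: Machine, machines: list[Machine]) -> bool:
    """Hypothetical guard: allow deletion only if the index holds another machine."""
    peers = [
        m for m in machines
        if m.index == machine.index and m.name != machine.name
    ]
    return len(peers) > 0


machines = [
    Machine("master-0", 0, True),
    Machine("master-1", 1, False),  # NotReady, e.g. mid-reboot
    Machine("master-2", 2, True),
]

# master-1 is the only machine in index 1, so it must not be deleted yet.
print(may_delete(machines[1], machines))  # False

# Once a second machine exists in index 1, the guard no longer blocks deletion.
machines.append(Machine("master-1-new", 1, False))
print(may_delete(machines[1], machines))  # True
```

Under this guard, an unready machine that is alone in its index stays in place, so a change to the control plane machine set cannot by itself take that index out of service.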

      Observed during testing of candidate-4.15 image as of 2024-02-08.

      This is an incomplete report as I haven't verified the reproducer yet or attempted to get a must-gather. I have observed this multiple times now, so I am confident it's a thing. I can't be confident that the procedure described here reliably reproduces it, or that all the described steps are required.

      I have been using MCO to apply machine config to masters. This involves a rolling reboot of all masters.

      During a rolling reboot I applied an update to CPMS. I observed the following sequence of events:

      • master-1 was NotReady as it was rebooting
      • I modified CPMS
      • CPMS immediately started provisioning a new master-0
      • CPMS immediately started deleting master-1
      • CPMS started provisioning a new master-1

      At this point there were only 2 nodes in the cluster:

      • old master-0
      • old master-2

      and machines provisioning:

      • new master-0
      • new master-1
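
The state described above can be written down as a small snapshot and checked for the two symptoms in this report: fewer than three nodes, and more than one index being replaced at once. This is a minimal sketch; the tuple layout, the state labels `"node"` and `"provisioning"`, and both helper functions are assumptions made for illustration, not output from any tool.

```python
from collections import defaultdict

# (label, index, state) taken from the lists above; the state strings are
# illustrative labels, not real Machine phases.
snapshot = [
    ("old master-0", 0, "node"),
    ("old master-2", 2, "node"),
    ("new master-0", 0, "provisioning"),
    ("new master-1", 1, "provisioning"),
]


def indexes_being_replaced(snapshot):
    """Return the sorted indexes that currently have a machine provisioning."""
    states = defaultdict(set)
    for _label, index, state in snapshot:
        states[index].add(state)
    return sorted(i for i, s in states.items() if "provisioning" in s)


def node_count(snapshot):
    """Count the machines that are present as nodes in the cluster."""
    return sum(1 for _label, _index, state in snapshot if state == "node")


print(indexes_being_replaced(snapshot))  # [0, 1]
print(node_count(snapshot))              # 2
```

Two indexes appear in the first result, which is the simultaneous replacement this report describes, and index 1 has no node at all while its replacement provisions.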

            Joel Speed (joelspeed)
            Matthew Booth (rhn-gps-mbooth)
            Huali Liu
            Jeana Routh
