Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-38641

CPMS Periodics not waiting for etcd operator to complete rollouts

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Unresolved
    • Icon: Critical Critical
    • None
    • 4.17.0
    • None
    • None
    • Approved
    • False
    • Hide

      None

      Show
      None

      Description of problem:

          Based on the results in [Sippy|https://sippy.dptools.openshift.org/sippy-ng/component_readiness/test_details?Aggregation=none&Architecture=amd64&Architecture=amd64&FeatureSet=default&FeatureSet=default&Installer=ipi&Installer=ipi&Network=ovn&Network=ovn&NetworkAccess=default&Platform=gcp&Platform=gcp&Scheduler=default&SecurityMode=default&Suite=unknown&Suite=unknown&Topology=ha&Topology=ha&Upgrade=none&Upgrade=none&baseEndTime=2024-06-27%2023%3A59%3A59&baseRelease=4.16&baseStartTime=2024-05-31%2000%3A00%3A00&capability=operator-conditions&columnGroupBy=Platform%2CArchitecture%2CNetwork&component=Etcd&confidence=95&dbGroupBy=Platform%2CArchitecture%2CNetwork%2CTopology%2CFeatureSet%2CUpgrade%2CSuite%2CInstaller&environment=amd64%20default%20ipi%20ovn%20gcp%20unknown%20ha%20none&ignoreDisruption=true&ignoreMissing=false&includeVariant=Architecture%3Aamd64&includeVariant=FeatureSet%3Adefault&includeVariant=Installer%3Aipi&includeVariant=Installer%3Aupi&includeVariant=Owner%3Aeng&includeVariant=Platform%3Aaws&includeVariant=Platform%3Aazure&includeVariant=Platform%3Agcp&includeVariant=Platform%3Ametal&includeVariant=Platform%3Avsphere&includeVariant=Topology%3Aha&minFail=3&pity=5&sampleEndTime=2024-08-19%2023%3A59%3A59&sampleRelease=4.17&sampleStartTime=2024-08-13%2000%3A00%3A00&testId=Operator%20results%3A45d55df296fbbfa7144600dce70c1182&testName=operator%20conditions%20etcd], it appears that the periodic tests are not waiting for the etcd operator to complete before exiting.
      
      The test is supposed to wait for up to 20 mins after the final control plane machine is rolled, to allow operators to settle. But we are seeing the etcd operator triggering 2 further revisions after this happens.
      
      We need to understand if the etcd operator is correctly rolling out vs whether these changes should have rolled out prior to the final machine going away, and, understand if there's a way to add more stability to our checks to make sure that all of the operators stabilise, and, that they have been stable for at least some period (1 minute)

      Version-Release number of selected component (if applicable):

          

      How reproducible:

          

      Steps to Reproduce:

          1.
          2.
          3.
          

      Actual results:

          

      Expected results:

          

      Additional info:

          

            joelspeed Joel Speed
            joelspeed Joel Speed
            Zhaohua Sun Zhaohua Sun
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated: