  1. OpenShift Bugs
  2. OCPBUGS-11369

CPMS e2e periodics tests timeout failures

Details

    • No
    • CLOUD Sprint 234
    • 1
    • False

    Description

      Description of problem:

      In the control plane machine set operator we perform e2e periodic tests that check the ability to do a rolling update of an entire OCP control plane.

      This is quite an involved test: we need to drain and replace all the master machines/nodes, altering operators, waiting for machines to come up and bootstrap, and waiting for nodes to drain and move their workloads to others while respecting PDBs and etcd quorum.

      As such, we need to make sure we are robust to transient issues, occasional slowdowns, and network errors.
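
      For illustration only, one common way to get this kind of robustness is to poll a condition up to a deadline while tolerating transient errors. The sketch below is not taken from the operator's test suite: `TransientError` and `poll_until` are hypothetical names, and Python is used purely for brevity.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (e.g. a dropped connection)."""


def poll_until(condition, timeout, interval, sleep=time.sleep, clock=time.monotonic):
    """Call condition() until it returns True or `timeout` seconds elapse.

    Transient errors are swallowed and retried; any other exception propagates.
    Returns True on success and False on timeout.
    """
    deadline = clock() + timeout
    while True:
        try:
            if condition():
                return True
        except TransientError:
            pass  # tolerate a transient failure and try again
        if clock() >= deadline:
            return False
        sleep(interval)
```

      A test step would wrap its check (for example, "all replacement machines are running") in `condition` and pick a `timeout` generous enough to absorb occasional slowdowns.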

      We have investigated these timeout issues and identified some common culprits that we want to address, see: https://redhat-internal.slack.com/archives/GE2HQ9QP4/p1678966522151799

            People

              ddonati@redhat.com Damiano Donati
              Zhaohua Sun