-
Bug
-
Resolution: Unresolved
-
Normal
-
4.18
-
Description of problem:
In an effort to ensure all HA components are not degraded by design during normal e2e tests or upgrades, we are collecting all operators that blip Degraded=True during any payload job run. This card captures the etcd operator, which blips Degraded=True during upgrade runs. Multiple reasons have been seen. If you feel the bug needs to be split into several, feel free to do so.
Example Job: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.18-upgrade-from-stable-4.17-e2e-gcp-ovn-rt-upgrade/1843561647470284800
The following are the Reasons seen so far:
1. EtcdCertSignerController_Error::EtcdEndpoints_ErrorUpdatingEtcdEndpoints::EtcdMembers_UnhealthyMembers::TargetConfigController_SynchronizationError
2. EtcdCertSignerController_Error::EtcdEndpoints_ErrorUpdatingEtcdEndpoints::EtcdMembers_UnhealthyMembers::NodeController_MasterNodesReady::TargetConfigController_SynchronizationError
3. NodeController_MasterNodesReady::StaticPods_Error
4. Unknown
5. EtcdCertSignerController_Error::EtcdEndpoints_ErrorUpdatingEtcdEndpoints::EtcdMembers_UnhealthyMembers::NodeController_MasterNodesReady::StaticPods_Error::TargetConfigController_SynchronizationError
6. EtcdCertSignerController_Error::EtcdEndpoints_ErrorUpdatingEtcdEndpoints::TargetConfigController_SynchronizationError
7. EtcdMembers_UnhealthyMembers
8. EtcdEndpoints_ErrorUpdatingEtcdEndpoints::EtcdMembers_UnhealthyMembers::TargetConfigController_SynchronizationError
9. EtcdCertSignerController_Error::EtcdEndpoints_ErrorUpdatingEtcdEndpoints::EtcdMembersController_ErrorUpdatingReportEtcdMembers::EtcdMembers_UnhealthyMembers::TargetConfigController_SynchronizationError
For now, we have put an exception in the test. However, teams are expected to take action to fix these and remove the exceptions once the fixes go in.
Exceptions are defined here: https://github.com/openshift/origin/blob/e5e76d7ca739b5699639dd4c500f6c076c697da6/pkg/monitortests/clusterversionoperator/legacycvomonitortests/operators.go#L331
See the linked issue for more explanation of the effort.
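When triaging, it may help to break a combined reason into its individual components and see which ones are not yet covered by an exception. The Go sketch below illustrates that idea only. It assumes the "::" separator is how the reasons listed in this bug are joined, and the function names and the allowed set are hypothetical; it is not the exception logic in openshift/origin.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// splitReasons breaks a combined Degraded reason such as
// "A_Error::B_Error" into its individual components.
// Assumption: "::" is the separator, as seen in the reasons listed in this bug.
func splitReasons(combined string) []string {
	if combined == "" {
		return nil
	}
	return strings.Split(combined, "::")
}

// unexpectedReasons returns, in sorted order, the components of combined
// that are not present in the allowed set.
// Hypothetical helper for illustration; not part of openshift/origin.
func unexpectedReasons(combined string, allowed map[string]bool) []string {
	var out []string
	for _, r := range splitReasons(combined) {
		if !allowed[r] {
			out = append(out, r)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical allow-list; the real exceptions live in the origin repo.
	allowed := map[string]bool{
		"EtcdMembers_UnhealthyMembers":    true,
		"NodeController_MasterNodesReady": true,
	}
	// Reason 3 from the list in this bug.
	fmt.Println(unexpectedReasons("NodeController_MasterNodesReady::StaticPods_Error", allowed))
}
```

With the hypothetical allow-list above, only StaticPods_Error is reported for reason 3, which suggests where a fix or a separate bug might be needed.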
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1. 2. 3.
Actual results:
Expected results:
Additional info:
- relates to: TRT-1578 Ensure all HA components are not degraded by design during upgrades (New)