OpenShift Bugs / OCPBUGS-18526

[4.12][sig-arch] Check if alerts are firing during or after upgrade success - alert TargetDown fired on openshift-authentication/oauth-openshift and openshift-kube-controller-manager-operator/metrics


    • Type: Bug
    • Resolution: Duplicate
    • Priority: Undefined
    • Affects Version/s: 4.12.z
    • Component/s: apiserver-auth
    • Quality / Stability / Reliability

      Description of problem:

      The test [sig-arch] Check if alerts are firing during or after upgrade success is failing in the 4.12 z-stream upgrade-rollback job. The job exercises the z-stream rollback path by installing 4.12.0, updating towards a recent 4.12 nightly, and then, at some random point during that update, rolling back to 4.12.0.
      
      The error is:
      
      Sep  5 02:43:49.260: Unexpected alerts fired or pending during the upgrade:
      
      alert TargetDown fired for 1830 seconds with labels: {job="oauth-openshift", namespace="openshift-authentication", service="oauth-openshift", severity="warning"}
      alert TargetDown fired for 300 seconds with labels: {job="metrics", namespace="openshift-kube-controller-manager-operator", service="metrics", severity="warning"}
      alert TargetDown fired for 300 seconds with labels: {job="metrics", namespace="openshift-kube-scheduler-operator", service="metrics", severity="warning"}
      
      github.com/openshift/origin/test/extended/util/disruption.(*chaosMonkeyAdapter).Test(0xc00590caa0, 0xc001287f38)
      	github.com/openshift/origin/test/extended/util/disruption/disruption.go:197 +0x315
      k8s.io/kubernetes/test/e2e/chaosmonkey.(*Chaosmonkey).Do.func1()
      	k8s.io/kubernetes@v1.25.0/test/e2e/chaosmonkey/chaosmonkey.go:94 +0x6a
      created by k8s.io/kubernetes/test/e2e/chaosmonkey.(*Chaosmonkey).Do
      	k8s.io/kubernetes@v1.25.0/test/e2e/chaosmonkey/chaosmonkey.go:91 +0x8b
      
      Failed job: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.12-e2e-aws-ovn-upgrade-rollback-oldest-supported/1698859751263178752
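
      To see which scrape targets Prometheus actually considered down behind these TargetDown alerts, the cluster's Prometheus (or the Prometheus data preserved for the CI run) can be queried with the same job/namespace/service grouping the alert reports. Below is a minimal sketch in Go using the upstream Prometheus client; it is not part of the test suite, and the PROM_URL / PROM_TOKEN environment variables are placeholders for the monitoring route and a bearer token (e.g. from oc whoami -t).

      package main

      import (
      	"context"
      	"fmt"
      	"log"
      	"net/http"
      	"os"
      	"time"

      	"github.com/prometheus/client_golang/api"
      	promv1 "github.com/prometheus/client_golang/api/prometheus/v1"
      )

      // bearerTransport injects the bearer token the in-cluster Prometheus route requires.
      type bearerTransport struct {
      	token string
      	rt    http.RoundTripper
      }

      func (b *bearerTransport) RoundTrip(req *http.Request) (*http.Response, error) {
      	req = req.Clone(req.Context())
      	req.Header.Set("Authorization", "Bearer "+b.token)
      	return b.rt.RoundTrip(req)
      }

      func main() {
      	// Placeholders: Prometheus route of the cluster under test and a token with access to it.
      	promURL := os.Getenv("PROM_URL")
      	token := os.Getenv("PROM_TOKEN")

      	client, err := api.NewClient(api.Config{
      		Address:      promURL,
      		RoundTripper: &bearerTransport{token: token, rt: http.DefaultTransport},
      	})
      	if err != nil {
      		log.Fatal(err)
      	}

      	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
      	defer cancel()

      	// Group by the same labels the TargetDown alert reports: job, namespace, service.
      	query := `max by (job, namespace, service) (up{namespace=~"openshift-authentication|openshift-kube-controller-manager-operator|openshift-kube-scheduler-operator"} == 0)`
      	result, warnings, err := promv1.NewAPI(client).Query(ctx, query, time.Now())
      	if err != nil {
      		log.Fatal(err)
      	}
      	if len(warnings) > 0 {
      		log.Printf("query warnings: %v", warnings)
      	}
      	fmt.Println(result)
      }

      The same max by (job, namespace, service) (up == 0) expression can also be pasted directly into the Prometheus UI for the run instead of running this program.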

      Version-Release number of selected component (if applicable):

      4.12.0-0.nightly-2023-09-05-004152

      How reproducible:

      Flaky

      Steps to Reproduce:

      1. Install a cluster at 4.12.0.
      2. Start an update towards a recent 4.12 nightly.
      3. At some random point during that update, roll the cluster back to 4.12.0 (a programmatic sketch of this step follows the list).
      4. Watch for TargetDown alerts firing during and after the rollback (this is what the [sig-arch] alert check asserts on).
      
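      For illustration only, the rollback in step 3 can be triggered through the ClusterVersion API; the CI job drives this with its own tooling, and oc adm upgrade performs the equivalent. A minimal sketch, assuming a cluster-admin kubeconfig in KUBECONFIG and the 4.12.0 release image pull spec in ROLLBACK_RELEASE_IMAGE (both placeholders):

      package main

      import (
      	"context"
      	"log"
      	"os"

      	configv1 "github.com/openshift/api/config/v1"
      	configclient "github.com/openshift/client-go/config/clientset/versioned"
      	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
      	"k8s.io/client-go/tools/clientcmd"
      )

      func main() {
      	// Placeholders: cluster-admin kubeconfig and the 4.12.0 release image pull spec.
      	restCfg, err := clientcmd.BuildConfigFromFlags("", os.Getenv("KUBECONFIG"))
      	if err != nil {
      		log.Fatal(err)
      	}
      	client, err := configclient.NewForConfig(restCfg)
      	if err != nil {
      		log.Fatal(err)
      	}

      	ctx := context.Background()
      	cv, err := client.ConfigV1().ClusterVersions().Get(ctx, "version", metav1.GetOptions{})
      	if err != nil {
      		log.Fatal(err)
      	}

      	// Point the cluster back at the 4.12.0 payload while the update to the nightly
      	// is still in progress. Force is needed because the rollback is not a
      	// recommended update path.
      	cv.Spec.DesiredUpdate = &configv1.Update{
      		Image: os.Getenv("ROLLBACK_RELEASE_IMAGE"),
      		Force: true,
      	}
      	if _, err := client.ConfigV1().ClusterVersions().Update(ctx, cv, metav1.UpdateOptions{}); err != nil {
      		log.Fatal(err)
      	}
      	log.Printf("rollback to %s requested", cv.Spec.DesiredUpdate.Image)
      }
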

      Actual results:

      Sep  5 02:43:49.260: Unexpected alerts fired or pending during the upgrade:
      
      alert TargetDown fired for 1830 seconds with labels: {job="oauth-openshift", namespace="openshift-authentication", service="oauth-openshift", severity="warning"}
      alert TargetDown fired for 300 seconds with labels: {job="metrics", namespace="openshift-kube-controller-manager-operator", service="metrics", severity="warning"}
      alert TargetDown fired for 300 seconds with labels: {job="metrics", namespace="openshift-kube-scheduler-operator", service="metrics", severity="warning"}
      
      github.com/openshift/origin/test/extended/util/disruption.(*chaosMonkeyAdapter).Test(0xc00590caa0, 0xc001287f38)
      	github.com/openshift/origin/test/extended/util/disruption/disruption.go:197 +0x315
      k8s.io/kubernetes/test/e2e/chaosmonkey.(*Chaosmonkey).Do.func1()
      	k8s.io/kubernetes@v1.25.0/test/e2e/chaosmonkey/chaosmonkey.go:94 +0x6a
      created by k8s.io/kubernetes/test/e2e/chaosmonkey.(*Chaosmonkey).Do
      	k8s.io/kubernetes@v1.25.0/test/e2e/chaosmonkey/chaosmonkey.go:91 +0x8b

      Expected results:

      No TargetDown alerts fire for the openshift-authentication, openshift-kube-controller-manager-operator, or openshift-kube-scheduler-operator targets during or after the update and rollback, and the [sig-arch] alert check passes.

      Additional info:

       

              Assignee: Stanislav Láznička (Inactive) <slaznick@redhat.com>
              Reporter: Yang Yang <yanyang@redhat.com>
              QA Contact: Xingxing Xia