OpenShift Cloud Credential Operator / CCO-341

[bz-Image Registry] clusteroperator/image-registry should not change condition/Available


    • Bug
    • Resolution: Duplicate
    • 4.10
    • Quality / Stability / Reliability

      bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2081552

      Description of problem:

      [bz-Image Registry] clusteroperator/image-registry should not change condition/Available is failing frequently in CI, see [1] and: 

      $ w3m -dump -cols 200 'https://search.ci.openshift.org/?maxAge=24h&type=junit&search=image-registry+should+not+change+condition/Available' | grep 'failures match' | sort
      periodic-ci-openshift-multiarch-master-nightly-4.10-upgrade-from-nightly-4.9-ocp-remote-libvirt-ppc64le (all) - 2 runs, 100% failed, 100% of failures match = 100% impact
      periodic-ci-openshift-multiarch-master-nightly-4.10-upgrade-from-nightly-4.9-ocp-remote-libvirt-s390x (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
      periodic-ci-openshift-multiarch-master-nightly-4.11-ocp-e2e-aws-arm64-techpreview-serial (all) - 6 runs, 67% failed, 25% of failures match = 17% impact
      periodic-ci-openshift-multiarch-master-nightly-4.11-upgrade-from-nightly-4.10-ocp-remote-libvirt-ppc64le (all) - 2 runs, 50% failed, 100% of failures match = 50% impact
      periodic-ci-openshift-multiarch-master-nightly-4.8-upgrade-from-nightly-4.7-ocp-remote-libvirt-s390x (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
      periodic-ci-openshift-multiarch-master-nightly-4.9-upgrade-from-nightly-4.8-ocp-remote-libvirt-s390x (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
      periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-ovirt-upgrade (all) - 4 runs, 50% failed, 200% of failures match = 100% impact
      periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-vsphere-upgrade (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
      periodic-ci-openshift-release-master-ci-4.11-e2e-aws-upgrade-single-node (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
      periodic-ci-openshift-release-master-ci-4.11-e2e-azure-upgrade-single-node (all) - 2 runs, 100% failed, 50% of failures match = 50% impact
      periodic-ci-openshift-release-master-ci-4.11-e2e-gcp-ovn (all) - 2 runs, 50% failed, 100% of failures match = 50% impact
      periodic-ci-openshift-release-master-ci-4.11-e2e-gcp-techpreview-serial (all) - 2 runs, 100% failed, 100% of failures match = 100% impact
      periodic-ci-openshift-release-master-ci-4.11-e2e-gcp-upgrade (all) - 60 runs, 25% failed, 53% of failures match = 13% impact
      periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-gcp-ovn-rt-upgrade (all) - 4 runs, 100% failed, 25% of failures match = 25% impact
      periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-gcp-ovn-upgrade (all) - 21 runs, 100% failed, 43% of failures match = 43% impact
      periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-ovirt-upgrade (all) - 4 runs, 75% failed, 133% of failures match = 100% impact
      periodic-ci-openshift-release-master-ci-4.8-e2e-aws-upgrade-single-node (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
      periodic-ci-openshift-release-master-ci-4.8-e2e-azure-upgrade-single-node (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
      periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-ovirt-upgrade (all) - 4 runs, 100% failed, 100% of failures match = 100% impact
      periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-vsphere-upgrade (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
      periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-ovirt-upgrade (all) - 4 runs, 75% failed, 100% of failures match = 75% impact
      periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-vsphere-upgrade (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
      periodic-ci-openshift-release-master-nightly-4.10-e2e-vsphere-upi-serial (all) - 3 runs, 33% failed, 100% of failures match = 33% impact
      periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.9-e2e-metal-ipi-upgrade-ovn-ipv6 (all) - 2 runs, 100% failed, 50% of failures match = 50% impact
      periodic-ci-openshift-release-master-nightly-4.11-e2e-gcp (all) - 2 runs, 100% failed, 50% of failures match = 50% impact
      periodic-ci-openshift-release-master-nightly-4.11-e2e-gcp-rt (all) - 2 runs, 50% failed, 100% of failures match = 50% impact
      periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-serial-ovn-dualstack (all) - 2 runs, 50% failed, 100% of failures match = 50% impact
      periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-upgrade (all) - 3 runs, 100% failed, 100% of failures match = 100% impact
      periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-upgrade-ovn-ipv6 (all) - 3 runs, 100% failed, 33% of failures match = 33% impact
      periodic-ci-openshift-release-master-nightly-4.11-upgrade-from-stable-4.10-e2e-metal-ipi-upgrade-ovn-ipv6 (all) - 3 runs, 100% failed, 33% of failures match = 33% impact
      periodic-ci-openshift-release-master-nightly-4.9-e2e-vsphere-upi-serial (all) - 2 runs, 50% failed, 100% of failures match = 50% impact
      periodic-ci-openshift-release-master-okd-4.10-e2e-vsphere (all) - 5 runs, 80% failed, 25% of failures match = 20% impact
      pull-ci-openshift-machine-config-operator-release-4.10-e2e-aws-upgrade-single-node (all) - 2 runs, 100% failed, 100% of failures match = 100% impact
      pull-ci-openshift-machine-config-operator-release-4.10-e2e-vsphere-upgrade (all) - 2 runs, 50% failed, 100% of failures match = 50% impact
      pull-ci-openshift-origin-master-e2e-aws-single-node-upgrade (all) - 9 runs, 78% failed, 100% of failures match = 78% impact
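Many of the 100% impact rows above come from jobs with only one or two runs, so it can help to parse the summary lines and rank them with a minimum run count. In the rows above, the impact figure is consistent with the product of the two preceding percentages (for example, 25% failed × 53% matching ≈ 13% impact). The following is a minimal sketch, not part of the original report; the names `parse_summary`, `rank_by_impact`, and `min_runs` are hypothetical, and the line shape is assumed from the output quoted above.

```python
import re

# Assumed shape of one summary line, taken from the search output above:
#   <job> (all) - N runs, F% failed, M% of failures match = I% impact
LINE_RE = re.compile(
    r"^\s*(?P<job>\S+) \(all\) - (?P<runs>\d+) runs, "
    r"(?P<failed>\d+)% failed, (?P<match>\d+)% of failures match "
    r"= (?P<impact>\d+)% impact\s*$"
)


def parse_summary(text):
    """Return one dict per summary line; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append({
                "job": m.group("job"),
                "runs": int(m.group("runs")),
                "failed_pct": int(m.group("failed")),
                "match_pct": int(m.group("match")),
                "impact_pct": int(m.group("impact")),
            })
    return rows


def rank_by_impact(rows, min_runs=1):
    """Drop jobs with fewer than min_runs runs, then sort by impact and run count (both descending)."""
    kept = [r for r in rows if r["runs"] >= min_runs]
    return sorted(kept, key=lambda r: (-r["impact_pct"], -r["runs"]))


# Three rows copied from the listing above.
SAMPLE = """\
periodic-ci-openshift-release-master-ci-4.11-e2e-gcp-upgrade (all) - 60 runs, 25% failed, 53% of failures match = 13% impact
periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-gcp-ovn-upgrade (all) - 21 runs, 100% failed, 43% of failures match = 43% impact
pull-ci-openshift-origin-master-e2e-aws-single-node-upgrade (all) - 9 runs, 78% failed, 100% of failures match = 78% impact
"""

if __name__ == "__main__":
    for row in rank_by_impact(parse_summary(SAMPLE), min_runs=5):
        print(f'{row["impact_pct"]:3d}%  {row["runs"]:3d} runs  {row["job"]}')
```

Piping the full `w3m ... | grep 'failures match'` output into `parse_summary` instead of `SAMPLE` would rank every job; raising `min_runs` keeps single-run jobs from dominating the top of the list.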
      
      For example, [2] has:
      
        : [bz-Image Registry] clusteroperator/image-registry should not change condition/Available
          Run #0: Failed	2h23m7s
          {  2 unexpected clusteroperator state transitions during e2e test run 
      
          May 03 14:28:58.817 - 3490s E clusteroperator/image-registry condition/Available status/False reason/Available: The deployment does not have available replicas\nNodeCADaemonAvailable: The daemon set node-ca has available replicas\nImagePrunerAvailable: Pruner CronJob has been created
      2 tests failed during this blip (2022-05-03 14:28:58.817414102 +0000 UTC to 2022-05-03 14:28:58.817414102 +0000 UTC): [sig-apps] StatefulSet Basic StatefulSet functionality [StatefulSetBasic] should provide basic identity [Suite:openshift/conformance/parallel] [Suite:k8s]
      [sig-apps] StatefulSet Basic StatefulSet functionality [StatefulSetBasic] should adopt matching orphans and release non-matching pods [Suite:openshift/conformance/parallel] [Suite:k8s]}
      
      With:
      
        $ curl -s https://storage.googleapis.com/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-e2e-gcp-upgrade/1521470801340010496/build-log.txt | grep 'clusteroperator/image-registry condition/Available.*changed'
        May 03 14:28:58.817 E clusteroperator/image-registry condition/Available status/False reason/NoReplicasAvailable changed: Available: The deployment does not have available replicas\nNodeCADaemonAvailable: The daemon set node-ca has available replicas\nImagePrunerAvailable: Pruner CronJob has been created
        May 03 15:27:08.975 W clusteroperator/image-registry condition/Available status/True reason/MinimumAvailability changed: Available: The registry has minimum availability\nNodeCADaemonAvailable: The daemon set node-ca has available replicas\nImagePrunerAvailable: Pruner CronJob has been created
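The two transition lines above bracket the window during which the operator reported Available=False. A minimal sketch of how that window could be measured from the build log follows; it is not part of the original report. The names `parse_transitions` and `unavailable_windows` are hypothetical, the line shape is assumed from the excerpt above, and the year is supplied by the caller because the log lines carry none.

```python
import re
from datetime import datetime

# Assumed shape of a transition line, taken from the build-log excerpt above:
#   May 03 14:28:58.817 E clusteroperator/<name> condition/<type> status/<status> reason/<reason> changed: <message>
TRANSITION_RE = re.compile(
    r"^\s*(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \w "
    r"clusteroperator/(?P<operator>\S+) condition/(?P<condition>\S+) "
    r"status/(?P<status>\S+) reason/(?P<reason>\S+) changed: "
)


def parse_transitions(text, year=2022):
    """Return (timestamp, status, reason) tuples in log order.

    The year is an assumption passed in by the caller; %b expects
    English month abbreviations (C or English locale).
    """
    out = []
    for line in text.splitlines():
        m = TRANSITION_RE.match(line)
        if m:
            ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S.%f")
            out.append((ts, m.group("status"), m.group("reason")))
    return out


def unavailable_windows(transitions):
    """Pair each status/False transition with the next status/True transition."""
    windows = []
    went_false = None
    for ts, status, reason in transitions:
        if status == "False" and went_false is None:
            went_false = (ts, reason)
        elif status == "True" and went_false is not None:
            windows.append((went_false[0], ts, went_false[1]))
            went_false = None
    return windows


# The two lines from the excerpt above; the raw string keeps "\n" as literal text.
SAMPLE = r"""
May 03 14:28:58.817 E clusteroperator/image-registry condition/Available status/False reason/NoReplicasAvailable changed: Available: The deployment does not have available replicas\nNodeCADaemonAvailable: The daemon set node-ca has available replicas\nImagePrunerAvailable: Pruner CronJob has been created
May 03 15:27:08.975 W clusteroperator/image-registry condition/Available status/True reason/MinimumAvailability changed: Available: The registry has minimum availability\nNodeCADaemonAvailable: The daemon set node-ca has available replicas\nImagePrunerAvailable: Pruner CronJob has been created
"""

if __name__ == "__main__":
    for start, end, reason in unavailable_windows(parse_transitions(SAMPLE)):
        seconds = int((end - start).total_seconds())
        print(f"Available=False for {seconds}s (reason {reason})")
```

For the two lines above this yields a window of about 3490 seconds, which agrees with the `3490s` interval reported in the JUnit output for [2].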
      
      The test case is flake-only, so this isn't hurting CI success rates. But an operator that claims Available=False is not a great customer experience. The UX impact may not be big enough to justify backports, but it is certainly big enough to justify a fix in the development branch.
      
      [1]: https://sippy.ci.openshift.org/sippy-ng/tests/4.11/analysis?test=%5Bbz-Image%20Registry%5D%20clusteroperator%2Fimage-registry%20should%20not%20change%20condition%2FAvailable
      [2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-e2e-gcp-upgrade/1521470801340010496
      

              Assignee: Unassigned
              Reporter: W. Trevor King