• Type: Sub-task
    • Resolution: Done
    • Priority: Blocker
    • openshift-4.8, openshift-4.9

      The 4.8 etcd operator watches ClusterVersion. When it sees a Failing=True condition whose message contains the substring RecentBackup:

      1. It starts taking a backup.
      2. It sets a RecentBackup=Unknown condition in its ClusterOperator, with a reason and message saying that the backup is in progress.
      3. If the backup…
      • succeeds, it sets RecentBackup=True.
        • If the CVO gets stuck on something else (e.g. release signature fetching), enough time may pass that the etcd operator considers the backup stale. It can then set RecentBackup=False with a reason and message explaining that. To re-trigger, the admin would probably run oc adm upgrade --clear and try again. Hopefully this is rare, because there will be no web-console re-trigger support.
      • fails, it sets RecentBackup=False. Maybe it tries again? If we don't auto-retry, the admin could clear and re-request desiredUpdate to trigger a new attempt.
      1. The 4.9 etcd operator in all cases, and the 4.8 etcd operator when it sees a ClusterVersion Failing condition that is not True or whose message does not contain RecentBackup, clears the RecentBackup condition from its ClusterOperator.
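The decision logic described above can be sketched as a small pure function. This is only an illustration under stated assumptions, not the operator's actual code (which is written in Go): the `Condition` class, the function names, the `backup_state` values, the reason strings, and the one-hour staleness threshold are all hypothetical. Only the condition types, statuses, and the RecentBackup substring check come from the description.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Condition:
    """Hypothetical stand-in for a ClusterVersion/ClusterOperator condition."""
    type: str
    status: str  # "True", "False", or "Unknown"
    reason: str = ""
    message: str = ""


def backup_requested(cluster_version_conditions: Iterable[Condition]) -> bool:
    """True when ClusterVersion has Failing=True with 'RecentBackup' in its message."""
    return any(
        c.type == "Failing" and c.status == "True" and "RecentBackup" in c.message
        for c in cluster_version_conditions
    )


def recent_backup_condition(
    operator_minor: str,                         # "4.8" or "4.9"
    cluster_version_conditions: Iterable[Condition],
    backup_state: str,                           # assumed: "running", "succeeded", "failed"
    backup_age_seconds: Optional[float] = None,
    max_age_seconds: float = 3600.0,             # assumed threshold, not from the issue
) -> Optional[Condition]:
    """Return the RecentBackup condition to set, or None to clear it."""
    # 4.9 always clears; 4.8 clears unless the CVO is asking for a backup.
    if operator_minor != "4.8" or not backup_requested(cluster_version_conditions):
        return None
    if backup_state == "running":
        return Condition("RecentBackup", "Unknown", "BackupInProgress",
                         "backup is in progress")
    if backup_state == "failed":
        return Condition("RecentBackup", "False", "BackupFailed",
                         "backup failed")
    if backup_state == "succeeded":
        # A backup that finished too long ago is reported as stale.
        if backup_age_seconds is not None and backup_age_seconds > max_age_seconds:
            return Condition("RecentBackup", "False", "BackupStale",
                             "last backup is too old")
        return Condition("RecentBackup", "True", "BackupCompleted",
                         "backup completed")
    raise ValueError(f"unknown backup state: {backup_state!r}")
```

Returning `None` for "clear the condition" keeps the 4.9 path and the 4.8 "nothing requested" path identical, which matches the last item above. Whether a failed backup is retried automatically is left out, since the description leaves that question open.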

            sbatsche@redhat.com Sam Batschelet (Inactive)
            mnewby@redhat.com Maru Newby (Inactive)