Red Hat OpenStack Services on OpenShift / OSPRH-7821

mariadb-operator may not reconcile galera start properly when it is restarted


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • None
    • None
    • mariadb-operator
    • False
    • False
    • Committed
    • ?
    • ?
    • ?
    • Moderate

      Observed when deploying a single-node galera in CI.

      1. If the galera pod gets restarted while the mariadb-operator pod is not running (e.g. due to a crash or an environmental issue), the operator is not notified of the stop, and the bootstrap status associated with the galera CR is not reasserted when the operator restarts.

      2. As a result, the operator has a wrong view of the current state of the galera cluster.

      3. If the pod restarts again past this point, the operator tells it to rejoin the galera cluster instead of telling it to bootstrap a new one. This cannot work, since there is no running cluster to join, and the pod eventually fails to start.

      4. When the pod fails, k8s creates a new pod and the operator is finally notified of this creation. It then reassesses the state of the cluster (no pod is running, cluster not bootstrapped) and can tell the newly created pod to bootstrap the cluster.

       

      This whole sequence of events could be avoided if the operator always reassessed the state of the galera cluster on startup.
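
      The difference between acting on the recorded status and reassessing it can be illustrated with a minimal Go sketch. This is not the operator's actual code: the types PodObservation and ClusterStatus and the functions NextAction and ReassessStatus are hypothetical names chosen for illustration, and "running" is reduced to a single boolean, whereas a real check would presumably also have to inspect the database state inside each pod.

```go
package main

import "fmt"

// PodObservation is a hypothetical, simplified view of one galera pod
// as seen by the operator at a given moment.
type PodObservation struct {
	Name    string
	Running bool // true if the database in this pod is up and serving
}

// ClusterStatus is a hypothetical stand-in for the bootstrap status
// recorded on the galera CR.
type ClusterStatus struct {
	Bootstrapped bool
}

// Action is what the operator tells a starting pod to do.
type Action string

const (
	ActionBootstrap Action = "bootstrap"
	ActionJoin      Action = "join"
)

// NextAction decides based only on the status it is given.
// With a stale status it will pick the wrong action.
func NextAction(s ClusterStatus) Action {
	if s.Bootstrapped {
		return ActionJoin
	}
	return ActionBootstrap
}

// ReassessStatus rebuilds the status from what is observed right now,
// ignoring whatever was recorded before the operator went down.
func ReassessStatus(pods []PodObservation) ClusterStatus {
	for _, p := range pods {
		if p.Running {
			return ClusterStatus{Bootstrapped: true}
		}
	}
	return ClusterStatus{Bootstrapped: false}
}

func main() {
	// Status recorded before the operator stopped: cluster was bootstrapped.
	stale := ClusterStatus{Bootstrapped: true}

	// Observation after the operator comes back: the single galera pod
	// was restarted in the meantime and is not serving.
	observed := []PodObservation{{Name: "galera-0", Running: false}}

	fmt.Println("with stale status:", NextAction(stale))
	fmt.Println("after reassessment:", NextAction(ReassessStatus(observed)))
}
```

      In this sketch the stale status leads to a "join" decision, which matches the failure described in step 3, while recomputing the status from the observed pods leads to "bootstrap". Running the reassessment once when the operator starts, before it handles any pod event, is the design choice suggested above.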

            rhn-engineering-dciabrin Damien Ciabrini
            rhn-engineering-dciabrin Damien Ciabrini
            rhos-dfg-pidone
