  1. OpenShift Top Level Product Strategy
  2. OCPPLAN-7579

add support for updating etcd-endpoints configmap based on health checks


    • Type: Feature
    • Resolution: Unresolved
    • Service Delivery
    • OSD

      Today cluster-etcd-operator adds every etcd pod to the endpoints list consumed by the apiserver. In some ways this is the desired behavior, since it lets the client balancer handle health. The problem is that in some network partition scenarios the balancer can get stuck on a specific endpoint.

      This happens because the client balancer only checks whether the gRPC connection is Ready, and a Ready connection does not imply that the member has quorum. If the apiserver can still reach its local etcd member while that member is partitioned from its peers, the balancer keeps that endpoint in its round-robin rotation.

      controller: etcd endpoints controller
      resource: etcd-endpoints configmap

      risks: today, any change to this list triggers a new static pod revision for the apiserver, so we need to ensure the list does not flap.

      alternatives: find a way for the etcd client balancer to also take the quorum health of each subconn into account.

              Unassigned Unassigned
              sbatsche@redhat.com Sam Batschelet (Inactive)