Managed Service - Streams / MGDSTRM-10045

Rationalise CPU utilisation in order to allow for strimzipodset enablement


    • Type: Task
    • Resolution: Won't Do
    • Priority: Critical
    • Epic: MGDSRVS-336 - Keep Openshift Streams components up-to-date

      WHAT

      As discussed on MGDSTRM-9976, RHOSAK currently configures the strimzi cluster operator deployment to have 1 replica with 3 cpus assigned to the container.

      When running under StrimziPodSets, the recommendation to achieve high availability is to have at least two replicas.

      Simply increasing the replica count while keeping the current CPU assignment would result in the following CPU consumption, which is excessive:

      2 (strimzi deployments per bundle (old/new)) * 2 (strimzi cluster operator replicas) * 3 cpus = 12 cpus per cluster.
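The arithmetic above can be sketched as follows. The inputs (2 bundles, 2 replicas, 3 CPUs) come from this ticket; the tuned limit of 1 CPU is purely an illustrative assumption, not a tested value:

```python
# Total cluster operator CPU per data-plane cluster:
# bundles (old/new during upgrade) x operator replicas x CPU limit per replica.
def operator_cpu_total(bundles: int, replicas: int, cpu_per_replica: float) -> float:
    return bundles * replicas * cpu_per_replica

# Current limit, scaled to 2 replicas for HA under StrimziPodSets.
current = operator_cpu_total(bundles=2, replicas=2, cpu_per_replica=3)
print(current)  # 12 CPUs per cluster - excessive

# Illustrative tuned limit (1 CPU is an assumption, not a recommendation).
tuned = operator_cpu_total(bundles=2, replicas=2, cpu_per_replica=1)
print(tuned)  # 4 CPUs per cluster
```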

      The current CPU limit dates from Red Hat Summit, when we were testing with very large numbers of Kafka instances. Strimzi has a thundering-herd problem: reconciliations occur in waves, which produces a spiky CPU usage pattern. cpus=3 was chosen to provide sufficient CPU to accommodate the reconciliation spike.

      To allow us to move forward with podsets, we can probably tune down the CPU limits to be commensurate with current demands.
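A tuned limit would be expressed in the cluster operator Deployment spec roughly as below. This is a sketch only: the 1-CPU figure is an illustrative assumption and would need to be validated against the worst-case production workload:

```yaml
# Fragment of the strimzi-cluster-operator Deployment (illustrative values only).
spec:
  replicas: 2            # at least two replicas for HA under StrimziPodSets
  template:
    spec:
      containers:
        - name: strimzi-cluster-operator
          resources:
            requests:
              cpu: "1"   # assumed tuned value, not a measured recommendation
            limits:
              cpu: "1"
```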

      Longer term: let's work to resolve https://github.com/strimzi/strimzi-kafka-operator/issues/7373, which should allow CPU demands to be reduced further (not part of this JIRA).

      WHY

      Enablement of Strimzi PodSets in a manner that achieves high availability.

      HOW

      • Work out what CPU demands will look like under podsets, given the worst case production workload.
      • We should look at both development and standard.

      DONE

            Assignee: Unassigned
            Reporter: Keith Wall (keithbwall)
            Component: Kafka Integrations