  1. AMQ Streams
  2. ENTMQST-1337

StorageDiff should handle scaling nodes up and down

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version: 1.5.0.GA
    • Affects Versions: 1.2.0.GA, 1.3.0.GA, 1.4.0.GA
    • Component: cluster-operator
    • Labels: None

      The StorageDiff class blocks unsupported changes to the storage configuration. It allows only a few changes, such as to the volume and delete claim settings, and eventually adding and removing JBOD volumes. This works well.

      But when users scale the cluster (either Kafka or Zookeeper) up or down, they will logically want to do two different things:

      • add overrides for the newly added brokers / nodes
      • remove the overrides for the brokers which were just scaled down from the cluster

      StorageDiff sees this as a disallowed change and blocks it. The tricky part is that adding or removing overrides for existing nodes should not be allowed, but we should probably allow it for nodes that were just removed or newly added. So we might need to add support that lets users handle this without overriding the annotations that hold the storage configuration.

      However, especially for scale-up, it will probably be non-trivial to determine where and when overrides can still be added and when they no longer can, because we need information about both the current and the desired nodes. Should we add the old number of nodes to the Status?
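
      The rule described above can be sketched roughly as follows. This is only an illustration, not the actual StorageDiff code: the class OverrideChangeCheck, the method isAllowed and the map-based representation of overrides (node ID to storage class) are hypothetical names invented here. It also assumes that node IDs run from 0 to replicas - 1 and that the old replica count is available from somewhere, for example the Status.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch, not the real StorageDiff implementation.
final class OverrideChangeCheck {

    /**
     * Returns true when the overrides changed only for nodes that are being
     * added or removed by the scaling operation.
     *
     * Assumption: node IDs are 0..replicas-1, so the nodes that exist both
     * before and after scaling are 0..min(oldReplicas, newReplicas)-1.
     * Overrides are modelled as node ID -> storage class.
     */
    static boolean isAllowed(int oldReplicas, int newReplicas,
                             Map<Integer, String> oldOverrides,
                             Map<Integer, String> newOverrides) {
        int surviving = Math.min(oldReplicas, newReplicas);
        for (int node = 0; node < surviving; node++) {
            // An existing node must keep exactly the same override (or none).
            if (!Objects.equals(oldOverrides.get(node), newOverrides.get(node))) {
                return false;
            }
        }
        // Differences for node IDs >= surviving concern only added or removed nodes.
        return true;
    }

    public static void main(String[] args) {
        Map<Integer, String> five = Map.of(0, "gp2-a", 1, "gp2-b", 2, "gp2-c", 3, "gp2-a", 4, "gp2-b");
        Map<Integer, String> three = Map.of(0, "gp2-a", 1, "gp2-b", 2, "gp2-c");
        Map<Integer, String> changed = Map.of(0, "gp2-a", 1, "gp2-c", 2, "gp2-c");

        // Scale down from 5 to 3 and drop the overrides for nodes 3 and 4.
        System.out.println(isAllowed(5, 3, five, three));    // true
        // Scale up from 3 to 5 and add overrides for the new nodes 3 and 4.
        System.out.println(isAllowed(3, 5, three, five));    // true
        // Change the override of the existing node 1 while scaling down.
        System.out.println(isAllowed(5, 3, five, changed));  // false
    }
}
```

      The sketch makes the dependency on the old replica count explicit: without oldReplicas there is no way to tell an override for a brand new node from one added later to an existing node.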

      To better understand the scenario, imagine a deployment like this:

      apiVersion: kafka.strimzi.io/v1beta1
      kind: Kafka
      metadata:
        name: my-cluster
      spec:
        kafka:
          version: 2.2.1
          replicas: 3
          listeners:
            plain: {}
            tls: {}
          config:
            offsets.topic.replication.factor: 3
            transaction.state.log.replication.factor: 3
            transaction.state.log.min.isr: 2
            log.message.format.version: "2.2"
          storage:
            type: jbod
            volumes:
            - id: 0
              type: persistent-claim
              size: 100Gi
              deleteClaim: true
              class: gp2
              overrides:
                - broker: 0
                  class: gp2-a
                - broker: 1
                  class: gp2-b
                - broker: 2
                  class: gp2-c
                - broker: 3
                  class: gp2-a
                - broker: 4
                  class: gp2-b
        zookeeper:
          replicas: 3
          storage:
            type: persistent-claim
            size: 100Gi
            deleteClaim: false
            class: gp2
            overrides:
              - broker: 0
                class: gp2-a
              - broker: 1
                class: gp2-b
              - broker: 2
                class: gp2-c
              - broker: 3
                class: gp2-a
              - broker: 4
                class: gp2-b
        entityOperator:
          topicOperator: {}
          userOperator: {}
      

      Deploy it, scale the Zookeeper and Kafka broker nodes down to 3, and remove the overrides for nodes 3 and 4, which will no longer be useful.

              Unassigned Unassigned
              scholzj Jakub Scholz
              Lukas Kral Lukas Kral