  1. AMQ Broker
  2. ENTMQBR-3575

Operator shouldn't remove Drainer pods right after they finish


    • Type: Story
    • Resolution: Unresolved
    • Priority: Major
    • None
    • AMQ 7.7.0.CR1
    • operator
    • None

      Copying discussion with rhn-support-rkieley:

      Mikhail Krutov, Yesterday 7:07 PM:
      Guys, I have a proposal for the Drainer. Reusing this chat since I want to talk to all of you about it.
      Currently, when the Drainer executes, it doesn't leave anything behind: the pod is removed after completion.
      Can we not remove it (or at least add a flag to the CR that would prevent it from being removed)?
      That way it would show up among the "Completed" pods and its logs would stay available.
      Mikhail Krutov, Yesterday 8:49 PM:
      A quick additional note on this: it requires that the Drainer pods get cleared on a scale-up event, because the Drainer pod names are not unique across runs and kube requires names in a given namespace to be unique.
      So example workflow:
      
      deploy(4)
      message(somemessages)
      scaledown(1)
      <- message migration happens, Completed pods exist
      scaleup(2)
      <- prior to actually scaling up, Operator removes all the old Drainer pods, even though we're only scaling up to 2
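
The workflow above can be sketched as a toy in-memory model. This is only an illustration of the proposed behaviour, not the Operator's actual code: the `Cluster` class, its methods, and the `drainer-<ordinal>` naming pattern are all hypothetical, and no Kubernetes API is involved.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of one namespace (hypothetical, not the real Operator)."""
    brokers: int = 0
    # Names of Drainer pods left in "Completed" state (assumed naming pattern).
    completed_drainers: set = field(default_factory=set)

    def deploy(self, size: int) -> None:
        self.brokers = size

    def scale_down(self, size: int) -> None:
        # One Drainer per removed broker; under the proposal it is kept
        # as a Completed pod so its logs remain readable.
        for ordinal in range(size, self.brokers):
            self.completed_drainers.add(f"drainer-{ordinal}")
        self.brokers = size

    def scale_up(self, size: int) -> None:
        # Pod names would be reused by later drains, so all old Drainer
        # pods are removed before the scale-up, whatever the target size.
        self.completed_drainers.clear()
        self.brokers = size


cluster = Cluster()
cluster.deploy(4)
cluster.scale_down(1)
print(sorted(cluster.completed_drainers))  # ['drainer-1', 'drainer-2', 'drainer-3']
cluster.scale_up(2)
print(sorted(cluster.completed_drainers))  # []
```

Clearing every old Drainer pod on scale-up, rather than only the ones whose names would collide, mirrors the note that all old pods are removed even when scaling up to just 2.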
      Today
      Roderick Kieley, 10:52 AM:
      The key requirement here is to be able to access the logs and pod history after the draining occurs, correct?
      Mikhail Krutov, 10:59 AM:
      well, yep pretty much
      Roderick Kieley, 11:04 AM:
      Not sure if there is another way to satisfy that at the moment or not tbh, but just wanted to be sure. I'm not sure we would want to expose something via the CRD for the above but maybe as a part of some debug-level options it could be done - or another method of collecting that info if it presented itself.
      Either way, I think it would be good to log the use case, since given how important it is to ensure that we don't lose messages, customers are going to want this type of functionality as well, i.e. in general we need better reporting and visibility into the success or failure of the drainer.
      One thing that could be a possibility is for the operator to query the drainer as soon as it is 'Ready' and find out how many messages are there, ensure it 'knows' where the messages are going to be migrated, then ensure they got there by checking again in the new home for the right number of messages. We could add a 'verifyMessageMigration' option perhaps or something similar.
      That might require a change so that messages go to addresses that are prefixed or suffixed with a new name and are then subsequently put back where they belong, but we'd know that once we do the development work.
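
A minimal sketch of the count comparison described above, assuming the operator can already obtain per-address message counts from the drainer and from the target broker (how it would query them is not shown). The function name, the dictionary inputs, and the `migrated.` prefix are hypothetical and only illustrate the idea of checking staged addresses.

```python
def verify_message_migration(source_counts, target_counts, prefix="migrated."):
    """Compare counts taken on the drainer with counts on the target.

    source_counts: address -> messages seen on the drainer once 'Ready'.
    target_counts: address -> messages on the target after migration,
                   where migrated messages sit under a prefixed address.
    Returns address -> (expected, found) for every mismatch; an empty
    dict means all counted messages arrived.
    """
    mismatches = {}
    for address, expected in source_counts.items():
        found = target_counts.get(prefix + address, 0)
        if found != expected:
            mismatches[address] = (expected, found)
    return mismatches


before = {"orders": 3, "events": 2}
after = {"migrated.orders": 3, "migrated.events": 1}
print(verify_message_migration(before, after))  # {'events': (2, 1)}
```

Counting under a separate prefixed address is what keeps the comparison meaningful: messages produced or consumed on the regular address during the migration would otherwise change the totals.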
      

            rhn-support-rkieley Roderick Kieley
            mkrutov Mikhail Krutov