AMQ Broker / ENTMQBR-3858

[LTS] Prometheus shows inconsistent figures in master-slave, shared-store configuration

    • Previously, in a live/backup configuration, the backup broker would continue to report metrics after fail-over and fail-back even though it was inactive, because the internal metrics manager did not properly unregister them. This issue is now resolved.
    • Verified in a release
    • Steps to reproduce:

      1. Install AMQ 7.5

      2. Create master and slave instances using shared file store.

      3. Install the Prometheus agent, as described in the Red Hat documentation [1].

      4. Start the master and the slave.

      5. Produce a known number of messages to a particular queue on the master (the slave will not be accessible to clients)

      6. Stop the master.

      7. Examine the /metrics URL on the slave's console. Note that the number of messages on the chosen destination is correct, as the slave is now live and has control of the store

      8. Produce a known number of additional messages to the slave (the master is currently down).

      9. Restart the master. The master becomes live again, and the master and slave are now both running.

      10. Produce more messages to the master.

      11. Look at the /metrics URL for both master and slave. The message counts are different. The slave shows the count it had when it was last live, not the current count. The master should show the correct count.

      12. Note, however, that there is no way to tell, just from Prometheus metrics, which broker's figures are authoritative.

       

      [1] https://access.redhat.com/documentation/en-us/red_hat_amq/7.6/html/managing_amq_broker/prometheus-plugin-managing
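
The comparison in steps 7 and 11 can be scripted. The following is a minimal sketch, not the plugin's own tooling: it parses the text returned by each broker's /metrics URL and extracts the message count for one queue. The metric name `artemis_message_count`, the `queue` label, and the queue name `TEST` are assumptions made for illustration; substitute whatever names the /metrics output actually contains.

```python
import re

# One sample line in the Prometheus text exposition format looks like:
#   metric_name{label="value",...} 12.0
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (name, labels, value) tuples from a /metrics body."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def message_count(text, queue, metric="artemis_message_count"):
    """Return the value of `metric` for `queue`, or None if it is absent.

    The default metric name and the `queue` label are assumptions.
    """
    for name, labels, value in parse_samples(text):
        if name == metric and labels.get("queue") == queue:
            return value
    return None


# Hypothetical bodies after step 10, with 10, 5 and 5 messages produced
# in steps 5, 8 and 10 respectively.
master_body = 'artemis_message_count{address="TEST",queue="TEST"} 20.0\n'
slave_body = 'artemis_message_count{address="TEST",queue="TEST"} 15.0\n'

master_count = message_count(master_body, "TEST")
slave_count = message_count(slave_body, "TEST")
if master_count != slave_count:
    print(f"inconsistent: master={master_count} slave={slave_count}")
```

In a real check the two bodies would be fetched from each broker's /metrics URL, for example with `urllib.request.urlopen`. Note that the script can only report that the figures differ; as step 12 says, nothing in the output identifies which broker is live.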

       


      In a shared-store, master-slave configuration, only the live broker (normally the master) has access to the message store. However, the backup broker (normally the slave) still shows store-related Prometheus metrics such as message counts. These metrics are likely to be incorrect; at best, they show the values that were correct the last time that broker was live.

      In a shared-store scenario, only the live broker will have reliable store-related metrics. The backup broker – whether master or slave – should show zero, or nothing at all.

      To make matters worse, the Prometheus metrics do not seem to provide the administrator with any way to determine that a broker is in the backup role. Consequently, the administrator won't know which broker's metrics to trust.
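
The expected behaviour described above (a backup broker shows zero, or nothing at all, for store-related metrics) can be expressed as a small check against a backup broker's /metrics output. This is a sketch only: the set of metric names treated as store-related is an assumption made for illustration, and the parsing assumes sample lines carry no trailing timestamp.

```python
import re

# Assumed set of store-backed metric names; adjust it to match what the
# plugin actually exposes in a given deployment.
STORE_METRICS = {"artemis_message_count", "artemis_messages_added"}


def stale_store_samples(metrics_text, store_metrics=STORE_METRICS):
    """Return the sample lines for store-related metrics with a non-zero value.

    For a broker in the backup role the expected result is an empty list.
    """
    stale = []
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first '{' or whitespace character.
        name = re.split(r"[{\s]", line, maxsplit=1)[0]
        # The value is the last token (assumes no trailing timestamp).
        value = float(line.rsplit(None, 1)[-1])
        if name in store_metrics and value != 0.0:
            stale.append(line)
    return stale


# Hypothetical output from the slave after fail-back, as in step 11.
backup_body = (
    '# TYPE artemis_message_count gauge\n'
    'artemis_message_count{address="TEST",queue="TEST"} 15.0\n'
    'jvm_threads_live_threads 42.0\n'
)
for sample in stale_store_samples(backup_body):
    print("backup still reports:", sample)
```

The check only makes sense once the operator already knows, by some other means, which broker is the backup; the metrics themselves do not say.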

       

              rhn-support-jbertram Justin Bertram
              dbruscin Domenico Francesco Bruscino
              Tiago Bueno Tiago Bueno