Type: Bug
Resolution: Unresolved
Priority: Undefined
Fix Version/s: None
Affects Version/s: AMQ 7.10.0.GA
Component/s: broker-core, clustering
Labels:
None

Blocked:
False
Blocked Reason:
None
Ready:
False
GSS Priority:
Steps to Reproduce:

1. Configure a 3-node replicated HA cluster using hostnames for connector / acceptor configuration
2. Configure a DNS server with the hostnames of the 3 nodes
3. Edit /etc/resolv.conf on each of the broker nodes to use only the DNS server in (2)
4. Make sure /etc/hosts contains only the default entries and not the hostnames of the broker hosts
5. Start the cluster and inspect the topology to ensure that it correctly reports 6 nodes / 3 members
6. Interrupt the connectivity to the DNS server for some time. I used a script like this one on the DNS server to accomplish this:

    #!/bin/sh
    echo "Sleeping..."
    sleep 10
    ip link set eth0 down
    echo "Killing network..."
    sleep 150
    ip link set eth0 up

7. While the DNS connectivity is down, restart a live and backup node in the cluster
8. Wait for DNS connectivity to be restored
9. Monitor the cluster to see if it is fully restored following the outage

It may take a few tries, but eventually you get to a state where the cluster is missing members, even after DNS connectivity is restored and some time passes.
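
To make the "wait for DNS connectivity to be restored" step less of a guess, a small helper can be run on each broker node to poll until the broker hostnames resolve again. This is only a sketch and not part of the attached reproducer: the hostnames in BROKER_HOSTS are placeholders, the function names are made up for illustration, and it assumes `getent` is available on the broker hosts.

```shell
#!/bin/sh
# Hypothetical broker hostnames (assumption); replace with the names used
# in the connector / acceptor configuration.
BROKER_HOSTS="${BROKER_HOSTS:-broker1.example.com broker2.example.com broker3.example.com}"

# Succeeds if the hostname resolves through the system resolver.
# Assumes getent is installed on the node.
resolve() {
    getent hosts "$1" >/dev/null 2>&1
}

# Print every hostname from the argument list that does not resolve.
unresolved_hosts() {
    for host in "$@"; do
        resolve "$host" || echo "$host"
    done
}

# wait_for_dns MAX_TRIES DELAY_SECONDS HOST...
# Poll until all hosts resolve; return 1 if they still do not after MAX_TRIES.
wait_for_dns() {
    max_tries=$1
    delay=$2
    shift 2
    try=1
    while [ "$try" -le "$max_tries" ]; do
        missing=$(unresolved_hosts "$@")
        if [ -z "$missing" ]; then
            echo "All hosts resolve (attempt $try)"
            return 0
        fi
        echo "Attempt $try: still unresolved:" $missing
        sleep "$delay"
        try=$((try + 1))
    done
    return 1
}
```

For example, `wait_for_dns 40 5 $BROKER_HOSTS` polls every 5 seconds for up to 40 attempts. Once it returns successfully, name resolution is back and any members still missing from the topology can be attributed to the broker rather than to the ongoing outage.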
Intelligence Requested:
Market:

Severity:
Important

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

If a live or backup broker is restarted during a DNS outage, it does not reliably rejoin the cluster, even after DNS service is restored and even with cluster reconnect-attempts set to -1 (unlimited). The topology remains broken until the cluster is restarted after DNS service is restored.

reproducer.zip
2022/12/11 12:17 AM
20 kB
Duane Hawkins

Assignee: Unassigned

Reporter: Duane Hawkins

Votes: 0

Watchers: 1

Created: 2022/12/10 11:56 PM

Updated: 2023/04/04 8:33 AM
