Story
Resolution: Unresolved
Major
rhel-8.0.0
sst_high_availability
ssg_filesystems_storage_and_HA
Description of problem:
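When a ticket is newly created, its <ticket_state> entry is not immediately added to the CIB, and booth cannot grant the ticket. In the demonstration below, the grant succeeds only after a full cluster restart.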
In the test below, test RPMs that fix Bug 1768172 are installed. The issue is also reproducible without that fix in place.
Test environment:
~~~
Cluster 1:
fastvm-rhel-8-0-23
fastvm-rhel-8-0-24
Cluster 2:
fastvm-rhel-8-0-33
fastvm-rhel-8-0-34
Arbitrator:
fastvm-rhel-8-0-52
~~~
Shell function used to sync the booth configuration to the local cluster, the arbitrator, and the second cluster:
~~~
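booth_sync()
{
    # pcs subcommands: push the booth config to the local cluster's nodes,
    # or pull it from another host.
    SYNC="pcs booth sync"
    PULL="pcs booth pull"
    LHOST=fastvm-rhel-8-0-23

    # Distribute the config within cluster 1.
    $SYNC
    # The arbitrator pulls the config from cluster 1.
    ssh fastvm-rhel-8-0-52 "$PULL $LHOST"
    # Cluster 2 pulls the config, then distributes it to its own nodes.
    ssh fastvm-rhel-8-0-33 "$PULL $LHOST && $SYNC"
}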
~~~
Demonstration:
~~~
[root@fastvm-rhel-8-0-23 ~]# booth list
[root@fastvm-rhel-8-0-23 ~]# crm_ticket -l
[root@fastvm-rhel-8-0-23 ~]# pcs booth ticket add apacheticket
[root@fastvm-rhel-8-0-23 ~]# booth_sync
Sending booth configuration to cluster nodes...
fastvm-rhel-8-0-24: Booth config saved.
fastvm-rhel-8-0-23: Booth config saved.
Fetching booth config from node 'fastvm-rhel-8-0-23'...
Warning: Booth configuration file '/etc/booth/booth.conf' already exists
Warning: Booth key file '/etc/booth/booth.key' already exists
Booth config saved.
Fetching booth config from node 'fastvm-rhel-8-0-23'...
Warning: Booth configuration file '/etc/booth/booth.conf' already exists
Warning: Booth key file '/etc/booth/booth.key' already exists
Booth config saved.
Sending booth configuration to cluster nodes...
fastvm-rhel-8-0-34: Booth config saved.
fastvm-rhel-8-0-33: Booth config saved.
[root@fastvm-rhel-8-0-23 ~]# pcs constraint ticket add apacheticket apachegroup
[root@fastvm-rhel-8-0-23 ~]# pcs booth ticket grant apacheticket
Error: unable to grant booth ticket 'apacheticket' for site '192.168.22.71', reason: Nov 11 18:00:44 fastvm-rhel-8-0-23 booth: [26687]: error: ticket "apacheticket" does not exist
[root@fastvm-rhel-8-0-23 ~]# pcs cluster cib | grep ticket
<rsc_ticket ticket="apacheticket" rsc="apachegroup" id="ticket-apacheticket-apachegroup"/>
[root@fastvm-rhel-8-0-23 ~]# crm_ticket -l
apacheticket revoked
[root@fastvm-rhel-8-0-23 ~]# pcs cluster stop --all && pcs cluster start --all
fastvm-rhel-8-0-24: Stopping Cluster (pacemaker)...
fastvm-rhel-8-0-23: Stopping Cluster (pacemaker)...
fastvm-rhel-8-0-24: Stopping Cluster (corosync)...
fastvm-rhel-8-0-23: Stopping Cluster (corosync)...
fastvm-rhel-8-0-24: Starting Cluster...
fastvm-rhel-8-0-23: Starting Cluster...
[root@fastvm-rhel-8-0-23 ~]# pcs cluster cib | grep ticket
<rsc_ticket ticket="apacheticket" rsc="apachegroup" id="ticket-apacheticket-apachegroup"/>
<tickets>
<ticket_state id="apacheticket" granted="false" owner="0" expires="1573524113" term="0"/>
</tickets>
[root@fastvm-rhel-8-0-23 ~]# pcs booth ticket grant apacheticket
~~~
Logs show the following for the successful grant:
~~~
Nov 11 18:09:51 fastvm-rhel-8-0-23 boothd-site[27403]: [info] apacheticket (Init/0/0): granting ticket
Nov 11 18:09:51 fastvm-rhel-8-0-23 boothd-site[27403]: [info] apacheticket (Init/0/0): starting new election (term=0)
Nov 11 18:09:51 fastvm-rhel-8-0-23 booth[30557]: [info] grant request sent, waiting for the result ...
Nov 11 18:09:57 fastvm-rhel-8-0-23 boothd-site[27403]: [info] apacheticket (Cndi/0/0): elections finished
Nov 11 18:09:57 fastvm-rhel-8-0-23 boothd-site[27403]: [info] apacheticket (Lead/0/599999): granted successfully here
Nov 11 18:09:57 fastvm-rhel-8-0-23 crm_ticket[30622]: notice: Invoked: crm_ticket -t apacheticket -g --force -S owner -v1950506022 -S expires -v1573525197 -S term -v0
Nov 11 18:09:57 fastvm-rhel-8-0-23 pacemaker-controld[27164]: notice: State transition S_IDLE -> S_POLICY_ENGINE
Nov 11 18:09:57 fastvm-rhel-8-0-23 booth[30557]: [info] grant succeeded!
~~~
The cluster restart produces the following logs when booth starts. They may be related to whatever caused the <ticket_state id="apacheticket"> element to be written to the CIB.
~~~
Nov 11 18:01:53 fastvm-rhel-8-0-23 boothd-site[27403]: [info] BOOTH site 1.0 (build 1.0) daemon is starting
Nov 11 18:01:53 fastvm-rhel-8-0-23 boothd-site[27403]: [error] cannot change working directory to /var/lib/booth/cores
Nov 11 18:01:53 fastvm-rhel-8-0-23 crm_ticket[27410]: notice: Invoked: crm_ticket -g -t any-ticket-name
Nov 11 18:01:53 fastvm-rhel-8-0-23 crm_ticket[27410]: warning: Ticket modification not allowed
Nov 11 18:01:53 fastvm-rhel-8-0-23 boothd-site[27403]: [info] New "crm_ticket" found, using atomic ticket updates.
Nov 11 18:01:53 fastvm-rhel-8-0-23 crm_ticket[27428]: notice: Invoked: crm_ticket -t apacheticket -q
Nov 11 18:01:53 fastvm-rhel-8-0-23 crm_ticket[27428]: warning: Could not query ticket XML: No such device or address
Nov 11 18:01:53 fastvm-rhel-8-0-23 boothd-site[27403]: [error] apacheticket (Init/0/0): crm_ticket xml output empty
Nov 11 18:01:53 fastvm-rhel-8-0-23 boothd-site[27403]: [warning] apacheticket: no site matches; site got reconfigured?
Nov 11 18:01:53 fastvm-rhel-8-0-23 boothd-site[27403]: [error] command "crm_ticket -t 'apacheticket' -q" exit code 105
Nov 11 18:01:53 fastvm-rhel-8-0-23 boothd-site[27403]: [info] apacheticket (Init/0/0): broadcasting state query
Nov 11 18:01:53 fastvm-rhel-8-0-23 boothd-site[27403]: [info] BOOTH site daemon started, node id is 0x74425C26 (1950506022).
~~~
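The difference between the two `pcs cluster cib | grep ticket` outputs in the demonstration is the presence of a <ticket_state> element for the ticket. A minimal sketch of a check for that element follows. The function name `has_ticket_state` is made up for illustration and is not part of pcs, booth, or pacemaker; it only inspects CIB XML passed on stdin.

```shell
# Hypothetical helper (not part of pcs/booth/pacemaker).
# Reads CIB XML on stdin and succeeds if it contains a <ticket_state>
# element whose id is the ticket name given as $1.
has_ticket_state() {
    # -F: match the ticket name literally; -q: exit status only.
    grep -qF "<ticket_state id=\"$1\""
}
```

Possible usage on a cluster node: `pcs cluster cib | has_ticket_state apacheticket && echo "ticket_state present"`. In the demonstration above, this would be expected to print nothing before the cluster restart and to print the message after it.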
Version-Release number of selected component (if applicable):
booth-site-1.0-5.f2d38ce.git.el8.noarch
booth-core-1.0-5.f2d38ce.git.el8.x86_64
pacemaker-2.0.1-4.el8_0.4.x86_64
How reproducible:
Most or all of the time. I think the grant command has to be run fairly soon after the `pcs booth ticket add` and `pcs constraint ticket add` commands in order to observe the issue.
Steps to Reproduce:
1. Start with a fairly clean pair of clusters: no tickets in /etc/booth/booth.conf, no <rsc_ticket> constraints, and no <tickets> element in the CIB.
2. Create a group of dummy resources called apachegroup.
3. Create a ticket in the booth configuration (`pcs booth ticket add apacheticket`).
4. Sync/pull the updated booth configuration to other cluster node(s), the arbitrator, and the other cluster.
5. Optional: Add a ticket constraint on each cluster (`pcs constraint ticket add apacheticket apachegroup`).
6. Attempt to grant the ticket (`pcs booth ticket grant apacheticket`).
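The steps above can be sketched as a script run on the first node of cluster 1. This is an illustration under assumptions, not the exact commands used in the test: the resource names `dummy1` and `dummy2` and the `run`/`reproduce`/`DRY_RUN` names are invented here, `booth_sync` is the helper function defined earlier in this report, and step 5 is only applied to the local cluster. With `DRY_RUN=1` (the default) the commands are only printed.

```shell
# Sketch of the reproduction steps; run/reproduce/DRY_RUN are illustrative names.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    ticket="${1:-apacheticket}"
    group="${2:-apachegroup}"

    # Step 2: group of dummy resources (resource names are placeholders).
    run pcs resource create dummy1 ocf:heartbeat:Dummy --group "$group"
    run pcs resource create dummy2 ocf:heartbeat:Dummy --group "$group"
    # Step 3: add the ticket to the booth configuration.
    run pcs booth ticket add "$ticket"
    # Step 4: distribute the config (booth_sync must already be defined).
    run booth_sync
    # Step 5 (optional): ticket constraint, local cluster only in this sketch.
    run pcs constraint ticket add "$ticket" "$group"
    # Step 6: attempt the grant soon after the previous steps.
    run pcs booth ticket grant "$ticket"
}
```

Calling `reproduce` prints the six commands for review; `DRY_RUN=0` followed by `reproduce` would execute them, and the final grant is the command expected to fail as shown under "Actual results".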
Actual results:
[root@fastvm-rhel-8-0-23 ~]# pcs booth ticket grant apacheticket
Error: unable to grant booth ticket 'apacheticket' for site '192.168.22.71', reason: Nov 11 18:00:44 fastvm-rhel-8-0-23 booth: [26687]: error: ticket "apacheticket" does not exist
Expected results:
Successful ticket grant
Additional info:
My booth-core and booth-site test RPMs with the fix for Bug 1768172 (test_atomicity failure) do not resolve this issue.