Red Hat OpenStack Services on OpenShift / OSPRH-16304

[FFU 16.2 -> 17.1] "Remove OVNDBs VIP from pacemaker" fails when we try "openstack overcloud upgrade run" multiple times.


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • rhos-17.1.z
    • tripleo-ansible
    • RHOS Upgrades 2025 Sprint 5
    • Important

      To Reproduce

      Steps to reproduce the behavior:

      1. Do FFU 16.2 -> 17.1
      2. Run the following command. On this first attempt the OVNDBs VIP is removed from pacemaker, but the command then fails at a later task.
        $ openstack overcloud upgrade run --yes --stack akb-overcloud --debug --limit allovercloud,undercloud --playbook all 
           :
        ... CHANGED | Remove OVNDBs VIP from pacemaker | ...
           :
        ... FATAL | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | ...
      3. Retry the upgrade. This time it fails at the `Remove OVNDBs VIP from pacemaker` task because the OVNDBs VIP was already removed during the first attempt.
        $ openstack overcloud upgrade run --yes --stack akb-overcloud --debug --limit allovercloud,undercloud --playbook all 
           :
        ... FATAL | Remove OVNDBs VIP from pacemaker | ... | error={"changed": false, "error": "", "msg": "Failed, to set the resource  to the state delete", "output": "\nUsage: pcs resource delete...\n    delete <resource id|group id|bundle id|clone id>\n        Deletes the resource, group, bundle or clone (and all resources within\n        the group/bundle/clone).\n\n", "rc": 1} 
      4. The error occurs in the following tasks. On the second attempt the OVNDBs VIP has already been removed, so "ovn_vip.stdout" is empty after the "Fetch ovn VIP" task. The "Remove OVNDBs VIP from pacemaker" task then fails because an empty resource name is passed to the "pacemaker_resource" module.
                - name: Fetch ovn VIP
                  shell: |
                    pcs constraint list |grep "with ovn-dbs-bundle" |awk '{print $1}'
                  register: ovn_vip
                   :
                - name: Remove OVNDBs VIP from pacemaker
                  when:
                   - step|int == 5
                   - "{{(ovn_dbs_short_bootstrap_node_name|lower == ansible_facts['hostname']|lower)|bool}}"
                  pacemaker_resource:
                    resource: "{{ ovn_vip.stdout }}"
                    state: delete
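The empty-result case described in step 4 can be reproduced without a cluster. The following is a minimal sketch that runs the same `grep`/`awk` pipeline as the "Fetch ovn VIP" task against sample text. The sample constraint lines, the resource name `ip-192.0.2.10`, and the helper name `fetch_ovn_vip` are assumptions made for illustration and were not captured from a real deployment.

```shell
#!/bin/sh
# Extract the VIP resource name the same way the "Fetch ovn VIP" task does.
# Reads `pcs constraint list`-style text on stdin.
fetch_ovn_vip() {
    grep "with ovn-dbs-bundle" | awk '{print $1}'
}

# ILLUSTRATIVE input only: the resource name and line layout are assumptions,
# not output captured from a real cluster.
before='Colocation Constraints:
  ip-192.0.2.10 with ovn-dbs-bundle (score:INFINITY)'

# After the first run removed the VIP, no matching line remains.
after='Colocation Constraints:'

vip_before=$(printf '%s\n' "$before" | fetch_ovn_vip)
vip_after=$(printf '%s\n' "$after" | fetch_ovn_vip)

# Show the delete command each run would effectively build.
echo "first run : pcs resource delete ${vip_before}"
echo "second run: pcs resource delete ${vip_after}"
```

On the second run the resource argument is empty. This is consistent with the `pcs resource delete` usage text in the error and with the doubled space in "Failed, to set the resource  to the state delete".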

      Expected behavior

      • "openstack overcloud upgrade run" succeeds even on the second attempt.

      Bug impact

      • This may affect many customers performing the upgrade, and it blocks the upgrade whenever "openstack overcloud upgrade run" has to be retried.

      Known workaround

      • Not yet verified, but I think the issue can be avoided by manually modifying the THT file as in the following example:
        /usr/share/openstack-tripleo-heat-templates/deployment/ovn/ovn-dbs-cluster-ansible.yaml
        
                - name: Remove OVNDBs VIP from pacemaker
                  when:
                   - step|int == 5
                   - "{{(ovn_dbs_short_bootstrap_node_name|lower == ansible_facts['hostname']|lower)|bool}}"
                   - ovn_vip.stdout | length > 0   # <==== Add this line
                  pacemaker_resource:
                    resource: "{{ ovn_vip.stdout }}"
                    state: delete 
      • Then rerun `openstack overcloud upgrade prepare` and `openstack overcloud upgrade run`.
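The guard added in the workaround makes the delete step safe to repeat: it acts only when a resource name was found. As a rough shell equivalent of that `length > 0` condition (a sketch only, not the fix shipped for this issue), the logic looks like the following. The function name `remove_vip_if_present` and the resource name `ip-192.0.2.10` are hypothetical, and `echo` stands in for the real command so that nothing is executed.

```shell
#!/bin/sh
# Delete the VIP resource only when a name was actually found.
# $1: resource name (may be empty); remaining args: the command used to
# delete it, so a dry run can pass `echo` instead of touching the cluster.
remove_vip_if_present() {
    vip=$1
    shift
    if [ -n "$vip" ]; then
        "$@" "$vip"
    else
        echo "OVNDBs VIP already removed, nothing to do"
    fi
}

# Dry run: `echo` prints the command instead of running it.
first=$(remove_vip_if_present "ip-192.0.2.10" echo pcs resource delete)
second=$(remove_vip_if_present "" echo pcs resource delete)
echo "$first"
echo "$second"
```

With an empty name the function skips the delete instead of calling `pcs resource delete` with no argument, which is the case that fails on the retried upgrade.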

              jbadiapa@redhat.com Juan Payno
              rhn-support-yatanaka Yamato Tanaka
              rhos-dfg-upgrades