Loading...

XML

Word

Printable

Type: Bug
Resolution: Done-Errata
Priority: Blocker
Fix Version/s: rhos-16.2.9
Affects Version/s: rhos-16.2.z
Component/s: tripleo-ansible
Labels:
- Triaged

Story Points:
3
Blocked:
False
Blocked Reason:

Hide

None

Show
None
Ready:
False
Bugzilla Bug:
RHBZ: 2319438
Fixed in Build:
tripleo-ansible-0.8.1-2.20250429125110.123ce73.el8ost
AssignedTeam:
rhos-ops-day1day2-upgrades
Regression:
None
Intelligence Requested:
Market:
Release Blocker:
Approved
Errata Link:
https://errata.engineering.redhat.com/advisory/149919

Sprint:
RHOS Upgrades 2025 Sprint 5
sprint_count:
1
Severity:
Moderate

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Description of problem:
Two different customers recently reported same problem blocking minor upgrade of RHOSP 16.2 deployments: haproxy failed to start after pacemaker service was stopped and started, then unused podman images were removed and haproxy image was removed as well, then pacemaker tried to start haproxy again when another controller node was updated. As a result, some of haproxy services are unable to start because there is no proper image and cluster is unhealthy state:

haproxy-bundle-podman-1_start_0 on controller-0 'error' (1): call=140, status='complete', exitreason='failed to pull image cluster.common.tag/openstack-haproxy:pcmklatest', last-rc-change='2024-08-27 20:17:10Z', queued=0ms, exec=420ms

HAProxy startup failures were different in two cases, so IMO it is right to focus on DF here. But this problem also can be investigated from HAProxy perspective.

Version-Release number of selected component (if applicable): RHOSP 16.2

How reproducible: sporadically reproduced during upgrade

Actual results: upgrade is not interrupted if haproxy fails to start properly, and next sequence of calls breaks haproxy bundle after removing proper image

Expected results: engineering knows better how to handle this

Workaround: to repeat failed (upgrade run) command: it will start problematic containers and proceed with update.

is cloned by

OSPRH-14460 BZ#2319438 HAProxy container failures during minor upgrade procedure is ignored by TripleO, but blocks cluster operations later

Closed

external trackers

Red Hat Customer Portal 03916001

Red Hat Customer Portal 03924418

links to

OSP17.1 trunk patches review

RHBA-2025:149919 Red Hat OpenStack Platform 16.2.9 bug fix advisory

mentioned on

Merge request - [update] Rework image pruning

Merge request - [update][upgrade] Fixup podman image pruning

(2 mentioned on)

Assignee:: Lukas Bezdicka

Reporter:: RH Bugzilla Integration

Contributors:: Juan Payno

QA Contact:: Archana Singh

Team:: rhos-dfg-upgrades

Votes:: 0 Vote for this issue

Watchers:: 11 Start watching this issue

Created:: 2024/10/17 5:22 PM

Updated:: 2025/09/13 5:16 PM

Resolved:: 2025/07/16 1:01 PM

Details

Description

Attachments

Issue Links

Easy Agile Planning Poker

Activity

People

Dates

PagerDuty