SAT-30941: Capsule sync or a Satellite repo sync gets stuck forever after upgrade to 6.16



      Description of problem:
      Under specific conditions, an upgrade to 6.16 can leave the pulp tasking system with a hung "running" task. That task blocks either the whole Capsule sync (if the problem happened on a Capsule) or the sync of a particular repo on the Satellite (if the problem happened on the Satellite).

      Particular scenario:

      1) a pulp task is started before the migration https://github.com/pulp/pulpcore/blob/main/pulpcore/app/migrations/0117_task_unblocked_at.py#L12-L17 is applied
      2) pulp services are stopped but the task remains in state='running' (this is tricky, but it can happen) - if pulp were started again, it would detect the task whose worker is gone and mark it failed
      3) the migration sets `unblocked_at=null` for all existing tasks, including the running one
      4) the upgraded pulp code considers only unblocked tasks (https://github.com/pulp/pulpcore/blob/main/pulpcore/tasking/worker.py#L314) for further execution or cancellation
      5) as an outcome, we have a "running" task with no worker and a nulled `unblocked_at`, but still holding its shared resources - so further tasks needing the same resources hang waiting on it, forever (see the query sketch below this list)
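
      To illustrate step 5: once the UUID of the hung task is known, a query along these lines can list the other tasks stuck waiting on the same resources. This is an untested sketch; it assumes the default pulpcore schema where `core_task.reserved_resources_record` is a text array, and '<hung task UUID>' is a placeholder:

      su - postgres -c "psql pulpcore -c \"
          SELECT w.pulp_id, w.name, w.state, w.pulp_created
            FROM core_task w, core_task h
           WHERE h.pulp_id = '<hung task UUID>'
             AND w.pulp_id <> h.pulp_id
             AND w.state = 'waiting'
             AND w.reserved_resources_record && h.reserved_resources_record;\""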

      When this happens on the Satellite during a repo sync, any further synchronization of that repo is blocked.

      When this happens on a Capsule, any subsequent Capsule sync (attempting to sync the same repo as the hung synchronization) will hang in the initial RefreshRepos step, which in practice means the whole Capsule sync is hung.

      How reproducible:
      (uncertain - one step of the reproducer, leaving a task in state='running' across the pulp shutdown, is hard to trigger deterministically)
       

      Is this issue a regression from an earlier version:
      Probably yes (as an outcome of the upgrade, syncing can end up hung)
       

      Steps to Reproduce:

      1. Have a Capsule 6.15 and invoke a bigger Capsule sync.

      2. When the sync is in progress, upgrade the Capsule to 6.16. There is a chance pulp shutdown will leave a task in `state=running` (this can be tricky to reproduce, but it can happen).

      3. After the upgrade, check if there is a "running" task that has (a combined query sketch follows this list):

      • state='running'
      • empty `worker_id` (no worker is assigned to the running task)
      • empty `unblocked_at` timestamp
      • a start time before the upgrade (if unsure when the upgrade was run, su - postgres -c "psql pulpcore -c \"SELECT applied FROM django_migrations WHERE name = '0117_task_unblocked_at';\"" will show the precise timestamp; the task must have been started prior to this timestamp)
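
      A query like the following should list such tasks in one shot (an untested sketch, assuming the default pulpcore schema with the `core_task` table):

      su - postgres -c "psql pulpcore -c \"
          SELECT pulp_id, name, state, started_at
            FROM core_task
           WHERE state = 'running'
             AND worker_id IS NULL
             AND unblocked_at IS NULL
             AND started_at < (SELECT applied FROM django_migrations
                                WHERE name = '0117_task_unblocked_at');\""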

      4. Try a new Capsule sync that will forcefully sync the same repos (see the example command below).
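
      For example (a sketch; <capsule id> is a placeholder, and the --skip-metadata-check option - which forces a re-sync of already synced repos - may or may not be available depending on the hammer version):

      hammer capsule content synchronize --id <capsule id> --skip-metadata-check true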

      Actual behavior:
      Step 4 gets stuck forever in RefreshRepos; the pulp general update task keeps waiting (on a resource still held by the hung "running" task).

      Expected behavior:
      3. No such hung task exists.
      4. The Capsule sync does not hang.

      Business Impact / Additional info:
      A very specific bug, rare to hit, and a one-time event during the upgrade to 6.16. BUT it comes with bad user experience, it is hard to troubleshoot or identify, and it is a generic pulp bug (not related to Sat/Caps only).

      I am not sure if or how to prevent this - should a migration step be added that cancels running tasks left with an empty `unblocked_at` timestamp? Or is "just" https://access.redhat.com/solutions/7104341 a sufficient reaction? A rough idea of such a cleanup is sketched below.
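
      For illustration only - an untested sketch of what such a cleanup could do (whether as an extra migration step or as a manual workaround); it simply moves the orphaned "running" tasks to a final state so they stop holding resources, and it assumes the default pulpcore schema:

      su - postgres -c "psql pulpcore -c \"
          UPDATE core_task
             SET state = 'canceled', finished_at = NOW()
           WHERE state = 'running'
             AND worker_id IS NULL
             AND unblocked_at IS NULL;\""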
