SAT-9818

Restarting postgres just before task finish causes discrepancy between foreman and dynflow task status - forever

      Description of problem:
      When the postgres service is restarted (e.g. as part of a restart of all services, or alone) just as dynflow is about to complete a task, the task can end up hung in one of a few invalid situations forever.

      "Invalid situation" means e.g.:

      • foreman sees the task as stopped/pending while dynflow sees it as stopped/success
      • or foreman sees the task as running/pending while dynflow sees it as stopped/success

      "Forever" means there is no user action to fix the status, like:

      • a services restart doesn't help
      • a force unlock can move the foreman task from running/pending to stopped/pending, but nothing else

      Also, until a force unlock is done, such a stuck task can keep holding the lock(s) it acquired on its object(s).
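
      To see the discrepancy directly, the two states can be compared in the database. A minimal sketch, assuming the usual foreman-tasks/dynflow layout where foreman_tasks_tasks.external_id holds the dynflow execution plan uuid (verify the table/column names on your version before relying on this):

      -------8<--------------8<--------------8<-------
      # list tasks whose foreman state/result disagrees with dynflow's
      su - postgres -c "psql foreman -c \"
      select t.id, t.label, t.state as foreman_state, t.result as foreman_result,
             ep.state as dynflow_state, ep.result as dynflow_result
      from foreman_tasks_tasks t
      join dynflow_execution_plans ep on ep.uuid::varchar = t.external_id
      where t.state <> ep.state or t.result <> ep.result;\""
      -------8<--------------8<--------------8<-------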

      Version-Release number of selected component (if applicable):
      Satellite 6.10.4

      How reproducible:
      100% within a few attempts

      Steps to Reproduce:
      One particular reproducer is to destroy a CV and, just at the end, restart the postgres service. It can be VERY tricky to hit "the end" by guessing, so the script below instead watches the number of completed pulp tasks: for a CV with one repo, the ContentView::Destroy task triggers one pulp task. So whenever the script detects as many new completed pulp tasks as there are CVs being destroyed, it restarts postgres.

      Script itself:

      -------8<--------------8<--------------8<-------
      #!/bin/bash
      # Usage: ./create_delete_cv_restart_postgres.sh [CONCURRENCY] [REPOIDS]
      CONCUR=${1:-5}
      REPOIDS=${2:-51}
      hmr="hammer shell"

      # create a CV, publish it, and delete everything but the CV record itself
      prepare_cv_to_delete() {
          CVID=$1
          ( echo "content-view create --organization-id=1 --name cv_zoos_${CVID} --repository-ids ${REPOIDS}"
            echo "content-view publish --organization-id=1 --name cv_zoos_${CVID}"
            echo "content-view remove-from-environment --organization-id=1 --name=cv_zoos_${CVID} --lifecycle-environment-id=1"
            echo "content-view version delete --content-view=cv_zoos_${CVID} --version 1.0 --organization-id 1"
          ) | $hmr
      }

      for i in $(seq 1 $CONCUR); do
          prepare_cv_to_delete $i &
      done

      echo "waiting for CVs create+almost-delete"
      time wait

      for i in $(seq 1 $CONCUR); do
          hammer content-view delete --name=cv_zoos_${i} --organization-id 1 &
      done

      echo "$(date): waiting for CVs delete"
      tasks=$(su - postgres -c "psql pulpcore -c \"copy (select count(*) from core_task) to stdout;\"")
      echo "$(date): waiting for CVs delete, pulp tasks=${tasks}"
      expected=$((tasks+CONCUR))
      tasks=0
      # poll until CONCUR new pulp tasks exist, i.e. the deletes are about to finish
      while [ $tasks -lt $expected ]; do
          tasks=$(su - postgres -c "psql pulpcore -c \"copy (select count(*) from core_task) to stdout;\"")
          sleep 0.5
      done
      #su - postgres -c "psql pulpcore -c \"select count(*) from core_task;\""
      echo "$(date): restarting postgres as having tasks=${tasks}"
      systemctl restart rh-postgresql12-postgresql.service
      date
      time wait
      su - postgres -c "psql pulpcore -c \"select count(*) from core_task;\""
      -------8<--------------8<--------------8<-------

      Usage:

      ./create_delete_cv_restart_postgres.sh 5 REPOID

      where REPOID is the id of a small repo
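
      A small custom repository keeps the pulp tasks short enough for the timing trick to work. To pick a candidate id (plain hammer listing; adjust the organization id as needed):

      -------8<--------------8<--------------8<-------
      # pick the id of a small repo from the listing
      hammer repository list --organization-id 1
      -------8<--------------8<--------------8<-------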

      Actual results:
      Random tasks get stuck forever, sometimes still holding their acquired locks.

      As an example, see the attached task export.
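
      A comparable export can be generated with the stock foreman-tasks rake task; a sketch (the TASK_SEARCH filter shown is an assumption, widen or narrow it as needed):

      -------8<--------------8<--------------8<-------
      # export the affected tasks for inspection; the path of the resulting tarball is printed
      foreman-rake foreman_tasks:export_tasks TASK_SEARCH='label = Actions::Katello::ContentView::Destroy'
      -------8<--------------8<--------------8<-------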

      Expected results:
      No tasks stuck forever; tasks should be recoverable by a services restart or by a manual (Skip &) Resume.
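
      For reference, the recovery attempts listed above look like this (a sketch; per this report neither changes the stuck status, and force unlock is only offered on the task page in the web UI):

      -------8<--------------8<--------------8<-------
      # full services restart - does not help
      satellite-maintain service restart
      # resume paused tasks - does not pick up the stuck ones
      hammer task resume
      -------8<--------------8<--------------8<-------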

      Additional info:
