Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: 6.18.0
Affects Version/s: 6.18.0
Component/s: Performance, Registration, RH Cloud
Labels:
- Satellite
- bug
- performance
- team-triaged
- triaged

Blocked:
False
Severity:
Important
AssignedTeam:
sat-proton

Release Note Type:
None
Release Note Text:
None
Release Note Status:
None

PX Impact Score:
SFDC Cases Links:
SFDC Cases Counter:
SFDC Cases Open:

Test Coverage:
None
Regression:
Yes

Market:

Description of problem:

The "Incremental Registrations" and "Remote Execution (ReX)" tests now take noticeably longer than before, and their failure rate has regressed significantly.

How reproducible:

Always.

Is this issue a regression from an earlier version:

Yes.

Steps to Reproduce:

Run concurrent registrations and remote execution (ReX) jobs.

Actual behavior:

During routine checks on CPT, we noticed irregularities in the test results. Yesterday we started an investigation and identified several issues by comparing two test streams: the current one (Stream 116, running 2025-07-25) and the previous one (Stream 112, running 2025-07-14).

We noticed that the job that normally takes 15 hours now takes 21 hours. The slowdown comes from the sections that perform concurrent registrations (6 -> 7 hours) and that measure remote executions (3 -> 9 hours). The failure rate also regressed. Please see the attached graphs or the spreadsheet linked below.
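To put the durations above in relative terms, here is a minimal sketch that converts each before/after pair into a percentage change. The helper name `pct_change` is made up for illustration and is not part of the CPT tooling. The inputs are the rounded hour values quoted in this report.

```shell
# pct_change OLD NEW
# Print the relative change from OLD to NEW as a percentage (one decimal).
# Hypothetical helper for illustration only.
pct_change() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

pct_change 15 21   # whole job:                 40.0
pct_change 6 7     # concurrent registrations:  16.7
pct_change 3 9     # remote execution:          200.0
```

The two section increases (1 hour and 6 hours) add up to 7 hours against an overall increase of 6 hours, so the quoted figures are presumably rounded.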

The gap between runs was reported in Slack and fixed in this PR.

Stream 112 run - https://jenkins-csb-perf-master.dno.corp.redhat.com/job/ContPerfStreamEL9/145/console

Stream 116 run - https://jenkins-csb-perf-master.dno.corp.redhat.com/job/ContPerfStreamEL9/153/console

We compared all the available log files and created a sheet with a detailed comparison here.

These errors might be related:

$ grep ' \[[EW]|' production-0300.log | grep -v -e 'You are trying to replace' -e 'ignoring associations organization_ids, location_ids audit definition for' -e 'No SSL cert with CN supplied - request from' -e 'Could not find a provider for' -e 'Received .* event from Candlepin. Handling of this event is no longer supported.' -e 'Polling failed, attempt' -e 'Process exited with an unknown status: pid .* exit 22' -e 'No such file or directory @ rb_file_s_rename'[...]
2025-07-29T03:22:23 [E|app|1a39a8ab] Fact insights_client::hostname could not be imported because of PG::InFailedSqlTransaction: ERROR: current transaction is aborted, commands ignored until end of transaction block
2025-07-29T03:22:23 [E|app|1a39a8ab] Fact insights_client::obfuscate_ipv4_enabled could not be imported because of PG::InFailedSqlTransaction: ERROR: current transaction is aborted, commands ignored until end of transaction block
[...]
2025-07-29T03:32:24 [E|app|01f41379] RestClient::Gone: Katello::Resources::Candlepin::Consumer: 410 Gone
{"displayMessage":"Unit 98626cc2-a851-4b93-a3ee-2a8e06de1175 has been deleted","requestUuid":"5f2dec16-9588-47d5-b2e1-6e4dc4b7d129","deletedId":"98626cc2-a851-4b93-a3ee-2a8e06de1175"}
(PUT /candlepin/consumers/98626cc2-a851-4b93-a3ee-2a8e06de1175)
2025-07-29T03:32:24 [E|app|01f41379] /usr/share/gems/gems/katello-4.18.0.pre.master/app/controllers/katello/api/rhsm/candlepin_proxies_controller.rb:227:in `block in consumer_destroy'
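Several of the error lines above share the same bracketed tag (for example `1a39a8ab`), so counting error lines per tag may help separate a few failing requests from many. The following is a rough sketch, assuming the third field inside the brackets is a per-request identifier. The function name `count_errors_by_request` is hypothetical, and the sample input reuses lines from the excerpt above rather than a full log.

```shell
# count_errors_by_request LOGFILE
# Print "<count> <id>" for each bracketed ID that logged at least one
# error-level ([E|...]) line, most frequent first.
# Assumption: the third field in "[E|app|<id>]" identifies one request.
count_errors_by_request() {
    sed -n -E 's/^[^ ]+ \[E\|[^|]*\|([0-9a-f]+)\].*/\1/p' "$1" |
        sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# Demo on three lines copied from the excerpt in this report.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2025-07-29T03:22:23 [E|app|1a39a8ab] Fact insights_client::hostname could not be imported because of PG::InFailedSqlTransaction: ERROR: current transaction is aborted, commands ignored until end of transaction block
2025-07-29T03:22:23 [E|app|1a39a8ab] Fact insights_client::obfuscate_ipv4_enabled could not be imported because of PG::InFailedSqlTransaction: ERROR: current transaction is aborted, commands ignored until end of transaction block
2025-07-29T03:32:24 [E|app|01f41379] RestClient::Gone: Katello::Resources::Candlepin::Consumer: 410 Gone
EOF
count_errors_by_request "$sample"
rm -f "$sample"
```

On the full `production-0300.log`, the same function could be run on both streams and the outputs compared. Continuation lines without a timestamp and tag, such as the JSON body or the `(PUT ...)` line, are ignored.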