
Type: Bug
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: 6.16.5
Component/s: Performance, Pulp
Labels:
- triaged

Blocked:
False
AssignedTeam:
sat-artemis

Release Note Type:
None
Release Note Text:
None
Release Note Status:
None

PX Impact Score:
PX Review Complete:
SFDC Cases Links:
SFDC Cases Counter:
SFDC Cases Open:

Test Coverage:
None
Regression:
Yes

Market:

Description of problem:
There is an evident performance degradation when a client registered to Satellite 6.16 (or 6.17) runs "dnf reposync" or a similar operation that fetches thousands of packages.

The issue happens when serving locally stored packages (i.e. even with Immediate download policy).

The regression happens between 6.15 and 6.16, while 6.17 performance is similar to 6.16.

To find where the delay comes from, I measured the response times of both httpd and the `stream_content` method (https://github.com/pulp/pulpcore/blob/main/pulpcore/content/handler.py#L279), as described below.

When I did a dirty copy of `handler.py` from 6.15 to 6.16 and 6.17, performance improved a lot, almost to the 6.15 values. This indicates that the changes in handler.py cause the regression.

How reproducible:
100% with proper testing

Is this issue a regression from an earlier version:
yes, between 6.15 and 6.16

Steps to Reproduce:

0) Use three Satellites, 6.15 (on RHEL8), 6.16 (on RHEL9) and 6.17 (on RHEL9), and three identical clients (RHEL9). The Satellites used `large` tuning, otherwise a default install.

1) In `/etc/httpd/conf/httpd.conf`, append `%D` to the `LogFormat` values to log the response time in microseconds (and restart the httpd service).

2) In `/usr/lib/python3.11/site-packages/pulpcore/content/handler.py`, apply patch:

--- /usr/lib/python3.11/site-packages/pulpcore/content/handler.py.orig	2025-09-08 08:48:59.369427549 +0200
+++ /usr/lib/python3.11/site-packages/pulpcore/content/handler.py.617debugs	2025-09-09 08:31:50.713746381 +0200
@@ -1,3 +1,4 @@
+import time
 import asyncio
 import logging
 from multidict import CIMultiDict
@@ -286,8 +287,12 @@ class Handler:
             [aiohttp.web.StreamResponse][] or [aiohttp.web.FileResponse][]: The response
                 back to the client.
         """
+        start_time = time.time()
         path = request.match_info["path"]
-        return await self._match_and_stream(path, request)
+        ret = await self._match_and_stream(path, request)
+        end_time = time.time()
+        log.info(f"PavelM: stream_content of {path=} in time {end_time-start_time}")
+        return ret
 
     @staticmethod
     def _base_paths(path):

to measure response time of pulp (and restart pulpcore-content service).

3) To ensure `redis` does not cache anything, stop the service. (It should not cache many packages anyway, since dnf reposync fetches the whole repo content, which is far larger than the redis cache, but this keeps the tests from being biased by redis.)

4) Run either `dnf reposync`, or the same in sequential mode (`echo max_parallel_downloads=1 >> /etc/dnf/dnf.conf`), or iterate `curl` to fetch each and every package of a repo (generate the list of packages from dnf metadata and get SSL certs from redhat.repo). In either case (dnf / curl), results were similar.
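For the `curl` variant of step 4, the package list can be derived from the repo's `primary.xml` metadata. The following is only a minimal sketch of that idea, not the script used for the measurements: the function name `package_urls` and the base URL are hypothetical, and it assumes `primary.xml` has already been downloaded and decompressed.

```python
import xml.etree.ElementTree as ET

# Namespace of the "common" schema used by primary.xml in dnf/yum repodata.
COMMON_NS = {"c": "http://linux.duke.edu/metadata/common"}


def package_urls(primary_xml: str, base_url: str) -> list[str]:
    """Return one download URL per <package> entry in primary.xml.

    base_url is the repository base URL (the baseurl of the repo in
    redhat.repo); the value used below is a made-up placeholder.
    """
    root = ET.fromstring(primary_xml)
    base = base_url.rstrip("/") + "/"
    return [
        base + location.get("href")
        for location in root.findall("c:package/c:location", COMMON_NS)
    ]


# Tiny hand-written sample in the primary.xml layout (not real repo data).
SAMPLE = """
<metadata xmlns="http://linux.duke.edu/metadata/common" packages="2">
  <package type="rpm">
    <name>a</name>
    <location href="Packages/a/a-1.0-1.el9.x86_64.rpm"/>
  </package>
  <package type="rpm">
    <name>b</name>
    <location href="Packages/b/b-2.0-1.el9.noarch.rpm"/>
  </package>
</metadata>
"""

urls = package_urls(SAMPLE, "https://satellite.example.com/hypothetical/repo")
for url in urls:
    print(url)
```

Each printed URL can then be passed to `curl` together with the `--cert`, `--key` and `--cacert` options pointing at the certificate files referenced in redhat.repo.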

5) Check these three values:

sum of pulp response times, as logged by the patched handler.py
sum of httpd response times, as logged in the httpd access logs
runtime of the client (dnf or bash script running curl commands)
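The two sums in step 5 can be computed with a short script. This is a rough sketch under stated assumptions, not the tooling used for the numbers in this report: it assumes the pulp log lines look exactly like the `log.info` call in the patch above, that `%D` was appended as the last field of the access log line, and that content requests can be recognised by `/pulp/content/` in the request path. The function names are made up for illustration.

```python
import re

# Matches the line written by the patched stream_content, for example:
#   "... PavelM: stream_content of path='x/y.rpm' in time 0.0123"
PULP_RE = re.compile(r"stream_content of path=.* in time ([0-9.eE+-]+)\s*$")


def sum_pulp_seconds(lines):
    """Sum the durations (seconds) logged by the patched handler.py."""
    total = 0.0
    for line in lines:
        match = PULP_RE.search(line)
        if match:
            total += float(match.group(1))
    return total


def sum_httpd_seconds(lines, path_filter="/pulp/content/"):
    """Sum %D values (microseconds) of matching requests, in seconds.

    Assumes %D is the LAST whitespace-separated field of each line and
    that path_filter identifies the content requests; both are assumptions.
    """
    total_us = 0
    for line in lines:
        if path_filter not in line:
            continue
        last_field = line.rsplit(None, 1)[-1]
        if last_field.isdigit():
            total_us += int(last_field)
    return total_us / 1_000_000


# Hand-written sample lines, not taken from a real system.
pulp_log = [
    "pulpcore-content: PavelM: stream_content of path='repo/a.rpm' in time 0.25",
    "pulpcore-content: some unrelated message",
    "pulpcore-content: PavelM: stream_content of path='repo/b.rpm' in time 0.5",
]
httpd_log = [
    '10.0.0.1 - - [09/Sep/2025:08:40:00 +0200] "GET /pulp/content/repo/a.rpm HTTP/1.1" 200 1234 "-" "curl" 1500000',
    '10.0.0.1 - - [09/Sep/2025:08:40:01 +0200] "GET /rhsm/status HTTP/1.1" 200 50 "-" "curl" 900000',
    '10.0.0.1 - - [09/Sep/2025:08:40:02 +0200] "GET /pulp/content/repo/b.rpm HTTP/1.1" 200 4321 "-" "curl" 500000',
]

print(sum_pulp_seconds(pulp_log), sum_httpd_seconds(httpd_log))
```

On a real system the lines would come from the journal of the pulpcore-content service and from the httpd access log instead of the inline samples.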

Actual behavior:
TL;DR: 6.16 is roughly 20% slower than 6.15, and 6.17 is a few percent slower still, depending on which metric you focus on. After copying handler.py from pulp in 6.15 to 6.16 and 6.17, most of the difference is gone.

I ran each scenario 5 times and report only the median value of each. In each scenario, I calculated the sum of response times in seconds. The RHEL9 BaseOS repo was used here; a similar outcome was seen for the RHEL8 BaseOS repo (and almost surely exists for any other repo).

| Scenario | 6.15 pulp | 6.15 httpd | 6.15 client | 6.16 pulp | 6.16 httpd | 6.16 client | 6.17 pulp | 6.17 httpd | 6.17 client |
|---|---|---|---|---|---|---|---|---|---|
| default scenario, real values (s) | 146.71 | 562.90 | 837.00 | 212.62 | 641.56 | 940.00 | 222.87 | 673.07 | 962.00 |
| default scenario, values relative to 6.15 | 1.00 | 1.00 | 1.00 | 1.45 | 1.14 | 1.12 | 1.52 | 1.20 | 1.15 |
| handler.py from 6.15, real values (s) | 145.08 | 558.30 | 833.00 | 159.28 | 580.04 | 838.00 | 160.29 | 565.40 | 846.00 |
| handler.py from 6.15, values relative to 6.15 | 1.00 | 1.00 | 1.00 | 1.10 | 1.04 | 1.01 | 1.10 | 1.01 | 1.02 |
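The relative rows are each real value divided by the corresponding 6.15 value in the same scenario. A small sketch that recomputes them from the real values reported above (the dictionary layout and the function name `relative_to_615` are only illustrative):

```python
# Real values in seconds, copied from the results above.
# Each tuple is (pulp, httpd, client).
REAL = {
    "default": {
        "6.15": (146.71, 562.90, 837.00),
        "6.16": (212.62, 641.56, 940.00),
        "6.17": (222.87, 673.07, 962.00),
    },
    "handler.py from 6.15": {
        "6.15": (145.08, 558.30, 833.00),
        "6.16": (159.28, 580.04, 838.00),
        "6.17": (160.29, 565.40, 846.00),
    },
}


def relative_to_615(scenario):
    """Divide every metric by the 6.15 value of the same scenario."""
    baseline = scenario["6.15"]
    return {
        version: tuple(round(v / b, 2) for v, b in zip(values, baseline))
        for version, values in scenario.items()
    }


for name, scenario in REAL.items():
    print(name, relative_to_615(scenario))
```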

Expected behavior:
No regression of this size, or an explanation of which newly implemented pulp feature justifies the performance degradation (as new features usually do).

Business Impact / Additional info:
The `dnf reposync` use case is not very common. In the most frequently seen use cases, like "install those five packages on 100 clients", there is no performance regression. Only one possibly widespread use case could be harmed: sync of a Capsule with the Immediate download policy (quite common among customers). I have not tested this at all, though; this is just speculation.
