- Bug
- Resolution: Done
- Major
- 1.6.0
- 1
- False
- False
- Bug Fix
- Done
RHDH Install 3276
Description of problem:
If we create a Horizontal Pod Autoscaler (HPA) resource to scale the RHDH pods based on application usage, the RHDH Operator always reverts the number of replicas back to 1 (the default number of replicas).
This issue has been reported in RHIDP-4089 as well.
Prerequisites (if any, like setup, operators/versions):
- RHDH Operator 1.6.1 (or from the main branch of the rhdh-operator repo)
- Tested on a ROSA (OCP 4.18) cluster and on a local Kind cluster with a metrics server installed.
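On a Kind cluster, a metrics server is not installed by default. A common way to set one up (sketch only; the upstream manifest URL and the `--kubelet-insecure-tls` flag are the metrics-server project's documented workaround for Kind's self-signed kubelet certificates, not something specific to this ticket):

```
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl -n kube-system patch deployment metrics-server --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'
```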
Steps to Reproduce
- Install the RHDH Operator 1.6.1 (also tested with `make deploy` from the rhdh-operator repo main branch)
- Create a very simple Backstage CR, like so:
cat << EOF | oc apply -f -
apiVersion: rhdh.redhat.com/v1alpha3
kind: Backstage
metadata:
  name: bs1
EOF
- Wait until the RHDH pods are fully up and running
- Create an HPA resource tied to the RHDH Deployment, either declaratively or imperatively with a command like this:
oc autoscale deployment backstage-bs1 \
  --cpu-percent=50 \
  --min=1 \
  --max=3
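The declarative equivalent of the command above, assuming the standard `autoscaling/v2` API (this manifest is an illustration, not taken from the ticket's attachments):

```
cat << EOF | oc apply -f -
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: backstage-bs1
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backstage-bs1
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF
```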
- Check that there is an HPA resource created and that CPU usage is being tracked:
$ oc get hpa
NAME            REFERENCE                  TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
backstage-bs1   Deployment/backstage-bs1   cpu: 5%/50%   1         3         1          19s
- Generate some high CPU load on the RHDH pod with this command as an example:
oc exec -it deploy/backstage-bs1 -- /bin/sh -c "openssl speed -multi $(nproc --all)"
- In a separate tab, watch the HPA and notice the CPU usage increasing:
$ oc get hpa
NAME            REFERENCE                  TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
backstage-bs1   Deployment/backstage-bs1   cpu: 399%/50%   1         3         1          4m36s
- Describe the HPA and notice that it tried to scale up based on CPU usage:
$ oc describe hpa backstage-bs1
Name:                                                  backstage-bs1
Namespace:                                             my-ns
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Mon, 16 Jun 2025 17:56:29 +0200
Reference:                                             Deployment/backstage-bs1
Metrics:                                               ( current / target )
  resource cpu on pods (as a percentage of request):   399% (998m) / 50%
Min replicas:                                          1
Max replicas:                                          3
Deployment pods:                                       1 current / 3 desired
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    SucceededRescale  the HPA controller was able to update the target scale to 3
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  True    TooManyReplicas   the desired replica count is more than the maximum replica count
Events:
  Type    Reason             Age                   From                       Message
  ----    ------             ----                  ----                       -------
  Normal  SuccessfulRescale  13s (x12 over 2m58s)  horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target
- Check the RHDH Deployment and notice that it was scaled up (via the HPA), then scaled down (by the Operator):
$ oc describe deployment backstage-bs1
[...]
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Progressing    True    NewReplicaSetAvailable
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  <none>
NewReplicaSet:   backstage-bs1-65999bf47b (1/1 replicas created)
Events:
  Type    Reason             Age                  From                   Message
  ----    ------             ----                 ----                   -------
  Normal  ScalingReplicaSet  10m                  deployment-controller  Scaled up replica set backstage-bs1-65999bf47b to 1
  Normal  ScalingReplicaSet  3s (x10 over 2m18s)  deployment-controller  Scaled up replica set backstage-bs1-65999bf47b to 3 from 1
  Normal  ScalingReplicaSet  2s (x10 over 2m18s)  deployment-controller  Scaled down replica set backstage-bs1-65999bf47b to 1 from 3
- The Operator logs confirm the behavior seen here:
[...]
2025-06-16T16:04:45Z DEBUG enqueuing reconcile on Deployment change {"Deployment": "backstage-bs1", "namespace: ": "my-ns"}
2025-06-16T16:04:45Z DEBUG apply object {"controller": "backstage", "controllerGroup": "rhdh.redhat.com", "controllerKind": "Backstage", "Backstage": {"name":"bs1","namespace":"my-ns"}, "namespace": "my-ns", "name": "bs1", "reconcileID": "7a84983a-566b-4e76-8b0c-9c4d4fc9a91e", "/v1, Kind=ConfigMap": "backstage-appconfig-bs1"}
[...]
Actual results:
The Deployment is automatically scaled up by the HPA based on application usage, but then reverted back to 1 replica by the Operator.
Expected results:
The Operator should respect the autoscaling constraints defined by the HPA attached to the RHDH Deployment.
This would help users adapt their RHDH instance to their usage - see RHIDP-4089
Reproducibility (Always/Intermittent/Only Once):
Always
Build Details:
Additional info (Such as Logs, Screenshots, etc):
Operator Logs attached.
- relates to: RHIDP-7818 [Operator] Add support for HPA (Horizontal Pod Autoscaler) (Closed)