- Story
- Resolution: Done
- Normal
- None
User Story
As a developer I want to add auto scaling to my application so that it can better respond to load changes.
While Kubernetes (k8s) offers the Horizontal Pod Autoscaler (HPA), it is limited in practical usage, and many applications need to scale their pods on more factors than HPA provides. Keda is an operator that addresses this by allowing users to scale their applications/deployments based on a large number of different metrics and factors. Clowder has recently received support for Keda, allowing any Clowderised app to use certain scaling metrics easily and efficiently. The operator has been installed in ephemeral for users to test with, but it will need to be installed on Stage/Prod to be fully usable.
An example of the Spec change to Clowder is as follows:
apiVersion: cloud.redhat.com/v1alpha1
kind: ClowdApp
metadata:
  name: hello
  namespace: default
spec:
  # The name of the ClowdEnvironment providing the services
  envName: env-default

  # The bulk of your App. This is where your running apps will live
  deployments:
    - name: app
      # Give details about your running pod
      podSpec:
        image: quay.io/psav/clowder-hello
      autoScaler:
        maxReplicaCount: 10
        triggers:
          - type: cpu
            metadata:
              type: Utilization
              value: "50"
      # Creates a Service on port 8000
      webServices:
        public:
          enabled: true
        metrics:
          enabled: true

  # Request kafka topics for your application here
  kafkaTopics:
    - replicas: 3
      partitions: 64
      topicName: topicOne

  # Creates a database if local mode, or uses RDS in production
  database:
    # Must specify both a name and a major postgres version
    name: jumpstart-db
    version: 12
In this example, Clowder will create and configure the keda.ScaledObject resource, which the Keda operator will pick up and use to configure autoscaling as necessary.
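For orientation, the following is a rough sketch of the kind of ScaledObject that could result from the autoScaler block above. It is not taken from Clowder's output: the resource name and the target Deployment name (hello-app) are assumptions, and the fields Clowder actually sets may differ. The trigger is copied from the ClowdApp spec as written.

```yaml
# Illustrative sketch only; names marked "assumed" are not from the source.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: hello-app        # assumed name for the generated resource
  namespace: default
spec:
  scaleTargetRef:
    name: hello-app      # assumed name of the Deployment created for "app"
  maxReplicaCount: 10    # carried over from autoScaler.maxReplicaCount
  triggers:
    - type: cpu
      metadata:
        type: Utilization
        value: "50"      # target average CPU utilization, in percent
```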
Regarding mitigation, if there were ever an issue with Keda and we needed to remove it from use, we could do this in two ways:
- Remove the Keda declaration from the ClowdApp's deployment, reverting it back to normal.
- Create a "none" mode for the Keda provider in Clowder, which would render it inert and pass through the nominal minReplicas values as it does today. This would mean a change to the deployment, but should not cause outages. This feature would be rather simple to add to Clowder.
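As a sketch of the first option, the deployment from the example above could look like this once the autoScaler block is removed. The minReplicas field and its value of 1 are assumptions used for illustration; use whatever fixed replica setting the app had before Keda was enabled.

```yaml
# Illustrative sketch: the "app" deployment with the Keda declaration removed.
spec:
  envName: env-default
  deployments:
    - name: app
      minReplicas: 1     # assumed field and value; fixed replica count, no autoscaling
      podSpec:
        image: quay.io/psav/clowder-hello
      # autoScaler block removed, so no ScaledObject is requested
      webServices:
        public:
          enabled: true
        metrics:
          enabled: true
```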
Acceptance Criteria
- All the things that have to be done for the feature to be ready to release.
Default Done Criteria
- All existing/affected SOPs have been updated.
- New SOPs have been written.
- Internal training has been developed and delivered.
- The feature has both unit and end to end tests passing in all test pipelines and through upgrades.
- If the feature requires QE involvement, QE has signed off.
- The feature exposes metrics necessary to manage it (VALET/RED).
- The feature has had a security review.
- Contract impact assessment.
- Service Definition is updated if needed.
- Documentation is complete.
- Product Manager signed off on staging/beta implementation.
Dates
Integration Testing:
Beta:
GA:
Current Status
GREEN | YELLOW | RED
GREEN = On track, minimal risk to target date.
YELLOW = Moderate risk to target date.
RED = High risk to target date, or blocked and need to highlight potential risk to stakeholders.
References
Links to Gdocs, github, and any other relevant information about this epic.
- blocks
- RHCLOUD-28890 Introduce auto-scaling for notifications-engine
- Backlog
- is blocked by
- PODAUTO-20 Enable installing CMA on OSD/ROSA using the any( non openshift-*) namespace
- Closed