As a consumer of ROSA STS and future Hypershift (STS by default), I want to use the same operators that are available to OSD or OCP, but with STS enabled for the operators.
As a service provider offering ROSA STS, I want to offer customers all the operators that facilitate key features such as CloudWatch forwarding, OADP, and others. Basic features of the platform are provided via Operators that do not yet have STS modes built in, which leaves the platform in a less-than-ideal functional state.
Approach options may still need discussion. One possibility: operators register a list of the roles/policies they require to run in AWS with STS. We retrieve that list and set the roles/policies up for users via the ROSA CLI (and also via the OCM UI), as well as on the cluster, so that the operator can then hook into the provided roles/policies.
This involves preparing IAM/STS roles for customers that have the permissions for the operator they wish to use.
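As a rough illustration of what preparing such a role involves, the sketch below builds the IAM trust policy that would let a single operator service account assume a role through the cluster's OIDC provider. The function name and all argument values are hypothetical; only the policy document shape (`sts:AssumeRoleWithWebIdentity` with a federated OIDC principal) follows standard AWS IAM conventions.

```python
import json


def build_trust_policy(account_id: str, oidc_endpoint: str,
                       namespace: str, service_account: str) -> dict:
    """Return a trust policy allowing one service account to assume a role.

    Hypothetical helper for illustration; not part of the ROSA CLI.
    """
    # IAM refers to the OIDC provider without the https:// scheme.
    issuer = oidc_endpoint
    if issuer.startswith("https://"):
        issuer = issuer[len("https://"):]
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{issuer}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    # Restrict the role to exactly one service account.
                    f"{issuer}:sub":
                        f"system:serviceaccount:{namespace}:{service_account}"
                }
            },
        }],
    }


if __name__ == "__main__":
    # Placeholder account ID, endpoint, namespace, and service account.
    policy = build_trust_policy("123456789012", "https://oidc.example.com/abc",
                                "example-namespace", "example-operator")
    print(json.dumps(policy, indent=2))
```

The permissions policy attached to the role would still come from the operator itself; this sketch covers only the trust relationship.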
BU Proposed approach:
The workflow of providing secrets for operators is made available and mostly the same for both Red Hat Operators and Community Operators.
A user experience would be...
- rosa CLI:
- OCP operators all provide a necessary metadata package (consists of the role, policy, secret structure, etc) that is stored in OCM/CS
- The ROSA CLI command offers to prepare the role for the operator (auto or manual mode), and it is set up as needed in the cluster with the prepared/provided role
- For community operators, allow a customer to import a similar metadata package (role, policy, secret structure, etc) with something like
- We provide basic guidance documentation for customers and community operators wishing to use foreign operators with ROSA STS, using this system.
- Perhaps also provide a command that generates template metadata files that can be filled, and then ingested by the CLI or UI.
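To make the metadata package idea more concrete, here is a minimal sketch of one possible shape for it, along with a helper that emits a blank template to fill in and a loader that reads a filled one back. Every class, field, and function name here is an assumption for discussion; the epic does not define a schema, and JSON is chosen only for convenience.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SecretSpec:
    """Where the credentials secret should be created (hypothetical fields)."""
    namespace: str = ""
    name: str = ""
    keys: list = field(default_factory=list)


@dataclass
class OperatorStsMetadata:
    """Hypothetical metadata package: role, policy, and secret structure."""
    operator_name: str = ""
    role_name: str = ""
    policy_document: dict = field(
        default_factory=lambda: {"Version": "2012-10-17", "Statement": []})
    secret: SecretSpec = field(default_factory=SecretSpec)


def generate_template() -> str:
    """Return an empty metadata package as JSON for an operator team to fill."""
    return json.dumps(asdict(OperatorStsMetadata()), indent=2)


def load_metadata(text: str) -> OperatorStsMetadata:
    """Parse a filled-in JSON template back into a metadata package."""
    raw = json.loads(text)
    secret = SecretSpec(**raw.pop("secret"))
    return OperatorStsMetadata(secret=secret, **raw)


if __name__ == "__main__":
    print(generate_template())
```

A single schema like this would let Red Hat and community operators go through the same ingestion path, whether the package is stored in OCM/CS or imported by a customer.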
OCP/Red Hat operators would not pass Cloud-services minor version validation without providing the necessary metadata packages.
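The kind of check such a validation gate might run can be sketched as follows. The required field names and the rules are assumptions made for illustration, not an agreed schema or an existing validation step.

```python
# Hypothetical required fields of an operator metadata package.
REQUIRED_FIELDS = ("operator_name", "role_name", "policy_document", "secret")


def validate_metadata(package: dict) -> list:
    """Return a list of problems; an empty list means the package passes."""
    errors = [f"missing or empty field: {name}"
              for name in REQUIRED_FIELDS if not package.get(name)]

    # A policy with no statements grants nothing, so treat it as invalid.
    policy = package.get("policy_document") or {}
    if policy and not policy.get("Statement"):
        errors.append("policy_document has no Statement entries")

    # The secret must say where on the cluster it will be created.
    secret = package.get("secret") or {}
    if secret and not (secret.get("namespace") and secret.get("name")):
        errors.append("secret needs both namespace and name")
    return errors


if __name__ == "__main__":
    for problem in validate_metadata({"operator_name": "example-operator"}):
        print(problem)
```

Returning every problem at once, rather than failing on the first, would give operator teams a complete list to fix in one pass.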
In a future state, a pipeline for those would be formalised to facilitate operators working in the ROSA/STS landscape.
If an operator's metadata package (roles, policies, permissions, secrets, etc) falls out of date or does not work as intended, then respective operator teams must be able to support bringing these artefacts up to date so as to not leave the burden of these permissions in the hands of the Service Delivery org.
- The workflow proposed above is enabled and functions for all operators that have provided metadata at the time of this feature's availability (targeting the EFS operator and OADP at first)
- The following Operators have been identified as crucial to the platform and should be workable with this workflow to accept completion of this epic:
- EFS (Operator-hub, OCP one)
- Logging Operator
- ALB controller operator (https://github.com/openshift/aws-load-balancer-operator/blob/12bcb4a1b4c473453aef78a82621a1eb407b6c13/docs/install.md)
- AWS EFA Operator (https://issues.redhat.com/browse/ECOPROJECT-449)
- AWS Controllers for Kubernetes
- Documentation for this capability/workflow is complete.
- One or more popular community operators have been tested to function with this workflow (custom role/policy ingested by the tool to set up the operator)
- All existing/affected SOPs have been updated.
- New SOPs have been written.
- Internal training has been developed and delivered.
- The feature has both unit and end-to-end tests passing in all test pipelines and through upgrades.
- If the feature requires QE involvement, QE has signed off.
- The feature exposes metrics necessary to manage it (VALET/RED).
- The feature has had a security review.
- Contract impact assessment.
- Service Definition is updated if needed.
- Documentation is complete.
- Product Manager signed off on staging/beta implementation.
Note: This epic is a follow-on after SDE-1428.
GREEN | YELLOW | RED
GREEN = On track, minimal risk to target date.
YELLOW = Moderate risk to target date.
RED = High risk to target date, or blocked and need to highlight potential risk to stakeholders.
Links to Gdocs, github, and any other relevant information about this epic.