Current testing does not cover integration with other RH products or how they affect the basic setup and performance of a cluster.
There should be basic integration testing that at least tests an otherwise empty cluster with combinations of RH products such as GitOps, ACS, ACM, and NooBaa/Ceph.
PRODUCTS:
I propose to test:
- ServiceMesh https://catalog.redhat.com/software/container-stacks/detail/5ec53e8c110f56bd24f2ddc4
- ODF https://catalog.redhat.com/software/container-stacks/detail/66b38f2e25325ec074ce06c3
- Data Foundation https://catalog.redhat.com/software/container-stacks/detail/60e6cf098d715a89c4e8625c
- ACM https://catalog.redhat.com/software/container-stacks/detail/5ec54aa3535cb70ab8c02996
- ACS https://catalog.redhat.com/software/container-stacks/detail/60eefc88ee05ae7c5b8f041c
- LSO https://catalog.redhat.com/software/container-stacks/detail/66993e0f4f284e980ee072d3
- GitOps https://catalog.redhat.com/software/container-stacks/detail/5fb288c70a12d20cbecc6056
- Pipelines https://catalog.redhat.com/software/containers/openshift-pipelines/pipelines-operator-bundle/6051bcfb7d4bcfc15f1793bf
- ClusterResourceOverride https://catalog.redhat.com/software/containers/openshift4/ose-clusterresourceoverride-rhel9/65280983da2318a6d719dc9b
- Quay https://catalog.redhat.com/software/containers/quay/quay-operator-rhel8/600e03b4dd19c7786c43ae4f?q=quay&architecture=amd64&image=674e2894037525ed243c30ff
- OpenShift Container Storage https://catalog.redhat.com/software/container-stacks/detail/5f90879f9f80d4329cc6aa26
- OpenShift Loki https://catalog.redhat.com/software/containers/openshift-logging/loki-rhel9-operator/64479927e1820602a81cdf13
These are the most used operators on OCP, and from experience they can cause a lot of performance problems or uncover bugs.
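Since the proposal is to test products both alone and in combination, the test cases could be enumerated with a small helper. This is only a sketch: the function name `product_matrix` is hypothetical, and it covers single products and unordered pairs only, not larger combinations.

```shell
# Hypothetical helper: print every single product and every unordered pair
# from the given names, one test case per line.
product_matrix() {
  # Single-product test cases.
  for p in "$@"; do
    printf '%s\n' "$p"
  done
  # Pairwise test cases: pair the first name with each remaining one,
  # then drop the first name and repeat.
  while [ "$#" -gt 1 ]; do
    first=$1
    shift
    for other in "$@"; do
      printf '%s+%s\n' "$first" "$other"
    done
  done
}
```

For example, `product_matrix GitOps ACM ACS` prints the three products followed by `GitOps+ACM`, `GitOps+ACS`, and `ACM+ACS`. With all 12 products listed above this gives 12 single and 66 pairwise cases, so in practice the matrix would probably be limited to the most used products.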
REASONING:
1. Recurrence of the OLM bug
https://issues.redhat.com/browse/OCPBUGS-38751 https://issues.redhat.com/browse/OCPBUGS-17950
which occurs only when other products/operators are installed. This bug has always been serious because it completely degraded cluster performance without any actual load.
2. Setting a baseline for CPU, RAM, and etcd storage on a cluster running a combination of our products.
3. Issues where operators create too many secrets, as in https://access.redhat.com/solutions/7092264, and similar issues with ACM or ACS.
4. Other possible bugs, such as:
https://access.redhat.com/solutions/6955591
https://access.redhat.com/solutions/6531861
https://access.redhat.com/solutions/7036832
https://access.redhat.com/solutions/7030932
https://access.redhat.com/solutions/6980527
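For the baseline in point 2, the pre-install and post-install values need to be compared in some way. A minimal sketch, assuming each snapshot is saved as a text file with one `metric value` pair per line; the file format, the metric names, and the function name `compare_baseline` are assumptions, not something an existing tool produces:

```shell
# Hypothetical helper: compare two snapshot files of "metric value" lines.
# $1: snapshot taken before installing products, $2: snapshot taken after.
# Prints: metric, value before, value after, relative change.
compare_baseline() {
  awk '
    NR == FNR { before[$1] = $2; next }   # first file: remember pre-install values
    ($1 in before) {                      # second file: report metrics seen in both
      delta = $2 - before[$1]
      if (before[$1] != 0)
        printf "%s %s %s %+.1f%%\n", $1, before[$1], $2, 100 * delta / before[$1]
      else
        printf "%s %s %s n/a\n", $1, before[$1], $2
    }
  ' "$1" "$2"
}
```

Called as `compare_baseline pre.txt post.txt`, it prints one line per metric present in both files. Metrics that appear in only one snapshot are skipped, which may or may not be the desired behaviour.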
Each test should consist of:
1. Take pre-test values: gather a must-gather, metrics as in https://access.redhat.com/solutions/5489721, and possibly the output of etcd-analyzer:
   wget https://raw.githubusercontent.com/peterducai/etcd-tools/refs/heads/main/etcd-analyzer.sh
   chmod +x etcd-analyzer.sh
   oc login
   ./etcd-analyzer.sh
2. Install one product or a combination of them (GitOps, ACM, and ACS are the most used).
3. Collect the same metrics and data as in step 1.
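The collection steps could be wrapped in a small function so that the pre-test and post-test runs gather exactly the same data. This is a sketch under assumptions: the function names, the `DRY_RUN` switch, and the output directory layout are hypothetical, `etcd-analyzer.sh` is expected in the current directory, and `oc login` must already have been done. The secret listing is included because of the "too many secrets" issue mentioned above.

```shell
# Run a command, or only print it when DRY_RUN=1 (useful for reviewing the plan).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Collect one snapshot of cluster state.
# $1: output directory for the must-gather, e.g. ./pre or ./post
collect_snapshot() {
  out=$1
  run mkdir -p "$out"
  run oc adm must-gather --dest-dir="$out/must-gather"
  run oc adm top nodes                               # current CPU and memory per node
  run oc get secrets --all-namespaces --no-headers   # one line per secret
  run ./etcd-analyzer.sh
}
```

A test round would then be `collect_snapshot ./pre > pre.log`, followed by installing the product or products under test, followed by `collect_snapshot ./post > post.log`. Setting `DRY_RUN=1` first prints the commands without touching the cluster.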
Duplicates: OCPBUGS-46581 [TEST] - test etcd performance and integration with other products like ACS, ACM, Gitops (Closed)
Is related to: RFE-5327 Adjust thresholds for `etcdHighFsyncDurations` and `etcdHighCommitDurations` alerts (Under Review)