-
Epic
-
Resolution: Done
-
Blocker
-
None
-
Search v2 Odyssey scale requirement for GA
-
False
-
None
-
True
-
eng-lead
-
Green
-
In Progress
-
ACM-1579 - Discover and index kubernetes resources automatically
-
-
0% To Do, 0% In Progress, 100% Done
Epic Goal
- SUMMARY: This epic defines the minimum requirements for install, scalability, and RBAC for search v2 Odyssey to reach GA quality.
- Deliver ACM Search capability out of the box, with the ability to collect resources from 'large' clusters, using the default search collection config.
- Large clusters: 125 nodes, 75 pods running per node, 25 users, 250 projects, 10000 secrets, and 1000 CRDs and resources that need to be collected and ingested into the ACM Hub.
- Expect about 20 of these clusters to be searched and managed successfully. "Successfully" means:
- < 2 seconds to get the result of an out-of-the-box saved search query from the API,
- < 1 second delay on search bar type-ahead,
- < 2 seconds delay for consumers of Search (Overview, Applications, etc.).
- This is a GA epic. The focus is on quality and Done.
- If RBAC is not ready, we ship as Tech Preview
- If scale is not ready, we don't ship it.
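To make the scale target concrete, the numbers above can be multiplied out across the fleet. The following is a rough sketch only: it assumes "75 pods running on the node" means 75 pods per node, and the names `LargeCluster`, `fleet_totals`, and `within_budget` are hypothetical helpers, not part of Search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LargeCluster:
    """Per-cluster sizing taken from the epic goal."""
    nodes: int = 125
    pods_per_node: int = 75  # assumption: the goal's "75 pods" is a per-node figure
    users: int = 25
    projects: int = 250
    secrets: int = 10_000
    crds: int = 1_000

    @property
    def pods(self) -> int:
        return self.nodes * self.pods_per_node


# Latency budgets from the goal, in seconds (strict upper bounds).
LATENCY_BUDGET_S = {
    "saved_search": 2.0,  # out-of-the-box saved search query via the API
    "typeahead": 1.0,     # search bar type-ahead
    "consumer": 2.0,      # consumers of Search (Overview, Applications, ...)
}


def fleet_totals(cluster: LargeCluster, count: int = 20) -> dict:
    """Multiply the per-cluster sizing by the number of managed clusters."""
    return {
        "nodes": cluster.nodes * count,
        "pods": cluster.pods * count,
        "projects": cluster.projects * count,
        "secrets": cluster.secrets * count,
        "crds": cluster.crds * count,
    }


def within_budget(kind: str, observed_seconds: float) -> bool:
    """True if an observed latency is strictly below the budget for its kind."""
    return observed_seconds < LATENCY_BUDGET_S[kind]


if __name__ == "__main__":
    print(fleet_totals(LargeCluster()))
    print(within_budget("typeahead", 0.4))
```

Under that per-node assumption, 20 large clusters come to 2,500 nodes and 187,500 pods, plus 200,000 secrets, before counting any other collected resources.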
Why is this important?
- Search provides the global view, across multiple clusters, without the need to specifically define a cluster context.
- Customers expect the same level of Search capability in ACM 2.6 that is already available in prior versions of ACM.
- Customers want their Search capability to scale well and be available to them without stability concerns.
Scenarios
- For new ACM 2.6 environments,
- A user can 'opt in' to deploy Odyssey with a postgres backend as part of the MCH install. This also turns off v1 Search.
- A user can also enable Odyssey on day 2, after the MCH deployment, which turns off v1 Search.
- v2 Odyssey must adhere to the MCH availability config (Basic/High).
- For existing ACM 2.5.z environments,
- If the environment is Production, you should probably not deploy v2 Odyssey, since it is a Tech Preview and deploying it dead-ends your cluster.
- If the environment is Dev/Stage and the user does not want to pave it over (e.g., there are resources they want to keep alive in the hub):
- the upgrade completes as expected with v1 Search left as is,
- the user can follow the documentation and run through simple steps to remove v1 Search and replace it with v2 Search Odyssey.
- We are not talking about data migration: Search will repopulate the data/cache from a new managed cluster collection.
- As a V2 Odyssey search developer (eventually ACM admin), I can measure the performance using a metric (think: measuring latency, throughput).
- As an ACM Admin, I should be able to receive an alert about the resource consumption (think: disk usage) of my V2 Odyssey.
- Search V2 Odyssey RBAC should be compatible with Search V1: Cluster level RBAC.
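The cluster-level RBAC scenario can be illustrated with a small filter. This is only a sketch of the idea, assuming cluster-level RBAC means a user sees a managed cluster's resources only when they have been granted access to that cluster. The function name, the `cluster` key, and the sample data are hypothetical and do not reflect how Search v1 or v2 is actually implemented.

```python
from typing import Iterable


def filter_by_cluster_access(results: Iterable[dict],
                             allowed_clusters: Iterable[str]) -> list:
    """Keep only results that belong to a cluster the user may access.

    Each result is assumed (hypothetically) to carry a "cluster" key
    naming the managed cluster it was collected from.
    """
    allowed = set(allowed_clusters)
    return [r for r in results if r.get("cluster") in allowed]


if __name__ == "__main__":
    results = [
        {"kind": "Pod", "name": "api-0", "cluster": "cluster-a"},
        {"kind": "Secret", "name": "db-creds", "cluster": "cluster-b"},
        {"kind": "Pod", "name": "web-1", "cluster": "cluster-a"},
    ]
    # A user with access to cluster-a only sees the two cluster-a pods.
    for item in filter_by_cluster_access(results, ["cluster-a"]):
        print(item["kind"], item["name"])
```

A compatibility test between v1 and v2 could, under the same assumption, run identical queries as the same user against both versions and compare the sets of clusters that appear in the results.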
Acceptance Criteria
- CI - MUST be running successfully with tests automated
- Release Technical Enablement - Provide necessary release enablement details and documents.
- Includes scale testing
- Must make use of configurable Search collection
- We need to document how to enable v2 Odyssey as part of the MulticlusterHub install
- We need the MCH CRD to enable the deployment of v2 Odyssey; doing so disables v1 Search.
- Disruption of Search services should be minimal as new code propagates across managed clusters.
- Run a PromQL query to observe the health of V2 Odyssey metrics.
- Under scale testing conditions, there should be no stale shadow/cache information (data that is no longer relevant).
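The PromQL criterion above could be exercised with something like the following sketch. It builds a 95th-percentile latency query using the standard `histogram_quantile` and `rate` PromQL functions, targets the Prometheus HTTP API instant-query endpoint (`/api/v1/query`), and parses the JSON response. The metric name and the Prometheus base URL are placeholders: the actual V2 Odyssey metric names are not given in this epic.

```python
import json
from urllib.parse import urlencode

# Hypothetical metric name; substitute whatever histogram V2 Odyssey exposes.
LATENCY_HISTOGRAM = "search_api_request_duration_seconds_bucket"


def p95_latency_query(metric: str = LATENCY_HISTOGRAM, window: str = "5m") -> str:
    """PromQL for the 95th-percentile latency over a sliding window."""
    return f"histogram_quantile(0.95, sum(rate({metric}[{window}])) by (le))"


def query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(payload: str) -> list:
    """Extract sample values (as floats) from an instant-query JSON response."""
    body = json.loads(payload)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    # Each sample's "value" is a [timestamp, "string value"] pair.
    return [float(sample["value"][1]) for sample in body["data"]["result"]]


if __name__ == "__main__":
    # Placeholder endpoint; fetch the URL with any HTTP client of your choice.
    print(query_url("http://prometheus.example:9090", p95_latency_query()))
```

The parsed values could then be compared against the latency targets in the Epic Goal, for example by failing a scale-test run when the 95th percentile reaches 2 seconds.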
Dependencies (internal and external)
- postgres
Previous Work (Optional):
Open questions:
- Do we have a good benchmark for the 'Large' cluster?
- Do we need a deprecation announcement related to RedisGraph?
- We will need to announce that the search collector configuration feature will introduce a reduction of what resources get collected & displayed at the hub.
Done Checklist
- CI - CI is running, tests are automated and merged.
- Release Enablement <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
- DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
- DEV - Downstream build attached to advisory: <link to errata>
- QE - Test plans in Polarion: <link or reference to Polarion>
- QE - Automated tests merged: <link or reference to automated tests>
- DOC - Downstream documentation merged: <link to meaningful PR>