Type: Epic
Resolution: Unresolved
Priority: Critical
Feature Overview
This feature aims to implement server-side pagination for the Red Hat Advanced Cluster Management (ACM) Search API. Currently, large result sets can be inefficient to retrieve and display. Adding pagination will allow for more efficient data retrieval, better performance, and an improved user experience when dealing with extensive search results.
Goals
This Section: Provide a high-level goal statement, giving user context and the expected user outcome(s) for this feature.
- Improve the performance of Search API queries for large datasets by reducing the amount of data transferred in a single request.
- Provide a more efficient and scalable way for clients to consume large search results.
- Enhance the user experience by enabling faster loading of search results and better navigation through them.
- Complement the future real-time data streaming by providing an efficient way to fetch initial large datasets or "snapshot" views.
Requirements
This Section: A list of specific needs or objectives that the Feature must deliver to be satisfied. Some requirements will be flagged as MVP. If an MVP requirement shifts, the feature shifts. If a non-MVP requirement slips, it does not shift the feature.
Requirement | Notes | isMvp?
---|---|---
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES
Release Technical Enablement | Provide necessary release enablement details and documents. | YES
The Search API (GraphQL) MUST support standard pagination parameters (e.g., first/last and after/before for cursor-based, or limit/offset for offset-based). | | YES
The pagination mechanism MUST be performant and not degrade database performance for large offsets or deep paging. | | YES
The pagination implementation MUST ensure consistent results when navigating through pages, even if underlying data is changing. | | YES
The API documentation MUST clearly define the pagination parameters and their usage. | | YES
The Search API should provide metadata with paginated results (e.g., total count, hasNextPage, hasPreviousPage). | | YES
(Optional) Use Cases
This Section:
- Main success scenarios - high-level user stories
  - As an ACM admin, I want to quickly view the first page of thousands of virtual machines without waiting for all of them to load.
  - As an ACM admin, I want to efficiently navigate through large lists of virtual machines using "next page" and "previous page" controls.
  - As a developer, I want to programmatically retrieve subsets of search results to avoid memory exhaustion when processing extensive data.
- Alternate flows/scenarios - high-level user stories
  - If a user tries to access a page beyond the available results, the API should return an appropriate empty result or error.
  - If the underlying data changes significantly between page requests (e.g., items are added or removed), pagination should still provide a reasonable and consistent view (e.g., by using stable cursors).
Questions to answer
- ...
Out of Scope
- Client-side pagination implementation (this feature focuses solely on the API backend).
- Complex sorting capabilities beyond what is required to support stable pagination.
Background and strategic fit
As ACM environments grow, the volume of managed resources and related data in Search also increases. Without server-side pagination, fetching large result sets can be slow, consume excessive memory, and lead to poor responsiveness. Implementing pagination is a fundamental improvement for any API dealing with potentially large datasets. It directly addresses performance and scalability concerns, making the Search API more robust and user-friendly. This feature is complementary to the "Real-Time Data Streaming" initiative; while streaming provides immediate updates for changes, pagination provides an efficient way to retrieve a snapshot of the entire dataset or a large portion of it, which is crucial for initial loads or when browsing historical data where streaming isn't applicable. Together, they offer a comprehensive solution for data access.
Assumptions
- The existing Search API (GraphQL) framework supports the necessary extensions to implement pagination directives or arguments.
- The underlying PostgreSQL database is capable of efficiently executing paginated queries, especially for cursor-based approaches that avoid costly OFFSET operations.
- The Search indexer and API are designed to handle concurrent paginated requests efficiently.
Customer Considerations
- Customers will experience significantly faster load times for search results, especially in large environments.
- The user interface built on top of the API will become more responsive and easier to navigate.
- Reduced network bandwidth usage and API processing time will lead to a more efficient overall experience.
Documentation Considerations
Questions to be addressed:
- What educational or reference material (docs) is required to support this product feature? For users/admins? Other functions (security officers, etc.)?
  - Yes, primarily for developers consuming the API, but also potentially for advanced administrators.
- Does this feature have a doc impact?
  - Yes
- New Content, Updates to existing content, Release Note, or No Doc Impact? If unsure and no Technical Writer is available, please contact Content Strategy.
- What concepts do customers need to understand to be successful in [action]?
- How do we expect customers will use the feature? For what purpose(s)?
  - To build UIs that efficiently display large lists, for programmatic data extraction, and for integration with other systems.
- What reference material might a customer want/need to complete [action]?
- Is there source material that can be used as reference for the Technical Writer in writing the content? If yes, please link if available.
  - Yes, API specification documents and technical design documents for the pagination implementation.
- What is the doc impact (New Content, Updates to existing content, or Release Note)?
  - New Content:
    - Detailed API documentation specifically for pagination parameters and best practices.
    - Examples of paginated GraphQL queries.
  - Updates to existing content:
    - Updates to the general Search API documentation to incorporate pagination as a core feature.
  - Release Note: A release note detailing the availability of server-side pagination for the Search API.
Technical Considerations for Server-Side Pagination
- Choice of Pagination Strategy:
- Offset-based (limit/offset): Simple to implement, but it becomes very inefficient for large offset values in PostgreSQL because the database must still scan past all rows up to the offset before returning the requested page. It can also produce inconsistent results if data is added or deleted between page requests, since the "page" boundaries shift.
- Cursor-based (first/last, after/before): More complex to implement but generally more performant and stable for large datasets. It relies on a unique, indexed column (the "cursor") to mark the last item of a page, allowing the next page query to start directly from that point. This is often preferred for GraphQL APIs.
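The difference between the two strategies can be sketched as follows. This is a minimal in-memory illustration, not the actual Search API implementation; the table name, `uid` column, and page sizes are assumptions for the example.

```python
def offset_page(rows, limit, offset):
    """Offset-based: the server still walks past `offset` rows first,
    and the page shifts if rows are inserted or deleted between requests."""
    return rows[offset:offset + limit]

def keyset_page(rows, limit, after_uid=None):
    """Cursor/keyset-based: resume strictly after the last seen sort key.
    Assumes `rows` are sorted by a unique, indexed column (here 'uid')."""
    if after_uid is not None:
        rows = [r for r in rows if r["uid"] > after_uid]
    return rows[:limit]

# The roughly equivalent SQL (hypothetical table and column names):
#   offset:  SELECT * FROM resources ORDER BY uid LIMIT %s OFFSET %s
#   keyset:  SELECT * FROM resources WHERE uid > %s ORDER BY uid LIMIT %s

rows = [{"uid": f"vm-{i:03d}"} for i in range(10)]
page1 = keyset_page(rows, limit=3)
page2 = keyset_page(rows, limit=3, after_uid=page1[-1]["uid"])
```

Note that the keyset query lets PostgreSQL seek directly into the index at the cursor position, whereas the offset query's cost grows linearly with the offset.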
- Database Indexing: Efficient pagination, especially cursor-based, heavily relies on appropriate indexing of the columns used for sorting and as cursors in PostgreSQL. Missing or inefficient indexes can severely degrade performance.
- Consistency vs. Real-Time: While pagination provides a "snapshot" view, for frequently changing data, a paginated result set might become stale as the user navigates. The choice of pagination strategy and how cursors are generated can influence the degree of consistency. Cursor-based pagination generally offers better consistency in dynamic datasets than offset-based.
- Sorting Requirements: Pagination often goes hand-in-hand with sorting. The API needs to support defining a stable sort order for the results to ensure that pagination works predictably. The cursor usually incorporates the sort order values.
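One common way to make a cursor carry the sort order, sketched below under assumed field names ("name" as the sort column, "uid" as a unique tiebreaker): encode the last row's sort values into an opaque token, and compare row values against it on the next request.

```python
import base64
import json

def encode_cursor(item):
    """Pack the sort key plus a unique tiebreaker into an opaque token."""
    payload = {"name": item["name"], "uid": item["uid"]}
    return base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()

def decode_cursor(cursor):
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))

def after_predicate(item, cursor):
    """Row-value comparison (name, uid) > (cursor.name, cursor.uid),
    mirroring SQL's  WHERE (name, uid) > (%s, %s)."""
    c = decode_cursor(cursor)
    return (item["name"], item["uid"]) > (c["name"], c["uid"])

cur = encode_cursor({"name": "web", "uid": "a2"})
```

Including the unique tiebreaker keeps pagination stable even when many rows share the same sort value.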
- Total Count: Providing a "total count" of results alongside paginated data can be very expensive in PostgreSQL for large tables without appropriate optimizations (e.g., using estimated counts or separate, cached count queries). This needs to be considered for performance.
- Deep Paging Attacks: Without proper limits, allowing clients to request arbitrarily deep pages (e.g., very high offset values) can be a denial-of-service vector. Limits on page size and maximum depth should be considered.
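A simple guardrail along these lines might clamp the requested page size and reject requests past a maximum depth. The limits and default below are illustrative values, not ACM's actual configuration.

```python
MAX_PAGE_SIZE = 100      # illustrative cap on items per page
MAX_SCAN_DEPTH = 10_000  # illustrative cap on total rows reachable by paging
DEFAULT_PAGE_SIZE = 25   # illustrative default when the client sends none

def clamp_page_args(first, offset=0):
    """Normalize client-supplied paging arguments before building the query."""
    if first is None or first < 1:
        first = DEFAULT_PAGE_SIZE
    first = min(first, MAX_PAGE_SIZE)
    if offset + first > MAX_SCAN_DEPTH:
        raise ValueError("requested page exceeds maximum paging depth")
    return first, offset
```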
- Integration with GraphQL: GraphQL has a standard way of handling pagination through the "Relay Cursor Connections Specification," which encourages cursor-based pagination and provides a consistent structure for returning paginated data and metadata. Adhering to this standard can simplify client-side consumption.
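The Relay connection shape the specification prescribes (edges plus pageInfo) can be sketched as below. For brevity the cursor here is just the item's uid, and the field name in the commented query is hypothetical, not the actual ACM schema.

```python
def to_connection(items, has_next, has_previous, total=None):
    """Wrap a page of items in a Relay-style connection structure."""
    edges = [{"cursor": it["uid"], "node": it} for it in items]
    return {
        "edges": edges,
        "pageInfo": {
            "hasNextPage": has_next,
            "hasPreviousPage": has_previous,
            "startCursor": edges[0]["cursor"] if edges else None,
            "endCursor": edges[-1]["cursor"] if edges else None,
        },
        "totalCount": total,  # optional; may be expensive or estimated
    }

# A client would then page with a query shaped roughly like:
#   { searchResults(first: 100, after: "vm-099") {
#       edges { node { name } }
#       pageInfo { hasNextPage endCursor } } }
```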
- Caching: How will paginated results interact with any API-level or database-level caching? Invalidation strategies for cached pages, especially with data changes, need to be thought through.
- Relation to Data Streaming:
- Pagination provides the initial load or "snapshot" of data, especially for large datasets that wouldn't be feasible to stream all at once.
- Streaming can then be used to push updates or deltas to that initial paginated view, keeping it fresh without needing to re-fetch entire pages.
- For instance, a user might load the first 100 clusters via pagination, and then any changes to those 100 clusters (or new clusters appearing on subsequent pages) could be streamed.
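The combination described above can be sketched as a client keeping a paginated snapshot fresh by folding in streamed deltas. The delta format (`op`/`item`) and the `uid`/`status` fields are assumptions for illustration only.

```python
def apply_delta(snapshot, delta):
    """snapshot: dict uid -> resource; delta: {"op": ..., "item": {...}}."""
    uid = delta["item"]["uid"]
    if delta["op"] in ("add", "update"):
        snapshot[uid] = delta["item"]
    elif delta["op"] == "delete":
        snapshot.pop(uid, None)
    return snapshot

# First page fetched via pagination, then kept fresh via streamed deltas.
view = {"c1": {"uid": "c1", "status": "Ready"}}
apply_delta(view, {"op": "update", "item": {"uid": "c1", "status": "Offline"}})
apply_delta(view, {"op": "add", "item": {"uid": "c2", "status": "Ready"}})
```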