Feature · Resolution: Unresolved · Critical · Product / Portfolio Work · 100% To Do, 0% In Progress, 0% Done
Feature Overview (aka. Goal Summary)
This feature introduces support for configuring the router's maximum connection limit and exposing that value through Prometheus monitoring.
By making router max connection limits configurable and observable, cluster administrators gain improved visibility into router capacity, saturation risk, and scaling behavior under varying traffic loads.
Goals (aka. expected user outcomes)
The primary goal is to avoid hitting a hard-configured connection "ceiling" during periods of extreme traffic. Additional goals:
- Enable administrators to configure a router max connections value that is surfaced through Prometheus metrics.
- Improve observability into router capacity utilization and connection pressure.
- Support proactive alerting on router saturation risks before service degradation occurs.
- Allow alignment of router monitoring data with custom deployment sizes, traffic profiles, and SLAs.
- Maintain backward compatibility for clusters that rely on existing default behavior.
Requirements (aka. Acceptance Criteria):
Functional Requirements
- Provide a configurable parameter to define the maximum number of router connections.
- Expose this value via Prometheus-compatible metrics.
- Ensure metrics clearly distinguish between (illustrated in the sketch after the requirements lists):
  - Configured maximum connections
  - Current/active connections
- Support dynamic updates through supported configuration mechanisms (e.g., operator-managed configuration).
Non-Functional Requirements
- No significant performance impact on router dataplane operations.
- Metrics must follow existing Prometheus naming and labeling conventions.
- Defaults must preserve existing behavior when no configuration is provided.
- Configuration changes should be observable without requiring cluster downtime.
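To picture the distinction between the configured maximum and current connections, here is a minimal metrics-exposure sketch using the Go Prometheus client. The metric names (router_max_connections, router_current_connections), the port, and the placeholder limit are illustrative assumptions, not the router's actual metric names or implementation.

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Hypothetical metric names used for illustration only; the real router
// metrics may use different names and labels.
var (
	maxConnections = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "router_max_connections",
		Help: "Configured maximum number of router connections.",
	})
	currentConnections = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "router_current_connections",
		Help: "Number of currently active router connections.",
	})
)

func main() {
	prometheus.MustRegister(maxConnections, currentConnections)

	// The configured limit would come from operator-managed configuration;
	// 20000 is a placeholder value.
	maxConnections.Set(20000)

	// Serve metrics for Prometheus to scrape.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":9100", nil)
}
```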
Anyone reviewing this Feature needs to know which deployment configurations the Feature will apply to (or not) once it has been completed. Describe specific needs (or indicate N/A) for each of the following deployment scenarios. For specific configurations that are out of scope for a given release, ensure you provide the OCPSTRAT (for the configuration to be supported in the future) as well.
| Deployment considerations | List applicable specific needs (N/A = not applicable) |
| Self-managed, managed, or both | |
| Classic (standalone cluster) | |
| Hosted control planes | |
| Multi node, Compact (three node), or Single node (SNO), or all | |
| Connected / Restricted Network | |
| Architectures, e.g. x86_64, ARM (aarch64), IBM Power (ppc64le), and IBM Z (s390x) | |
| Operator compatibility | |
| Backport needed (list applicable versions) | |
| UI need (e.g. OpenShift Console, dynamic plugin, OCM) | |
| Other (please specify) | |
Use Cases:
- Capacity Planning: Operators track router connection utilization relative to configured limits to determine when to scale routers or adjust traffic distribution.
- Alerting: Platform teams configure alerts when active connections approach a configurable percentage of the maximum (see the query sketch after this list).
- Multi-Tenant Clusters: Administrators tune router connection limits to match tenant traffic expectations and avoid noisy-neighbor scenarios.
- Performance Troubleshooting: SREs correlate connection pressure with latency, error rates, or dropped connections during incident analysis.
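As one way to picture the alerting and capacity-planning use cases, the sketch below queries Prometheus for the ratio of active to configured maximum connections; in practice this expression would more likely live in a Prometheus alerting rule. The metric names and the Prometheus address are assumptions carried over from the earlier sketch, not the product's actual names.

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/prometheus/client_golang/api"
	promv1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
	// Placeholder Prometheus address; substitute the in-cluster endpoint.
	client, err := api.NewClient(api.Config{Address: "http://prometheus.example.com:9090"})
	if err != nil {
		panic(err)
	}
	promAPI := promv1.NewAPI(client)

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Ratio of active connections to the configured maximum, using the
	// hypothetical metric names from the earlier sketch. An alert could
	// fire when this ratio exceeds a chosen threshold, e.g. 0.8.
	const query = `router_current_connections / router_max_connections`

	result, warnings, err := promAPI.Query(ctx, query, time.Now())
	if err != nil {
		panic(err)
	}
	if len(warnings) > 0 {
		fmt.Println("warnings:", warnings)
	}
	fmt.Println("connection utilization:", result)
}
```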
Out of Scope
Background
Routers play a critical role in handling ingress traffic and maintaining client connections. While active connection metrics are commonly available, the lack of a configurable and observable maximum connection reference point limits the effectiveness of monitoring and alerting. Static or implicit limits can cause headroom issues, especially during periods of extreme traffic (e.g., holiday sales) and in clusters with diverse workloads or custom router deployments. Providing explicit, configurable max connection values improves clarity and operational confidence.
Customer Considerations
- Backward Compatibility: Existing clusters should continue to function without requiring configuration changes.
- Simplicity: Configuration should be easy to understand and manage through existing tooling.
- Documentation: Clear guidance must be provided on:
  - Recommended values
  - How to interpret metrics
  - How to build alerts based on the new data
- Safety: Misconfiguration should be mitigated through validation or sensible defaults to prevent misleading metrics.
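To make the safety point concrete, here is a minimal validation sketch. The function name, bounds, and default are assumptions for illustration only, not the product's actual validation rules.

```go
package config

// validateMaxConnections returns a safe connection limit, falling back to a
// sensible default when no value is configured or the requested value is
// outside a plausible range. All numbers below are illustrative placeholders.
func validateMaxConnections(requested int) int {
	const (
		defaultMax = 20000
		lowerBound = 2000
		upperBound = 2000000
	)
	if requested == 0 {
		// No value configured: preserve existing default behavior.
		return defaultMax
	}
	if requested < lowerBound || requested > upperBound {
		// Out-of-range values fall back to the default rather than
		// producing a misleading metric.
		return defaultMax
	}
	return requested
}
```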
Documentation Considerations
Interoperability Considerations
- Fully compatible with existing Prometheus deployments and dashboards.
- Integrates with standard alerting frameworks (e.g., Alertmanager).
- Works alongside existing router metrics without breaking dashboards or queries.
- Supports interoperability with downstream observability tools that consume Prometheus metrics (e.g., Grafana).
- Aligns with operator-managed configuration models and does not require custom patches or sidecars.