Epic
Resolution: Unresolved
Major
rhel-10.1
IPA state control
FutureFeature
rhel-idm-ipa
Red Hat Enterprise Linux
All
Description
As a system administrator, I want the FreeIPA deployment to be highly available and operationally robust by implementing intelligent health awareness and automated recovery behaviors. Specifically, the system should fulfill the following goals:
- Be aware of the real-time health state of all IPA replicas
Continuously monitor and expose the operational status of each replica (healthy / degraded / unhealthy / maintenance / hidden).
- Automatically remove unhealthy or maintenance replicas from client traffic pools
Withdraw replicas from DNS SRV pools (e.g., via dynamic DNS updates, or health-check-based removal from LDAP/Kerberos service records) when they enter an unhealthy state or are placed in maintenance mode.
- Automatically reintroduce healed replicas into service
Re-add replicas to client-facing pools once they return to a healthy state (automatic reintroduction once self-diagnosed issues are resolved).
- Implement dependency-aware health checks
Tie a replica's reported health to the availability and correct functioning of its critical dependencies.
Example: a KDC should be marked unhealthy and removed from rotation if its local LDAP backend is unavailable or responding incorrectly. This lets clients fail over automatically to healthy replicas instead of being stuck retrying a broken instance.
- Support extensible / pluggable health evaluation logic
Provide an architectural framework that makes it easy to add new health triggers and conditions in the future without major refactoring.
Examples of future extensions:
- React to self-state changes (e.g., CA certificate list change, shared certificate change, replica list change, replication lag exceeding threshold)
- Ideally, integrate external signals (e.g., monitoring alerts, or node-level symptoms such as memory leaks or network issues)
- Possibility of custom scripts for site-specific checks
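The health-state taxonomy from the first goal (healthy / degraded / unhealthy / maintenance / hidden) could be modeled roughly as follows. This is an illustrative sketch, not FreeIPA's actual API; the names `HealthState` and `ReplicaHealth` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class HealthState(Enum):
    """Operational states a replica can report (from the goals above)."""
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    UNHEALTHY = "unhealthy"
    MAINTENANCE = "maintenance"
    HIDDEN = "hidden"


@dataclass
class ReplicaHealth:
    """Point-in-time health snapshot for one replica."""
    fqdn: str
    state: HealthState

    @property
    def serves_clients(self) -> bool:
        # Only healthy (and, at most, degraded) replicas stay in
        # client-facing pools; the other states are withdrawn.
        return self.state in (HealthState.HEALTHY, HealthState.DEGRADED)
```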
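The withdraw-and-reintroduce behavior for SRV pools could be reduced to a reconciliation step: given each replica's current health, compute which targets to remove from and add back to a client-facing record set (e.g., _ldap._tcp). A minimal sketch, with a hypothetical `reconcile_srv_pool` function and FQDNs as plain strings:

```python
from typing import Dict, List, Set

# States in which a replica should receive client traffic (assumption).
SERVING_STATES = {"healthy", "degraded"}


def reconcile_srv_pool(current_pool: Set[str],
                       health: Dict[str, str]) -> Dict[str, List[str]]:
    """Return the SRV targets to withdraw and to reintroduce."""
    should_serve = {fqdn for fqdn, state in health.items()
                    if state in SERVING_STATES}
    return {
        # Withdraw replicas that are in the pool but no longer serving.
        "remove": sorted(current_pool - should_serve),
        # Reintroduce healed replicas missing from the pool.
        "add": sorted(should_serve - current_pool),
    }
```

The output of such a step would then drive the actual record changes (e.g., dynamic DNS updates), so the same logic covers both removal and automatic re-addition.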
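The dependency-aware goal (the KDC/LDAP example) amounts to capping a service's effective health by its critical dependencies. A sketch under assumed names; the probe callables stand in for real connectivity checks:

```python
from typing import Callable, Dict


def effective_health(service_ok: bool,
                     dependency_probes: Dict[str, Callable[[], bool]]) -> str:
    """A service is unhealthy if it, or any critical dependency, fails."""
    if not service_ok:
        return "unhealthy"
    for name, probe in dependency_probes.items():
        if not probe():
            # e.g., the KDC process is up, but its local LDAP backend
            # is not: report unhealthy so the replica is withdrawn and
            # clients fail over to healthy replicas.
            return "unhealthy"
    return "healthy"
```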
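The extensibility goal could be served by a simple check registry: new triggers (replication lag, certificate changes, site-specific scripts) register themselves, and the evaluator runs whatever is registered, so additions need no core refactoring. A hypothetical sketch; the registered checks here are placeholders:

```python
from typing import Callable, Dict, List

HealthCheck = Callable[[], bool]  # True = check passed
_REGISTRY: Dict[str, HealthCheck] = {}


def register_check(name: str):
    """Decorator that adds a check to the registry under a name."""
    def wrap(fn: HealthCheck) -> HealthCheck:
        _REGISTRY[name] = fn
        return fn
    return wrap


def run_checks() -> List[str]:
    """Return the names of all failing checks."""
    return sorted(name for name, fn in _REGISTRY.items() if not fn())


@register_check("replication-lag")
def replication_lag_ok() -> bool:
    # Placeholder: would compare measured lag against a threshold.
    return True


@register_check("custom-site-check")
def site_check() -> bool:
    # Placeholder for an operator-supplied, site-specific check.
    return True
```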
These capabilities should work together to achieve the following outcomes:
- Minimize client-perceived downtime during replica failures or maintenance
- Reduce manual intervention for common failure modes
- Improve overall cluster resilience and observability
Which SSTs and Layered Product teams should review this?
FreeIPA dev team.