Type: Bug
Resolution: Done-Errata
Priority: Critical
Fix Version/s: rhos-18.0.6
Affects Version/s: rhos-18.0 FR 1 (Nov 2024)
Component/s: infra-operator
Labels:
- triaged

Story Points: 3
Blocked: False
Blocked Reason: None
Ready: False
Docs Approval: ?
Fixed in Build: keystone-operator-container-1.0.7-6
Regression: None
Intelligence Requested:
Market:
Errata Link: https://errata.engineering.redhat.com/advisory/146727
Target Version: rhos-18.0 FR 2 (Mar 2025)
Sprint: PIDONE 18.0.4, PIDONE 18.0.5, PIDONE 18.0.6, PIDONE 18.0.7
sprint_count: 4
Severity: Critical

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Whenever one of the memcached pods disappears (because of a rolling restart during a minor update or as a result of a failure), the APIs take a long time to detect that the pod is gone and keep trying to reconnect to it.

In a quick round of tests, the resulting API downtime was roughly 150 seconds.

Looking at the services' config files (mainly keystone and nova-api), we found a few parameters that could help:

enable_retry_client=true
retry_attempts=X
retry_delay=Y

We tested with retry_attempts=2 and retry_delay=0 and the APIs recovered much faster.
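
As a rough illustration of the tested values, the sketch below renders the three parameters into an INI-style fragment. The `[cache]` section name is an assumption (it is where oslo.cache usually reads these options; this report does not name the section), and `render_cache_retry_config` is a hypothetical helper, not part of any operator. Services that go through keystonemiddleware may need different option names.

```python
import configparser
import io


def render_cache_retry_config(retry_attempts: int, retry_delay: float,
                              section: str = "cache") -> str:
    """Render the retry-client options as an INI fragment.

    Hypothetical helper for illustration only. The default section name
    "cache" is an assumption and is not stated in this report.
    """
    if retry_attempts < 1:
        raise ValueError("retry_attempts must be at least 1")
    if retry_delay < 0:
        raise ValueError("retry_delay must not be negative")

    parser = configparser.ConfigParser()
    parser[section] = {
        "enable_retry_client": "true",
        "retry_attempts": str(retry_attempts),
        "retry_delay": str(retry_delay),
    }

    buf = io.StringIO()
    # Write key=value without spaces, matching the style used in this report.
    parser.write(buf, space_around_delimiters=False)
    return buf.getvalue()


# Values from the test described above: retry_attempts=2, retry_delay=0.
snippet = render_cache_retry_config(2, 0)
print(snippet)
```

A fragment like this would typically be supplied through each service's custom configuration mechanism; how the operators in the linked pull requests actually inject it is not shown here.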

links to

openstack-k8s-operators/cinder-operator#472: Configure keystonemiddleware to deal with memcached pods failures

openstack-k8s-operators/glance-operator#663: Configure keystonemiddleware to deal with memcached pods failures

openstack-k8s-operators/heat-operator#479: Configure dogpile.cache to deal with memcached pods failures

openstack-k8s-operators/keystone-operator#511: Configure dogpile.cache to deal with memcached pods failures

openstack-k8s-operators/manila-operator#367: Configure keystonemiddleware to deal with memcached pods failures

openstack-k8s-operators/neutron-operator#447: Configure keystonemiddleware/oslo to deal with memcached pods failures

openstack-k8s-operators/nova-operator#904: Configure dogpile.cache to deal with memcached pods failures

RHBA-2025:146727 Release of containers for RHOSO OpenStack Podified operator

Assignee: Luca Miccini

Reporter: Luca Miccini

Team: rhos-dfg-pidone

Votes: 0

Watchers: 4

Created: 2024/11/25 3:54 PM

Updated: 2025/03/20 8:18 AM

Resolved: 2025/03/20 8:18 AM
