Type: Bug
Resolution: Done-Errata
Priority: Normal
Fix Version/s: rhos-18.0 FR 2 (Mar 2025)
Affects Version/s: rhos-18.0.4
Component/s: octavia-operator
Labels:
None

Story Points:
1
Blocked:
False
Blocked Reason:
None
Ready:
False
Docs Approval:
?
Fixed in Build:
octavia-operator-container-1.0.7-8
Regression:
None
Release Note Text:

.Fixed stability issue with Load-balancing service health manager in DCN mode
Before this update, when you ran Load-balancing service (octavia) health manager pods in DCN mode, pods were randomly restarted by the operator.
With this update, the random restarts do not occur.

Release Note Type:
Bug Fix
Release Note Status:
Done
Intelligence Requested:
Market:
Errata Link:
https://errata.engineering.redhat.com/advisory/146727

Sprint:
VANS-010
sprint_count:
1
Severity:
Important

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

To Reproduce

Steps to reproduce the behavior:

Deploy RHOSO with Octavia and enable a multi-AZ management network:

spec:
  octavia:
    template:
      lbMgmtNetwork:
        availabilityZoneCIDRs:
          az1: 172.34.0.0/16
          az2: 172.44.0.0/16
        createDefaultLbMgmtNetwork: false

The CIDRs of the AZs are passed to the octavia-healthmanager pods via environment variables:

$ oc get daemonsets.apps octavia-healthmanager  -o yaml | grep -A1 MGMT_CIDR
        - name: MGMT_CIDR
          value: 172.24.0.0/16
        - name: MGMT_CIDR0
          value: 172.34.0.0/16
        - name: MGMT_CIDR1
          value: 172.44.0.0/16

The issue is that the order of those env vars may differ between reconciliation loops. For example, a later reconciliation may produce:

$ oc get daemonsets.apps octavia-healthmanager  -o yaml | grep -A1 MGMT_CIDR
        - name: MGMT_CIDR
          value: 172.24.0.0/16
        - name: MGMT_CIDR0
          value: 172.44.0.0/16
        - name: MGMT_CIDR1
          value: 172.34.0.0/16

When the order changes, the input parameters of the daemonset change, and the pods are recreated.

This behavior is not 100% reproducible and occurs randomly. It can trigger an infinite loop of pod recreation.
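
One way to make the generated variables independent of iteration order is to sort the AZ names before numbering them. The following is a minimal Python sketch of that idea, not the operator's actual code. The function name `mgmt_cidr_env` and its parameters are hypothetical, and the sketch assumes the varying order comes from iterating an unordered mapping of AZ name to CIDR.

```python
from typing import Dict, List


def mgmt_cidr_env(base_cidr: str, az_cidrs: Dict[str, str]) -> List[Dict[str, str]]:
    """Build MGMT_CIDR / MGMT_CIDR<n> entries in an order that depends only on AZ names.

    Hypothetical helper for illustration; the names and signature are assumptions.
    """
    env = [{"name": "MGMT_CIDR", "value": base_cidr}]
    # Sorting the AZ names removes any dependence on the mapping's iteration order.
    for index, az in enumerate(sorted(az_cidrs)):
        env.append({"name": f"MGMT_CIDR{index}", "value": az_cidrs[az]})
    return env


# Same AZ data supplied in two different orders yields the same env list.
first = mgmt_cidr_env("172.24.0.0/16", {"az1": "172.34.0.0/16", "az2": "172.44.0.0/16"})
second = mgmt_cidr_env("172.24.0.0/16", {"az2": "172.44.0.0/16", "az1": "172.34.0.0/16"})
assert first == second
```

With this approach, MGMT_CIDR0 always maps to the AZ whose name sorts first (az1 here), so repeated reconciliations produce an identical list.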

Expected behavior

The input parameters should be stable, and the daemonset should be updated only when necessary.
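
"Only when necessary" can be approximated by comparing a digest of the inputs against the digest from the previous reconciliation. The sketch below is illustrative only and assumes nothing about how the operator detects changes. `env_hash` and `needs_update` are hypothetical names. It also shows why a reordered list counts as a change even though the set of CIDRs is the same.

```python
import hashlib
import json
from typing import Dict, List, Optional


def env_hash(env: List[Dict[str, str]]) -> str:
    """Return a SHA-256 hex digest of the env list, serialized deterministically."""
    # sort_keys fixes the key order inside each entry; the list order is kept as given.
    payload = json.dumps(env, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_update(previous_hash: Optional[str], env: List[Dict[str, str]]) -> bool:
    """Report whether the env list differs from what the previous digest describes."""
    return previous_hash != env_hash(env)


stable = [
    {"name": "MGMT_CIDR0", "value": "172.34.0.0/16"},
    {"name": "MGMT_CIDR1", "value": "172.44.0.0/16"},
]
swapped = [
    {"name": "MGMT_CIDR0", "value": "172.44.0.0/16"},
    {"name": "MGMT_CIDR1", "value": "172.34.0.0/16"},
]

assert not needs_update(env_hash(stable), stable)  # unchanged inputs: no update
assert needs_update(env_hash(stable), swapped)     # swapped values: treated as a change
```

A check like this is only as stable as the list it is given, so the inputs need a fixed order before they are compared.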

Bug impact

octavia-healthmanager may be randomly restarted, which can make Octavia unusable in DCN mode.

Known workaround

Note

Octavia DCN is not officially supported in 18.0.4

links to

openstack-k8s-operators/octavia-operator#444: Fixed order of MGMT_CIDR<n> environment variables

RHBA-2025:146727 Release of containers for RHOSO OpenStack Podified operator

mentioned on

Merge request - Updated US source to: b3f18bf Merge pull request #447 from gthiemonge/18.0-fr1-fix-cidr-order

Assignee: Gregory Thiemonge

Reporter: Gregory Thiemonge

Team: rhos-dfg-networking-squad-vans

Votes: 0

Watchers: 3

Created: 2025/02/11 9:24 AM

Updated: 2025/03/20 8:18 AM

Resolved: 2025/03/20 8:18 AM
