  1. OpenShift Bugs
  2. OCPBUGS-11381

[upgrade] failed to start node network manager: failed to start default network controller: multiple gateway interfaces detected: br-ex

    • Critical
    • No
    • SDN Sprint 234, SDN Sprint 235
    • 2
    • Rejected
    • False

      Description of problem:

      A 4.12 -> 4.13 upgrade on an IPI dual-stack bare-metal (BM) cluster is stuck, with various operators in a degraded state.
      
      # oc get co | grep -v "True.*False.*False"
      NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      authentication                             4.13.0-0.nightly-2023-04-04-034432   True        False         True       31h     APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver (2 containers are waiting in pending apiserver-79885856d5-fqn76 pod)...
      dns                                        4.13.0-0.nightly-2023-04-04-034432   True        True          True       32h     DNS default is degraded
      ingress                                    4.13.0-0.nightly-2023-04-04-034432   True        True          True       28h     The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: DeploymentReplicasAllAvailable=False (DeploymentReplicasNotAvailable: 1/2 of replicas are available)
      kube-storage-version-migrator              4.13.0-0.nightly-2023-04-04-034432   False       True          False      5h7m    KubeStorageVersionMigratorAvailable: Waiting for Deployment
      machine-config                             4.12.10                              False       True          True       4h57m   Cluster not available for [{operator 4.12.10}]: error during waitForDeploymentRollout: [timed out waiting for the condition, deployment machine-config-controller is not ready. status: (replicas: 1, updated: 1, ready: 0, unavailable: 1)]
      monitoring                                 4.13.0-0.nightly-2023-04-04-034432   False       True          True       5h5m    reconciling Prometheus Operator Admission Webhook Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/prometheus-operator-admission-webhook: got 1 unavailable replicas
      network                                    4.13.0-0.nightly-2023-04-04-034432   True        True          True       32h     DaemonSet "/openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - pod ovnkube-node-8dm9z is in CrashLoopBackOff State...
      openshift-apiserver                        4.13.0-0.nightly-2023-04-04-034432   True        False         True       32h     APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver (3 containers are waiting in pending apiserver-54456dcb4c-2wmtb pod)
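
The grep filter above can also be applied column by column, which avoids matching "True" or "False" inside the MESSAGE text. The following is a minimal sketch, assuming the column order shown in the `oc get co` header above; the function name `unhealthy_operators` and the `console` row in the sample are hypothetical and not part of this report.

```python
def unhealthy_operators(table: str) -> list[str]:
    """Return operators whose AVAILABLE/PROGRESSING/DEGRADED is not True/False/False.

    Assumes the column order NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE.
    """
    names = []
    for line in table.splitlines():
        # Split at most 6 times so the free-text MESSAGE column stays in one field.
        fields = line.split(None, 6)
        if len(fields) < 5 or fields[0] == "NAME":
            continue  # skip blank lines and the header row
        name, _version, available, progressing, degraded = fields[:5]
        if (available, progressing, degraded) != ("True", "False", "False"):
            names.append(name)
    return names


# Sample input: the dns row is taken from the output above,
# the console row is a hypothetical healthy operator.
SAMPLE = """\
NAME      VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
dns       4.13.0-0.nightly-2023-04-04-034432   True        True          True       32h     DNS default is degraded
console   4.13.0-0.nightly-2023-04-04-034432   True        False         False      32h
"""

if __name__ == "__main__":
    print(unhealthy_operators(SAMPLE))
```

Unlike `grep -v "True.*False.*False"`, this looks only at the three status columns, so a message that happens to contain "False" cannot hide a degraded operator.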
      
      
      The ovnkube-node pods on the workers are in CrashLoopBackOff (CLBO) state, logging:
      
      
      I0404 15:27:18.754890  532079 gateway_init.go:261] Initializing Gateway Functionality
      I0404 15:27:18.755112  532079 gateway_localnet.go:174] Node local addresses initialized to: map[10.1.235.25:{10.1.235.0 ffffff00} 10.128.0.2:{10.128.0.0 fffffe00} 127.0.0.1:{127.0.0.0 ff000000} ::1:{::1 ffffffffffffffffffffffffffffffff} fd02:0:0:1::2:{fd02:0:0:1:: ffffffffffffffff0000000000000000} fe80::2c63:1eff:fe02:578c:{fe80:: ffffffffffffffff0000000000000000} fe80::80e2:c9ff:fe9f:1231:{fe80:: ffffffffffffffff0000000000000000} fe80::f602:70ff:feb8:d8f0:{fe80:: ffffffffffffffff0000000000000000}]
      I0404 15:27:18.755193  532079 helper_linux.go:69] Provided gateway interface "br-ex", found as index: 12
      I0404 15:27:18.755256  532079 helper_linux.go:94] Found default gateway interface br-ex 10.1.235.254
      I0404 15:27:18.755287  532079 helper_linux.go:69] Provided gateway interface "br-ex", found as index: 12
      I0404 15:27:18.755401  532079 metrics.go:504] Stopping metrics server 127.0.0.1:29103
      I0404 15:27:18.755455  532079 reflector.go:227] Stopping reflector *v1.Pod (0s) from k8s.io/client-go/informers/factory.go:150
      F0404 15:27:18.755482  532079 ovnkube.go:137] failed to start node network manager: failed to start default network controller: multiple gateway interfaces detected: br-ex 
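
To pull the fatal entry out of a longer ovnkube-node log, the lines can be split on the klog header (severity letter, date, time, process ID, `file:line]`). This is a rough sketch that assumes the header layout visible in the excerpt above; the names `KLOG_RE` and `fatal_messages` are hypothetical helpers, not part of ovn-kubernetes.

```python
import re

# Header layout as seen above: "F0404 15:27:18.755482  532079 ovnkube.go:137] message"
KLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<pid>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def fatal_messages(log: str) -> list[str]:
    """Return the message text of every fatal (severity F) line in the log."""
    messages = []
    for raw in log.splitlines():
        match = KLOG_RE.match(raw.strip())  # strip indentation and trailing spaces
        if match and match.group("sev") == "F":
            messages.append(match.group("msg"))
    return messages


# Two lines copied from the excerpt above.
EXCERPT = """\
I0404 15:27:18.755256  532079 helper_linux.go:94] Found default gateway interface br-ex 10.1.235.254
F0404 15:27:18.755482  532079 ovnkube.go:137] failed to start node network manager: failed to start default network controller: multiple gateway interfaces detected: br-ex
"""

if __name__ == "__main__":
    for message in fatal_messages(EXCERPT):
        print(message)
```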
      
      

      Version-Release number of selected component (if applicable):

      4.13.0-0.nightly-2023-04-04-034432

      How reproducible:

      Intermittent

      Steps to Reproduce:

      1. Install an IPI dual-stack BM cluster on 4.12.10
      2. Prepare pre-upgrade test data
      3. Perform the upgrade to 4.13.0-0.nightly-2023-04-04-034432

      Actual results:

      The upgrade failed with various operators in a degraded state

      Expected results:

      Upgrade should be successful

      Additional info:

      On the workers, the br-ex interface is not assigned an IPv6 global or link-local address. The NetworkManager (NM) dispatcher script logs failure and warning messages. Please see the logs for more details.
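
One way to confirm the missing addresses on a worker is to check which IPv6 scopes appear on br-ex. The sketch below assumes the text comes from `ip -6 -o addr show dev br-ex` (this command is an assumption, the report does not state how the addresses were inspected); the helper names and the sample address are hypothetical.

```python
import re

# Matches entries such as "inet6 fe80::1/64 scope link" in `ip -6 -o addr` output.
INET6_RE = re.compile(r"\binet6\s+(?P<addr>\S+)\s+scope\s+(?P<scope>\w+)")


def ipv6_scopes(ip_output: str) -> dict[str, list[str]]:
    """Group the IPv6 addresses found in the output by their scope."""
    scopes: dict[str, list[str]] = {}
    for match in INET6_RE.finditer(ip_output):
        scopes.setdefault(match.group("scope"), []).append(match.group("addr"))
    return scopes


def missing_scopes(ip_output: str, required=("global", "link")) -> list[str]:
    """Return the required scopes that have no address in the output."""
    found = ipv6_scopes(ip_output)
    return [scope for scope in required if scope not in found]


# Hypothetical sample: an interface with a link-local address but no global one.
SAMPLE_LINK_ONLY = (
    "12: br-ex    inet6 fe80::1/64 scope link noprefixroute "
    "\\       valid_lft forever preferred_lft forever\n"
)

if __name__ == "__main__":
    print(missing_scopes(SAMPLE_LINK_ONLY))  # global scope is absent in this sample
    print(missing_scopes(""))                # no IPv6 addresses at all
```

An empty result from `missing_scopes` would mean both a global and a link-local address are present; for the state described above, both scopes would be reported as missing.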

       

            mkennell@redhat.com Martin Kennelly
            anusaxen Anurag Saxena
            Anurag Saxena Anurag Saxena