OCPBUGS-29327

OCP 4.12.46 - MTU migration fails to complete, CNO degraded

    • Important
    • No
    • SDN Sprint 249
    • 1
    • False

      Description of problem:

       - The customer followed the documented steps for changing the MTU on the nodes, and stepped through them together with Red Hat to confirm they were followed correctly:
      https://docs.openshift.com/container-platform/4.12/networking/changing-cluster-network-mtu.html

      • The migration steps were completed, including step 9 (the patch that removes the migration block and sets the new MTU), after which we see this message:
      'Not applying unsafe configuration change: invalid configuration: [cannot change ovn-kubernetes MTU without migration]. Use ''oc edit network.operator.openshift.io cluster'' to undo the change.'
      

      Currently the network.operator.openshift.io object looks like so:

      spec:
        clusterNetwork:
        - cidr: 172.16.0.0/12
          hostPrefix: 23
        defaultNetwork:
          ovnKubernetesConfig:
            egressIPConfig: {}
            gatewayConfig:
              routingViaHost: false
            genevePort: 6081
            mtu: 1400
            policyAuditConfig:
              destination: "null"
              maxFileSize: 50
              rateLimit: 20
              syslogFacility: local0
          type: OVNKubernetes
        deployKubeProxy: false
        disableMultiNetwork: false
        disableNetworkDiagnostics: false
        logLevel: Normal
        managementState: Managed
        observedConfig: null
        operatorLogLevel: Normal
        serviceNetwork:
        - 192.168.0.0/16
        unsupportedConfigOverrides: null
        useMultiNetworkPolicy: false
      status:
        conditions:
        - lastTransitionTime: "2023-11-28T05:43:32Z"
          status: "False"
          type: ManagementStateDegraded
        - lastTransitionTime: "2024-02-09T12:26:02Z"
          message: 'Not applying unsafe configuration change: invalid configuration: [cannot
            change ovn-kubernetes MTU without migration]. Use ''oc edit network.operator.openshift.io
            cluster'' to undo the change.'
          reason: InvalidOperatorConfig
          status: "True"
          type: Degraded
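
To make the failing condition easier to spot in a dump like the one above, here is a minimal sketch that filters `status.conditions` for entries that are currently active. The condition data is copied from the object above; the helper name `active_conditions` is only illustrative and not part of any OpenShift tooling.

```python
# status.conditions as shown in the object above (message joined onto one string).
conditions = [
    {
        "lastTransitionTime": "2023-11-28T05:43:32Z",
        "status": "False",
        "type": "ManagementStateDegraded",
    },
    {
        "lastTransitionTime": "2024-02-09T12:26:02Z",
        "message": (
            "Not applying unsafe configuration change: invalid configuration: "
            "[cannot change ovn-kubernetes MTU without migration]. "
            "Use 'oc edit network.operator.openshift.io cluster' to undo the change."
        ),
        "reason": "InvalidOperatorConfig",
        "status": "True",
        "type": "Degraded",
    },
]


def active_conditions(conds, cond_type):
    """Return conditions of the given type whose status is the string "True"."""
    return [
        c for c in conds
        if c.get("type") == cond_type and c.get("status") == "True"
    ]


degraded = active_conditions(conditions, "Degraded")
for cond in degraded:
    # Print when the condition last changed, why, and the operator's message.
    print(f'{cond["lastTransitionTime"]} {cond["reason"]}: {cond["message"]}')
```

Note that condition statuses are strings ("True"/"False"), so the comparison is against the string and not a boolean.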
      

      Looking at the machine-config update history, we can see that the migration was triggered automatically and `migrate-mtu.sh` was added automatically. The machine-configs generated from butane were added afterwards, and a subsequent machine-config rollout applied them and removed `migrate-mtu.sh`.

      This appears to be a failure limited to the final cluster-level step (removal of the migration block and update of the MTU). The nodes are all at the desired (higher) MTU values.
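
One possible reading of the error, sketched below under stated assumptions: the operator accepts an MTU change only while the previously applied spec still carries a migration block describing that same transition. This is inferred from the error text and the documented procedure, not taken from the operator's code. The key names under `spec.migration.mtu.network` follow the linked documentation; the function name, the second rejection string, and the target MTU value are hypothetical placeholders.

```python
# Hypothetical placeholder: the report only says the target MTU is higher than 1400.
HYPOTHETICAL_TARGET_MTU = 8900


def mtu_change_rejection(applied_spec, requested_spec):
    """Return a reason string if the MTU change looks unsafe, else None.

    Assumed rule (an inference, not operator code): the applied spec must
    contain spec.migration.mtu.network with matching from/to values.
    """
    def ovn_mtu(spec):
        return spec["defaultNetwork"]["ovnKubernetesConfig"]["mtu"]

    old, new = ovn_mtu(applied_spec), ovn_mtu(requested_spec)
    if old == new:
        return None  # MTU unchanged, nothing to validate

    migration = applied_spec.get("migration") or {}
    network = (migration.get("mtu") or {}).get("network")
    if not network:
        return "cannot change ovn-kubernetes MTU without migration"
    if network.get("from") != old or network.get("to") != new:
        # Hypothetical wording; the real operator message may differ.
        return "migration block does not match the requested MTU change"
    return None


# Applied spec as dumped in this report: mtu 1400 and no migration block.
applied = {"defaultNetwork": {"ovnKubernetesConfig": {"mtu": 1400}}}
# Same spec, but with the migration block still in place.
applied_with_migration = {
    "defaultNetwork": {"ovnKubernetesConfig": {"mtu": 1400}},
    "migration": {
        "mtu": {"network": {"from": 1400, "to": HYPOTHETICAL_TARGET_MTU}}
    },
}
requested = {
    "defaultNetwork": {"ovnKubernetesConfig": {"mtu": HYPOTHETICAL_TARGET_MTU}}
}

print(mtu_change_rejection(applied, requested))
print(mtu_change_rejection(applied_with_migration, requested))
```

If this reading is right, the state in this report matches the first case: the migration block is already gone while the MTU is still 1400, so any later attempt to raise the MTU is refused.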

      Currently blocked/cluster impacted as a result of mismatched MTU.

      Version-Release number of selected component (if applicable):

       

      How reproducible:
       

      Steps to Reproduce:

      1. Follow documentation to increase MTU on target cluster

      2. Validate configurations and files + apply them as outlined

      3. Observe the cluster stall and CNO become degraded: the cluster MTU is still at the reduced value because the migration block is missing (it was removed by the patch command, but the MTU change was not applied).

       

      Actual results:

      The cluster is degraded: CNO reports Degraded with reason InvalidOperatorConfig.
       

      Expected results:

      The MTU migration completes successfully.

      Additional info:

      additional data to follow in next update with template request.

              jcaamano@redhat.com Jaime Caamaño Ruiz
              rhn-support-wrussell Will Russell
              Anurag Saxena Anurag Saxena