OCPBUGS-43919

Setting ingresscontroller status.endpointPublishingStrategy or status.endpointPublishingStrategy.loadBalancer to null causes ingress operator panic


    • Moderate
    • No
    • Rejected
    • False

      Description of problem:

          This is an edge case, since we only update the ingresscontroller status directly for testing purposes, but it still needs a fix because it causes a panic. So far, setting `status.endpointPublishingStrategy` or `status.endpointPublishingStrategy.loadBalancer` to null has been found to make the ingress operator panic.
      
      

      Version-Release number of selected component (if applicable):

          4.18.0-0.nightly-2024-10-29-001120

      How reproducible:

          100%

      Steps to Reproduce:

          1. Update the ingresscontroller status on an OCP cluster running on a cloud provider.
      
      $ oc patch -n openshift-ingress-operator ingresscontrollers/extlb --subresource status --type=merge --patch='{"status":{"endpointPublishingStrategy":{"loadBalancer":null}}}'
      
      or
      
      $ oc patch -n openshift-ingress-operator ingresscontrollers/extlb --subresource status --type=merge --patch='{"status":{"endpointPublishingStrategy":null}}'
      
          2. Check the ingress operator status and logs.
          

      Actual results:

          $ oc -n openshift-ingress-operator get pod
      NAME                                READY   STATUS             RESTARTS       AGE
      ingress-operator-5565b8dc9d-cs69l   1/2     CrashLoopBackOff   8 (119s ago)   67m
      
      $ oc -n openshift-ingress-operator logs ingress-operator-5565b8dc9d-cs69l
      panic: runtime error: invalid memory address or nil pointer dereference [recovered]
          panic: runtime error: invalid memory address or nil pointer dereference
      [signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x1a7b493]

      goroutine 1681 [running]:
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
          /ingress-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:111 +0x1e5
      panic({0x2885e40?, 0x4bd4e00?})
          /usr/lib/golang/src/runtime/panic.go:770 +0x132
      github.com/openshift/cluster-ingress-operator/pkg/resources/dnsrecord.desiredWildcardDNSRecord({{0xc00408e240?, 0x45dc98?}, {0xc004408f80?, 0x7ff32e36c498?}}, 0xc000a6be30, {{0xc0039c4300, 0x18}, {0x2d6102f, 0x11}, {0xc003fd0d56, ...}, ...}, ...)
          /ingress-operator/pkg/resources/dnsrecord/dns.go:114 +0x113
      github.com/openshift/cluster-ingress-operator/pkg/resources/dnsrecord.EnsureWildcardDNSRecord({0x343ce60, 0xc0012ee000}, {{0xc00408e240?, 0x0?}, {0xc004408f80?, 0x0?}}, 0xc003fd0aa0?, {{0xc0039c4300, 0x18}, {0x2d6102f, ...}, ...}, ...)
          /ingress-operator/pkg/resources/dnsrecord/dns.go:42 +0xdd
      github.com/openshift/cluster-ingress-operator/pkg/operator/controller/ingress.(*reconciler).ensureIngressController(0xc0012bf740, 0xc001336608, 0xc003791760, 0xc00012a008, 0xc003792b80, 0xc002ed49c0, 0xc003f60340, 0xc00012a248, 0xc002ed4b60)
          /ingress-operator/pkg/operator/controller/ingress/controller.go:1107 +0xe45
      github.com/openshift/cluster-ingress-operator/pkg/operator/controller/ingress.(*reconciler).Reconcile(0xc0012bf740, {0x3428908, 0xc0042739e0}, {{{0xc00032db20, 0x1a}, {0xc00140ab30, 0x5}}})
          /ingress-operator/pkg/operator/controller/ingress/controller.go:328 +0xcb5
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x3430aa0?, {0x3428908?, 0xc0042739e0?}, {{{0xc00032db20?, 0xb?}, {0xc00140ab30?, 0x0?}}})
      
      
      

      Expected results:

          The ingress operator does not panic.
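
          The stack trace ends in desiredWildcardDNSRecord (dns.go:114), but the exact expression that is dereferenced is not shown in this report. As a minimal sketch only, assuming the crash comes from following the status.endpointPublishingStrategy.loadBalancer path without a nil check, a guard could look like the following. The struct types and the helper dnsManagementPolicy are hypothetical stand-ins modeled on the field names used in the patches above; they are not the operator's actual code.

```go
package main

import "fmt"

// Hypothetical, trimmed-down stand-ins for the ingresscontroller status
// types. Only the pointer fields relevant to this report are modeled.
type LoadBalancerStrategy struct {
	DNSManagementPolicy string
}

type EndpointPublishingStrategy struct {
	Type         string
	LoadBalancer *LoadBalancerStrategy
}

type IngressControllerStatus struct {
	EndpointPublishingStrategy *EndpointPublishingStrategy
}

// dnsManagementPolicy (hypothetical helper) returns the policy and true when
// the whole path is populated, and "" and false when either pointer on the
// path is nil, instead of dereferencing a nil pointer.
func dnsManagementPolicy(status IngressControllerStatus) (string, bool) {
	eps := status.EndpointPublishingStrategy
	if eps == nil || eps.LoadBalancer == nil {
		return "", false
	}
	return eps.LoadBalancer.DNSManagementPolicy, true
}

func main() {
	cases := map[string]IngressControllerStatus{
		"endpointPublishingStrategy is null": {},
		"loadBalancer is null": {
			EndpointPublishingStrategy: &EndpointPublishingStrategy{Type: "LoadBalancerService"},
		},
		"fully populated": {
			EndpointPublishingStrategy: &EndpointPublishingStrategy{
				Type:         "LoadBalancerService",
				LoadBalancer: &LoadBalancerStrategy{DNSManagementPolicy: "Managed"},
			},
		},
	}
	for name, status := range cases {
		policy, ok := dnsManagementPolicy(status)
		fmt.Printf("%s: policy=%q set=%v\n", name, policy, ok)
	}
}
```

          With a guard of this kind, the caller can skip or defer the DNS record work when the status is incomplete, rather than crashing the reconcile loop.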

      Additional info:

          A similar test with other fields of status.endpointPublishingStrategy showed different behavior. No panic was observed, but these cases might need to be investigated as well.
      
      1) Setting the field `.loadBalancer.dnsManagementPolicy` to null reports "no change", and the field is still present in the status.
      
      2) The fields `.providerParameters.aws.classicLoadBalancer` (AWS CLB) and `.providerParameters.aws.networkLoadBalancer` (AWS NLB) can be set to null and cannot be restored; they are eventually removed from the status.
      
      3) For hostNetwork, the fields `httpPort`, `httpsPort`, `protocol`, and `statsPort` can be set to null and can be restored, but patching them again reports no change.
      
      // first update
      $ oc patch -n openshift-ingress-operator ingresscontrollers/extlb --subresource status --type=merge --patch='{"status":{"endpointPublishingStrategy":{"hostNetwork":{"statsPort":null}}}}'
      ingresscontroller.operator.openshift.io/extlb patched
      
      // second update
      ingresscontroller.operator.openshift.io/extlb patched (no change)   
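
      For reference, `--type=merge` sends a JSON merge patch (RFC 7386), in which a null value means "remove this key". The sketch below is a simplified illustration of those semantics on plain JSON, not the API server's implementation. The helper names mergePatch and applyPatch and the port values in the sample document are made up for the example. It only shows that a null patch against an already-absent key changes nothing; it does not by itself explain the behaviors listed above, where the operator may rewrite the status between patches.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergePatch applies RFC 7386 semantics to decoded JSON objects:
// a null patch value deletes the key, an object is merged recursively,
// and any other value replaces the target value.
func mergePatch(target, patch map[string]interface{}) map[string]interface{} {
	if target == nil {
		target = map[string]interface{}{}
	}
	for key, value := range patch {
		if value == nil {
			delete(target, key)
			continue
		}
		patchObj, isObj := value.(map[string]interface{})
		if !isObj {
			target[key] = value
			continue
		}
		// A missing or non-object target value is treated as an empty object.
		targetObj, _ := target[key].(map[string]interface{})
		target[key] = mergePatch(targetObj, patchObj)
	}
	return target
}

// applyPatch (hypothetical helper) decodes both documents, merges them,
// and returns the result as JSON with keys in sorted order.
func applyPatch(doc, patch string) (string, error) {
	var target, p map[string]interface{}
	if err := json.Unmarshal([]byte(doc), &target); err != nil {
		return "", err
	}
	if err := json.Unmarshal([]byte(patch), &p); err != nil {
		return "", err
	}
	out, err := json.Marshal(mergePatch(target, p))
	return string(out), err
}

func main() {
	// Sample status; the port numbers are illustrative only.
	doc := `{"status":{"endpointPublishingStrategy":{"type":"HostNetwork","hostNetwork":{"httpPort":80,"statsPort":1936}}}}`
	patch := `{"status":{"endpointPublishingStrategy":{"hostNetwork":{"statsPort":null}}}}`

	first, err := applyPatch(doc, patch)
	if err != nil {
		panic(err)
	}
	second, err := applyPatch(first, patch)
	if err != nil {
		panic(err)
	}
	fmt.Println("after first patch: ", first)
	fmt.Println("after second patch:", second)
	fmt.Println("second patch changed anything:", first != second)
}
```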

       

              rh-ee-gpiotrow Grzegorz Piotrowski
              rhn-support-hongli Hongan Li
              Hongan Li Hongan Li
              Grant Spence