  1. OpenShift Bugs
  2. OCPBUGS-20210

Invalid egressIP object caused ovnkube-node pods CLBO


Details

    • Low
    • No
    • SDN Sprint 243, SDN Sprint 244
    • 2
    • Rejected
    • False
    • Release Note Not Required
    • In Progress

    Description

      Description of problem:

      An invalid EgressIP object (empty label key and value in its selectors) caused the ovnkube-node pods to enter CrashLoopBackOff (CLBO).
      

      Version-Release number of selected component (if applicable):

      4.14.0-0.nightly-2023-10-05-195247
      

      How reproducible:

      Always
      
      

      Steps to Reproduce:

      1. Label one node as an egress node.
      2. Create an EgressIP object with an empty label key and value in both namespaceSelector and podSelector:
      oc get egressip -o yaml
      apiVersion: v1
      items:
      - apiVersion: k8s.ovn.org/v1
        kind: EgressIP
        metadata:
          creationTimestamp: "2023-10-07T09:08:28Z"
          generation: 2
          name: egressip-test
          resourceVersion: "122021"
          uid: 23445450-37d5-4ec3-b8fe-d8352a19e703
        spec:
          egressIPs:
          - 10.0.70.100
          namespaceSelector:
            matchLabels:
              "": ""
          podSelector:
            matchLabels:
              "": ""
        status:
          items:
          - egressIP: 10.0.70.100
            node: ip-10-0-70-135
      kind: List
      metadata:
        resourceVersion: ""
      
      3. Create a namespace and test pods.
      
      

      Actual results:

      Test pods were stuck in ContainerCreating status:
      % oc get pods -n hrw
      NAME            READY   STATUS              RESTARTS   AGE
      test-rc-hwmns   0/1     ContainerCreating   0          45s
      test-rc-p9kl8   0/1     ContainerCreating   0          45s
       % oc describe pod test-rc-hwmns   -n hrw
      Name:             test-rc-hwmns
      Namespace:        hrw
      Priority:         0
      Service Account:  default
      Node:             ip-10-0-70-125/10.0.70.125
      Start Time:       Sat, 07 Oct 2023 17:08:50 +0800
      Labels:           name=test-pods
      Annotations:      k8s.ovn.org/pod-networks:
                          {"default":{"ip_addresses":["10.129.2.11/23"],"mac_address":"0a:58:0a:81:02:0b","gateway_ips":["10.129.2.1"],"routes":[{"dest":"10.128.0.0...
                        openshift.io/scc: restricted-v2
                        seccomp.security.alpha.kubernetes.io/pod: runtime/default
      Status:           Pending
      IP:               
      IPs:              <none>
      Controlled By:    ReplicationController/test-rc
      Containers:
        test-pod:
          Container ID:   
          Image:          quay.io/openshifttest/hello-sdn@sha256:c89445416459e7adea9a5a416b3365ed3d74f2491beb904d61dc8d1eb89a72a4
          Image ID:       
          Port:           <none>
          Host Port:      <none>
          State:          Waiting
            Reason:       ContainerCreating
          Ready:          False
          Restart Count:  0
          Limits:
            memory:  340Mi
          Requests:
            memory:     340Mi
          Environment:  <none>
          Mounts:
            /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-7vlz8 (ro)
      Conditions:
        Type              Status
        Initialized       True 
        Ready             False 
        ContainersReady   False 
        PodScheduled      True 
      Volumes:
        kube-api-access-7vlz8:
          Type:                    Projected (a volume that contains injected data from multiple sources)
          TokenExpirationSeconds:  3607
          ConfigMapName:           kube-root-ca.crt
          ConfigMapOptional:       <nil>
          DownwardAPI:             true
          ConfigMapName:           openshift-service-ca.crt
          ConfigMapOptional:       <nil>
      QoS Class:                   Burstable
      Node-Selectors:              <none>
      Tolerations:                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                                   node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                   node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
      Events:
        Type     Reason                  Age   From               Message
        ----     ------                  ----  ----               -------
        Normal   Scheduled               59s   default-scheduler  Successfully assigned hrw/test-rc-hwmns to ip-10-0-70-125
        Warning  FailedCreatePodSandBox  59s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_test-rc-hwmns_hrw_d72a4216-b94b-4034-a9f7-526758055994_0(1ad74472b9e985cee4a3081f5912b3d4553351d14764d3bfece1d174146f90ca): error adding pod hrw_test-rc-hwmns to CNI network "multus-cni-network": plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400: '&{ContainerID:1ad74472b9e985cee4a3081f5912b3d4553351d14764d3bfece1d174146f90ca Netns:/var/run/netns/131f3670-1a49-4088-9002-5624a3acc6d3 IfName:eth0 Args:IgnoreUnknown=1;K8S_POD_NAMESPACE=hrw;K8S_POD_NAME=test-rc-hwmns;K8S_POD_INFRA_CONTAINER_ID=1ad74472b9e985cee4a3081f5912b3d4553351d14764d3bfece1d174146f90ca;K8S_POD_UID=d72a4216-b94b-4034-a9f7-526758055994 Path: StdinData:[123 34 98 105 110 68 105 114 34 58 34 47 118 97 114 47 108 105 98 47 99 110 105 47 98 105 110 34 44 34 99 104 114 111 111 116 68 105 114 34 58 34 47 104 111 115 116 114 111 111 116 34 44 34 99 108 117 115 116 101 114 78 101 116 119 111 114 107 34 58 34 47 104 111 115 116 47 114 117 110 47 109 117 108 116 117 115 47 99 110 105 47 110 101 116 46 100 47 49 48 45 111 118 110 45 107 117 98 101 114 110 101 116 101 115 46 99 111 110 102 34 44 34 99 110 105 67 111 110 102 105 103 68 105 114 34 58 34 47 104 111 115 116 47 101 116 99 47 99 110 105 47 110 101 116 46 100 34 44 34 99 110 105 86 101 114 115 105 111 110 34 58 34 48 46 51 46 49 34 44 34 100 97 101 109 111 110 83 111 99 107 101 116 68 105 114 34 58 34 47 114 117 110 47 109 117 108 116 117 115 47 115 111 99 107 101 116 34 44 34 103 108 111 98 97 108 78 97 109 101 115 112 97 99 101 115 34 58 34 100 101 102 97 117 108 116 44 111 112 101 110 115 104 105 102 116 45 109 117 108 116 117 115 44 111 112 101 110 115 104 105 102 116 45 115 114 105 111 118 45 110 101 116 119 111 114 107 45 111 112 101 114 97 116 111 114 34 44 34 108 111 103 76 101 118 101 108 34 58 34 118 101 114 98 
111 115 101 34 44 34 108 111 103 84 111 83 116 100 101 114 114 34 58 116 114 117 101 44 34 109 117 108 116 117 115 65 117 116 111 99 111 110 102 105 103 68 105 114 34 58 34 47 104 111 115 116 47 114 117 110 47 109 117 108 116 117 115 47 99 110 105 47 110 101 116 46 100 34 44 34 109 117 108 116 117 115 67 111 110 102 105 103 70 105 108 101 34 58 34 97 117 116 111 34 44 34 110 97 109 101 34 58 34 109 117 108 116 117 115 45 99 110 105 45 110 101 116 119 111 114 107 34 44 34 110 97 109 101 115 112 97 99 101 73 115 111 108 97 116 105 111 110 34 58 116 114 117 101 44 34 112 101 114 78 111 100 101 67 101 114 116 105 102 105 99 97 116 101 34 58 123 34 98 111 111 116 115 116 114 97 112 75 117 98 101 99 111 110 102 105 103 34 58 34 47 104 111 115 116 114 111 111 116 47 118 97 114 47 108 105 98 47 107 117 98 101 108 101 116 47 107 117 98 101 99 111 110 102 105 103 34 44 34 99 101 114 116 68 105 114 34 58 34 47 101 116 99 47 99 110 105 47 109 117 108 116 117 115 47 99 101 114 116 115 34 44 34 99 101 114 116 68 117 114 97 116 105 111 110 34 58 34 50 52 104 34 44 34 101 110 97 98 108 101 100 34 58 116 114 117 101 125 44 34 115 111 99 107 101 116 68 105 114 34 58 34 47 104 111 115 116 47 114 117 110 47 109 117 108 116 117 115 47 115 111 99 107 101 116 34 44 34 116 121 112 101 34 58 34 109 117 108 116 117 115 45 115 104 105 109 34 125]} ContainerID:"1ad74472b9e985cee4a3081f5912b3d4553351d14764d3bfece1d174146f90ca" Netns:"/var/run/netns/131f3670-1a49-4088-9002-5624a3acc6d3" IfName:"eth0" Args:"IgnoreUnknown=1;K8S_POD_NAMESPACE=hrw;K8S_POD_NAME=test-rc-hwmns;K8S_POD_INFRA_CONTAINER_ID=1ad74472b9e985cee4a3081f5912b3d4553351d14764d3bfece1d174146f90ca;K8S_POD_UID=d72a4216-b94b-4034-a9f7-526758055994" Path:"" ERRORED: error configuring pod [hrw/test-rc-hwmns] networking: [hrw/test-rc-hwmns/d72a4216-b94b-4034-a9f7-526758055994:ovn-kubernetes]: error adding container to network "ovn-kubernetes": failed to send CNI request: Post "http://dummy/": dial unix 
/var/run/ovn-kubernetes/cni//ovn-cni-server.sock: connect: connection refused
      '
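
The StdinData field in the event above is printed as a run of decimal byte values, which hides the CNI configuration it carries. The following is a minimal sketch for turning such a run back into text; the helper name `decode_stdin_data` is hypothetical, and the sample is only the first few bytes copied from the event, not the whole array.

```python
def decode_stdin_data(decimal_bytes: str) -> str:
    """Turn a whitespace-separated run of decimal byte values into text."""
    values = [int(token) for token in decimal_bytes.split()]
    return bytes(values).decode("utf-8")


# First bytes of the StdinData array from the FailedCreatePodSandBox event.
sample = (
    "123 34 98 105 110 68 105 114 34 58 34 47 118 97 114 47 "
    "108 105 98 47 99 110 105 47 98 105 110 34"
)

print(decode_stdin_data(sample))  # prints {"binDir":"/var/lib/cni/bin"
```

Splitting on any whitespace means the array can be pasted as-is even when the log viewer has wrapped it across several lines.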
       
      % oc get pods -n openshift-ovn-kubernetes                         
      NAME                                     READY   STATUS             RESTARTS        AGE
      ovnkube-control-plane-85f96b444b-2bdwf   2/2     Running            0               5h27m
      ovnkube-control-plane-85f96b444b-2mhfj   2/2     Running            0               5h27m
      ovnkube-control-plane-85f96b444b-ddjhx   2/2     Running            0               5h27m
      ovnkube-node-5fkb5                       7/8     CrashLoopBackOff   6 (2m52s ago)   13m
      ovnkube-node-p7qvr                       7/8     CrashLoopBackOff   6 (2m56s ago)   13m
      ovnkube-node-tzhlb                       7/8     CrashLoopBackOff   6 (2m51s ago)   13m
      ovnkube-node-x5849                       7/8     CrashLoopBackOff   6 (2m57s ago)   13m
      ovnkube-node-xscbr                       7/8     CrashLoopBackOff   6 (2m35s ago)   13m
      
          exec /usr/bin/ovnkube --init-ovnkube-controller "${K8S_NODE}" --init-node "${K8S_NODE}" \
              --config-file=/run/ovnkube-config/ovnkube.conf \
              --ovn-empty-lb-events \
              --loglevel "${OVN_KUBE_LOG_LEVEL}" \
              --inactivity-probe="${OVN_CONTROLLER_INACTIVITY_PROBE}" \
              ${gateway_mode_flags} \
              ${node_mgmt_port_netdev_flags} \
              --metrics-bind-address "127.0.0.1:29103" \
              --ovn-metrics-bind-address "127.0.0.1:29105" \
              --metrics-enable-pprof \
              --metrics-enable-config-duration \
              --export-ovs-metrics \
              --disable-snat-multiple-gws \
              ${export_network_flows_flags} \
              ${multi_network_enabled_flag} \
              ${multi_network_policy_enabled_flag} \
              ${admin_network_policy_enabled_flag} \
              --enable-multicast \
              --zone ${K8S_NODE} \
              --enable-interconnect \
              --acl-logging-rate-limit "20" \
              ${gw_interface_flag} \
              --enable-multi-external-gateway=true \
              ${ip_forwarding_flag} \
              ${NETWORK_NODE_IDENTITY_ENABLE}
            
          State:       Waiting
            Reason:    CrashLoopBackOff
          Last State:  Terminated
            Reason:    Error
            Message:   vn-kubernetes/go-controller/pkg/retry.(*RetryFramework).WatchResourceFiltered.func1.1({0xc0007cb368, 0x11})
                       /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/retry/obj_retry.go:531 +0x2c7
      github.com/ovn-org/ovn-kubernetes/go-controller/pkg/retry.(*RetryFramework).DoWithLock(0xc000d4eb40, {0xc0007cb368, 0x11}, 0xc000e43dd0)
        /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/retry/obj_retry.go:137 +0xce
      github.com/ovn-org/ovn-kubernetes/go-controller/pkg/retry.(*RetryFramework).WatchResourceFiltered.func1({0x22eede0, 0xc000c6fec0})
        /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/retry/obj_retry.go:504 +0x265
      k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(...)
        /go/src/github.com/openshift/ovn-kubernetes/go-controller/vendor/k8s.io/client-go/tools/cache/controller.go:243
      k8s.io/client-go/tools/cache.FilteringResourceEventHandler.OnAdd({0xc00111bdc0?, {0x26d0aa0?, 0xc001580570?}}, {0x22eede0, 0xc000c6fec0}, 0xa0?)
        /go/src/github.com/openshift/ovn-kubernetes/go-controller/vendor/k8s.io/client-go/tools/cache/controller.go:306 +0x6e
      github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*Handler).OnAdd(...)
        /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:52
      github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.newQueuedInformer.func1.1(0xc000e43da0?)
      
            Exit Code:    2
            Started:      Sat, 07 Oct 2023 17:14:38 +0800
            Finished:     Sat, 07 Oct 2023 17:14:39 +0800
          Ready:          False
          Restart Count:  6
          Requests:
            cpu:      10m
            memory:   600Mi
      
      
      
      

      Expected results:

      Add validation for the selector labels? An EgressIP whose label key is empty should be rejected on apply, with a warning that the key must not be empty.
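
The kind of check requested above can be sketched as a pre-apply lint over the manifest. This is only an illustration under stated assumptions: the function name `empty_key_selectors` is hypothetical, the manifest is assumed to be already parsed into a dictionary, and this is not the fix adopted in ovn-kubernetes or the API server.

```python
def empty_key_selectors(egressip: dict) -> list[str]:
    """Return one message per selector whose matchLabels has an empty key."""
    problems = []
    spec = egressip.get("spec") or {}
    for selector_name in ("namespaceSelector", "podSelector"):
        match_labels = (spec.get(selector_name) or {}).get("matchLabels") or {}
        if "" in match_labels:
            problems.append(
                f"spec.{selector_name}.matchLabels: label key must not be empty"
            )
    return problems


# The spec from the reproduction steps, as a parsed manifest.
bad_egressip = {
    "apiVersion": "k8s.ovn.org/v1",
    "kind": "EgressIP",
    "metadata": {"name": "egressip-test"},
    "spec": {
        "egressIPs": ["10.0.70.100"],
        "namespaceSelector": {"matchLabels": {"": ""}},
        "podSelector": {"matchLabels": {"": ""}},
    },
}

for problem in empty_key_selectors(bad_egressip):
    print(problem)
```

Run against the object from this report, the sketch flags both selectors. A complete check would also need to enforce the rest of the Kubernetes label key syntax, which is left out here.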
      
      

      Additional info:

      
      

          People

            pdiak@redhat.com Patryk Diak
            huirwang Huiran Wang
            Huiran Wang Huiran Wang