OpenShift Bugs: OCPBUGS-25353

egressIP can not be applied to node with egress-assignable label on ROSA hosted cluster

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • None
    • 4.14.z
    • None
    • No
    • SDN Sprint 248
    • 1
    • Rejected
    • False

      Description of problem:

      An EgressIP cannot be assigned to a node carrying the egress-assignable label on a ROSA hosted cluster.

      Version-Release number of selected component (if applicable):

      4.14.5

      How reproducible:

       

      Steps to Reproduce:

      1. Create a ROSA hosted cluster, then create a new machine pool with the k8s.ovn.org/egress-assignable=true label

       

      $ rosa create machinepool -c 283brr91hrb4onbamfobsk4bjq268hm8 --name=mp-61582 --labels=k8s.ovn.org/egress-assignable=true --replicas=2
      I: Fetching instance types
      I: Machine pool 'mp-61582' created successfully on hosted cluster '283brr91hrb4onbamfobsk4bjq268hm8'
      I: To view all machine pools, run 'rosa list machinepools -c 283brr91hrb4onbamfobsk4bjq268hm8'

       

      $ rosa list machinepools -c jechen-03-hcp
      ID        AUTOSCALING  REPLICAS  INSTANCE TYPE  LABELS                                TAINTS    AVAILABILITY ZONE  SUBNET                    VERSION  AUTOREPAIR  
      mp-61582  No           0/2       m5.xlarge      k8s.ovn.org/egress-assignable=true              us-west-2a         subnet-0d6d14c874b3803eb  4.14.5   Yes         
      workers   Yes          2/2-2     m5.xlarge      

       

      $ oc get node --show-labels | grep egress
      ip-10-0-132-68.us-west-2.compute.internal    Ready    worker   96s    v1.27.6+d548052   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-west-2,failure-domain.beta.kubernetes.io/zone=us-west-2a,hypershift.openshift.io/managed=true,hypershift.openshift.io/nodePool=jechen-03-hcp-mp-61582,k8s.ovn.org/egress-assignable=true,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-132-68.us-west-2.compute.internal,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m5.xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-west-2a,topology.kubernetes.io/region=us-west-2,topology.kubernetes.io/zone=us-west-2a
      ip-10-0-143-229.us-west-2.compute.internal   Ready    worker   2m2s   v1.27.6+d548052   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-west-2,failure-domain.beta.kubernetes.io/zone=us-west-2a,hypershift.openshift.io/managed=true,hypershift.openshift.io/nodePool=jechen-03-hcp-mp-61582,k8s.ovn.org/egress-assignable=true,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-143-229.us-west-2.compute.internal,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m5.xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-west-2a,topology.kubernetes.io/region=us-west-2,topology.kubernetes.io/zone=us-west-2a

       

      2. Create an egressIP object

      $ cat config_egressip1_ovn_ns_team_red_rosa.yaml
      apiVersion: k8s.ovn.org/v1
      kind: EgressIP
      metadata:
        name: egressip1
      spec:
        egressIPs:
        - 10.0.128.201 
        namespaceSelector:
          matchLabels:
            team: red 

       

      $ oc create -f  config_egressip1_ovn_ns_team_red_rosa.yaml
      egressip.k8s.ovn.org/egressip1 created

      $ oc get egressips.k8s.ovn.org 
      NAME        EGRESSIPS      ASSIGNED NODE   ASSIGNED EGRESSIPS
      egressip1   10.0.128.201                   
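When several EgressIP objects exist, the unassigned ones can be picked out of the table above mechanically. The following is a minimal sketch, not part of the original report: `unassigned_egressips` is a hypothetical helper name, and it assumes the default four-column table layout shown above, where an unassigned object leaves both ASSIGNED columns empty.

```shell
# Hypothetical helper (not from the report): print the names of EgressIP
# objects that have no assigned node.
# Reads the default table output of `oc get egressips.k8s.ovn.org` on stdin.
unassigned_egressips() {
  # Skip the header row. A row with only two fields (NAME and EGRESSIPS)
  # has empty ASSIGNED NODE and ASSIGNED EGRESSIPS columns.
  awk 'NR > 1 && NF == 2 { print $1 }'
}
```

Usage, assuming a logged-in `oc` session: `oc get egressips.k8s.ovn.org | unassigned_egressips`.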

      3.

       

      Actual results:

      The EgressIP was not assigned to any egress node; the ASSIGNED NODE and ASSIGNED EGRESSIPS columns stay empty.

      Expected results:

      The EgressIP should be assigned to one of the egress-assignable nodes.
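One thing worth ruling out while triaging an unassigned EgressIP is whether the requested address lies inside the egress node's subnet, since OVN-Kubernetes only assigns an egress IP to a node whose network contains it. On AWS that subnet is normally published in the node's `cloud.network.openshift.io/egress-ipconfig` annotation. The sketch below is an illustration only: the helper names are hypothetical, and the CIDR in the usage note is an assumed example because the report does not state the subnet range. It does not claim to identify the root cause of this bug.

```shell
# Hypothetical helpers (not from the report), written for POSIX sh.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  # Split the address on '.' into the positional parameters $1..$4.
  old_ifs=$IFS
  IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed (exit status 0) if address $1 lies inside CIDR $2.
ip_in_cidr() {
  net=${2%/*}    # network address, the part before the slash
  bits=${2#*/}   # prefix length, the part after the slash
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}
```

Usage, with an assumed subnet of 10.0.128.0/20 standing in for the real value: `ip_in_cidr 10.0.128.201 10.0.128.0/20 && echo "inside subnet"`. Substitute the CIDR actually reported for the node before drawing any conclusion.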

      Additional info:

      Please fill in the following template while reporting a bug and provide as much relevant information as possible. Doing so will give us the best chance to find a prompt resolution.

      Affected Platforms:

      Is it an

      1. internal CI failure 
      2. customer issue / SD
      3. internal Red Hat testing failure

       

      If it is an internal Red Hat testing failure:

      • Please share a kubeconfig or creds to a live cluster for the assignee to debug/troubleshoot along with reproducer steps (especially if it's a telco use case like ICNI, secondary bridges or BM+kubevirt).

       

      If it is a CI failure:

       

      • Did it happen in different CI lanes? If so please provide links to multiple failures with the same error instance
      • Did it happen in both sdn and ovn jobs? If so please provide links to multiple failures with the same error instance
      • Did it happen in other platforms (e.g. aws, azure, gcp, baremetal etc) ? If so please provide links to multiple failures with the same error instance
      • When did the failure start happening? Please provide the UTC timestamp of the networking outage window from a sample failure run
      • If it's a connectivity issue:
        • What is the srcNode, srcIP and srcNamespace and srcPodName?
        • What is the dstNode, dstIP and dstNamespace and dstPodName?
        • What is the traffic path? (examples: pod2pod? pod2external?, pod2svc? pod2Node? etc)

       

      If it is a customer / SD issue:

       

      • Provide enough information in the bug description that Engineering doesn't need to read the entire case history.
      • Don't presume that Engineering has access to Salesforce.
      • Please provide must-gather and sos-report with an exact link to the comment in the support case with the attachment.  The format should be: https://access.redhat.com/support/cases/#/case/<case number>/discussion?attachmentId=<attachment id>
      • Describe what each attachment is intended to demonstrate (failed pods, log errors, OVS issues, etc).  
      • Referring to the attached must-gather, sosreport or other attachment, please provide the following details:
        • If the issue is in a customer namespace then provide a namespace inspect.
        • If it is a connectivity issue:
          • What is the srcNode, srcNamespace, srcPodName and srcPodIP?
          • What is the dstNode, dstNamespace, dstPodName and  dstPodIP?
          • What is the traffic path? (examples: pod2pod? pod2external?, pod2svc? pod2Node? etc)
          • Please provide the UTC timestamp of the networking outage window from the must-gather
          • Please provide tcpdump pcaps taken during the outage filtered based on the above provided src/dst IPs
        • If it is not a connectivity issue:
          • Describe the steps taken so far to analyze the logs from networking components (cluster-network-operator, OVNK, SDN, openvswitch, ovs-configure etc) and the actual component where the issue was seen based on the attached must-gather. Please attach snippets of relevant logs, if any, from around the window when the problem happened.
      • For OCPBUGS in which the issue has been identified, label with "sbr-triaged"
      • For OCPBUGS in which the issue has not been identified and needs Engineering help for root cause, label with "sbr-untriaged"
      • Note: bugs that do not meet these minimum standards will be closed with label "SDN-Jira-template"

              pdiak@redhat.com Patryk Diak
              jechen@redhat.com Jean Chen
              Jean Chen