  1. Red Hat Advanced Cluster Management
  2. ACM-17581

Submariner 0.19.2 E2E is failing the gateway redundancy test in Jenkins with OCP 4.17

    • Important

      Jan 30 21:08:28.770: Ensure active gateway node "submariner-gateway-x4jm7" has established connections
      Jan 30 21:08:29.029: Found submariner endpoint for "submqe-aws": &v1.Endpoint{TypeMeta:v1.TypeMeta{Kind:"Endpoint", APIVersion:"submariner.io/v1"}, ObjectMeta:v1.ObjectMeta{Name:"submqe-aws-submariner-cable-submqe-aws-10-0-114-39", GenerateName:"", Namespace:"submariner-operator", SelfLink:"", UID:"c7c40b49-896c-4096-90fb-d423db0a7710", ResourceVersion:"690022", Generation:1, CreationTimestamp:time.Date(2025, time.January, 30, 20, 59, 57, 0, time.Local), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:"submariner-gateway", Operation:"Update", APIVersion:"submariner.io/v1", Time:time.Date(2025, time.January, 30, 20, 59, 57, 0, time.Local), FieldsType:"FieldsV1", FieldsV1:(*v1.FieldsV1)(0xc00072ed98), Subresource:""}}}, Spec:v1.EndpointSpec{ClusterID:"submqe-aws", CableName:"submariner-cable-submqe-aws-10-0-114-39", HealthCheckIP:"10.131.2.2", Hostname:"ip-10-0-114-39", Subnets:[]string{"172.30.0.0/16", "10.128.0.0/14"}, PrivateIP:"10.0.114.39", PublicIP:"3.16.152.78", NATEnabled:true, Backend:"libreswan", BackendConfig:map[string]string{"natt-discovery-port":"4490", "preferred-server":"false", "udp-port":"4505"}}}
      Jan 30 21:08:29.286: Performing fail-over to passive gateway
      Jan 30 21:08:29.543: Jan 30 21:08:29.542: INFO: ExecWithOptions &{Command:[sh -c echo 1 > /proc/sys/kernel/sysrq && echo b > /proc/sysrq-trigger] Namespace:submariner-operator PodName:submariner-gateway-x4jm7 ContainerName:submariner-gateway Stdin:<nil> CaptureStdout:false CaptureStderr:true PreserveWhitespace:false}
      
      Jan 30 21:09:31.017: Jan 30 21:09:31.017: INFO: Retrying due to error  Timeout occurred
      

      Basically, the ExecWithOptions call keeps timing out. I can see the pod restart and the node switch to NotReady. The gateway also fails over, but the test keeps retrying ExecWithOptions 5 times (with a 5-second delay between tries). I tried increasing the timeout to as much as 1 minute and it doesn't help.

      This is not an issue on OCP 4.16, so something changed in 4.17.

              rhn-support-jchhatba Janki Chhatbar
              rhn-support-pyadav Prachi Yadav
              Prachi Yadav Prachi Yadav