OpenShift Bugs / OCPBUGS-3492

Long running API timeouts with OpenShift route on s390x

Details

    • Important
    • Rejected
    • False

    Description

      Description of problem:

      In Cloud Pak for Data, we have a long-running API that times out on OCP version 4.10.25 (architecture: zLinux / s390x) when it is called through the OCP route. The request times out after 60 seconds whether the route is passthrough or reencrypt. We extended the OCP route timeout to 900 seconds by adding an annotation to the route (haproxy.router.openshift.io/timeout: 900s), and we verified that the backend NGINX processed the API request successfully. The same API works on x86 clusters but not on s390x clusters. It used to work with OCP version 4.8.
      
      We enabled HAProxy logging to look for anything abnormal, but the logs do not indicate why the request is timing out.
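
      For reference, a minimal sketch of how HAProxy access logging can be enabled on the default Ingress Controller. The report does not say which method was used, so the container log destination shown here is an assumption:

      ```yaml
      # Assumption: access logs are sent to a sidecar container in the router pods.
      # The report does not state which logging destination was actually configured.
      apiVersion: operator.openshift.io/v1
      kind: IngressController
      metadata:
        name: default
        namespace: openshift-ingress-operator
      spec:
        logging:
          access:
            destination:
              type: Container
      ```

      With this destination type, the access logs should be readable from the `logs` container of the router pods in the `openshift-ingress` namespace.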
      
      Could you please let us know what might be going wrong? Is there another timeout, apart from the route annotation, that is taking precedence? Or is this an issue specific to zLinux that needs additional timeout configuration somewhere else? Please note that we also tried extending `spec.tuningOptions.serverTimeout` and `spec.tuningOptions.clientTimeout` in ingresscontroller/default, but they had no effect.
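
      For reference, a sketch of what these overrides look like in the IngressController spec. The 900s values are an assumption chosen to match the route annotation; the report does not state the values that were actually tried:

      ```yaml
      # Assumption: 900s mirrors the route annotation; the values actually tried are not stated.
      apiVersion: operator.openshift.io/v1
      kind: IngressController
      metadata:
        name: default
        namespace: openshift-ingress-operator
      spec:
        tuningOptions:
          clientTimeout: 900s
          serverTimeout: 900s
      ```

      The `oc describe` output below shows an empty `Tuning Options` section, so it may be worth confirming that the overrides were still in place when that output was captured.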
      
      
      Here is the OCP route definition:
      
      ```
      apiVersion: route.openshift.io/v1
      kind: Route
      metadata:
        annotations:
          haproxy.router.openshift.io/balance: roundrobin
          haproxy.router.openshift.io/timeout: 900s
          openshift.io/host.generated: "true"
        creationTimestamp: "2022-11-07T10:52:59Z"
        name: cpd
        namespace: zen
      spec:
        host: cpd-zen.apps.ocp410cpd453nodes9.cp.fyre.ibm.com
        port:
          targetPort: ibm-nginx-https-port
        tls:
          insecureEdgeTerminationPolicy: Redirect
          termination: passthrough
        to:
          kind: Service
          name: ibm-nginx-svc
          weight: 100
        wildcardPolicy: None
      status:
        ingress:
        - conditions:
          - lastTransitionTime: "2022-11-07T10:52:59Z"
            status: "True"
            type: Admitted
          host: cpd-zen.apps.ocp410cpd453nodes9.cp.fyre.ibm.com
          routerCanonicalHostname: router-default.apps.ocp410cpd453nodes9.cp.fyre.ibm.com
          routerName: default
          wildcardPolicy: None
      ```
      
      The `default` Ingress Controller definition is below:
      
      ```
      Name:         default
      Namespace:    openshift-ingress-operator
      Labels:       <none>
      Annotations:  <none>
      API Version:  operator.openshift.io/v1
      Kind:         IngressController
      Spec:
        Client TLS:
          Client CA:
            Name:                     
          Client Certificate Policy:  
        Http Compression:
        Http Empty Requests Policy:  Respond
        Http Error Code Pages:
          Name:    
        Replicas:  2
        Tuning Options:
        Unsupported Config Overrides:  <nil>
      Status:
        Available Replicas:  2
        Conditions:
          Last Transition Time:  2022-11-07T07:20:18Z
          Reason:                Valid
          Status:                True
          Type:                  Admitted
          Last Transition Time:  2022-11-07T07:28:19Z
          Status:                True
          Type:                  PodsScheduled
          Last Transition Time:  2022-11-07T07:28:50Z
          Message:               The deployment has Available status condition set to True
          Reason:                DeploymentAvailable
          Status:                True
          Type:                  DeploymentAvailable
          Last Transition Time:  2022-11-07T07:28:50Z
          Message:               Minimum replicas requirement is met
          Reason:                DeploymentMinimumReplicasMet
          Status:                True
          Type:                  DeploymentReplicasMinAvailable
          Last Transition Time:  2022-11-10T12:35:31Z
          Message:               All replicas are available
          Reason:                DeploymentReplicasAvailable
          Status:                True
          Type:                  DeploymentReplicasAllAvailable
          Last Transition Time:  2022-11-07T07:20:19Z
          Message:               The configured endpoint publishing strategy does not include a managed load balancer
          Reason:                EndpointPublishingStrategyExcludesManagedLoadBalancer
          Status:                False
          Type:                  LoadBalancerManaged
          Last Transition Time:  2022-11-07T07:20:19Z
          Message:               No DNS zones are defined in the cluster dns config.
          Reason:                NoDNSZones
          Status:                False
          Type:                  DNSManaged
          Last Transition Time:  2022-11-07T07:28:50Z
          Status:                True
          Type:                  Available
          Last Transition Time:  2022-11-07T07:20:19Z
          Status:                False
          Type:                  Progressing
          Last Transition Time:  2022-11-07T07:28:50Z
          Status:                False
          Type:                  Degraded
          Last Transition Time:  2022-11-07T07:20:19Z
          Message:               IngressController is upgradeable.
          Reason:                Upgradeable
          Status:                True
          Type:                  Upgradeable
          Last Transition Time:  2022-11-07T07:29:18Z
          Message:               Canary route checks for the default ingress controller are successful
          Reason:                CanaryChecksSucceeding
          Status:                True
          Type:                  CanaryChecksSucceeding
        Domain:                  apps.ocp410cpd453nodes9.cp.fyre.ibm.com
        Endpoint Publishing Strategy:
          Host Network:
            Protocol:         TCP
          Type:               HostNetwork
        Observed Generation:  4
        Selector:             ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default
        Tls Profile:
          Ciphers:
            ECDHE-ECDSA-AES128-GCM-SHA256
            ECDHE-RSA-AES128-GCM-SHA256
            ECDHE-ECDSA-AES256-GCM-SHA384
            ECDHE-RSA-AES256-GCM-SHA384
            ECDHE-ECDSA-CHACHA20-POLY1305
            ECDHE-RSA-CHACHA20-POLY1305
            DHE-RSA-AES128-GCM-SHA256
            DHE-RSA-AES256-GCM-SHA384
            TLS_AES_128_GCM_SHA256
            TLS_AES_256_GCM_SHA384
            TLS_CHACHA20_POLY1305_SHA256
          Min TLS Version:  VersionTLS12
      Events:
        Type    Reason    Age                   From                Message
        ----    ------    ----                  ----                -------
        Normal  Admitted  131m (x3 over 3h16m)  ingress_controller  ingresscontroller passed validation
      ``` 
      
      
       

      Version-Release number of selected component (if applicable):

      4.10.25 s390x

      How reproducible:

       

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

       

      Expected results:

       

      Additional info:

       

      Attachments

        Activity

          People

            mmasters1@redhat.com Miciah Masters
            venkataramana.m@in.ibm.com venkata madugundu
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: