OpenShift Bugs / OCPBUGS-9383

periodic latency spikes of inter-pod communication in OVN-kubernetes


    • Incidents & Support
    • Important
    • None
    • Unspecified
    • 5/5: Still under investigation by the compute and networking team in relation to a case.
      1/19: Added to the Telco 4.13 gating list per YQ.
      12/20: Removed from the 4.12 gating list. Data collection for the 4.8 issue is still pending, and there is currently no indication that this is an issue in 4.12.
      12/16: Pending next step. Is this pending data collection by the customer, and is this bug currently assigned to the correct component?
      12/15: Pending next steps per the latest comment.
      11/30: Changed Telco rank/bucket to 2; YJ is checking into this one further (current status is the same as 11/28).
      11/28: Waiting on data from the reporter/Support Engineer.
      11/21: Yellow. Waiting on data from the reporter.
      11/16: Added to the Telco-Grade OCP 4.12 gating list.
      Rel Note for Telco: Not Required (4.12), for the same reasons as in the 12/20 comment: no indication that there is an issue in 4.12.
    • None
    • Rejected
    • None
    • Customer Escalated

      Description of problem:

      • gRPC communication between application pods usually takes several tens of microseconds, but periodically takes a few milliseconds.
      • These latency spikes occur approximately every 5 seconds.
      • The cluster uses OVN-Kubernetes.
      • The customer observes similar behavior in Redis client-server communication, so the problem does not appear to be application-specific or gRPC-specific.
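
      The symptom above (a low baseline with spikes at a roughly fixed interval) can be checked in any latency trace with a short script. The following is a minimal sketch, not the method used by the customer or by support: the function names, the 1000 µs threshold, and the synthetic trace are all assumptions made for illustration, and the trace is generated data, not measurements from this case.

```python
from statistics import median


def find_spikes(samples, threshold_us):
    """Return timestamps (seconds) of samples whose latency exceeds threshold_us.

    samples is an iterable of (timestamp_s, latency_us) pairs.
    """
    return [t for t, latency_us in samples if latency_us > threshold_us]


def spike_period(spike_times, min_gap_s=1.0):
    """Estimate the spike period as the median gap between spike bursts.

    A spike that falls within min_gap_s of the start of the current burst
    is counted as part of that burst. Returns None if fewer than two
    bursts are present.
    """
    bursts = []
    for t in sorted(spike_times):
        if not bursts or t - bursts[-1] >= min_gap_s:
            bursts.append(t)
    gaps = [later - earlier for earlier, later in zip(bursts, bursts[1:])]
    return median(gaps) if gaps else None


# Synthetic trace (hypothetical values): one sample every 0.1 s for 30 s,
# 50 us baseline latency, 3000 us spike at every 50th sample (every 5 s).
synthetic = [(i / 10, 3000.0 if i % 50 == 0 else 50.0) for i in range(300)]

spikes = find_spikes(synthetic, threshold_us=1000.0)
print("spike timestamps (s):", spikes)
print("estimated period (s):", spike_period(spikes))
```

      To apply this to real data, replace the synthetic list with (timestamp, latency) pairs taken from the application's own request timing, and choose a threshold well above the observed baseline.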

      Version-Release number of selected component (if applicable):
      4.8

      How reproducible:
      100% at the customer's site

      Steps to Reproduce:

      Actual results:
      Latency spikes occur approximately every 5 seconds

      Expected results:
      No latency spikes

      Additional info:

              Martin Sivak (msivak@redhat.com)
              Yuki Okada (rhn-support-yuokada)
              Gowrishankar Rajaiyan