[OCPBUGS-49845] [oauth-apiserver] Etcd client can unsafely retry timeouts on mutating requests

    • Bug
    • Resolution: Unresolved
    • Undefined
    • None
    • 4.18.0, 4.19.0
    • oauth-apiserver
    • Yes
    • Rejected
    • False

      Description of problem:

      Our carry patch, which is intended to retry retriable requests that fail because of a leader change, retries any etcd error with code "Unavailable": https://github.com/openshift/kubernetes/blob/4b2db1ec33faa3ffc305e5ffa7376908cc955370/staging/src/k8s.io/apiserver/pkg/storage/etcd3/etcd3retry/retry_etcdclient.go#L135-L145. That code also covers reasons such as "timeout", and the patch does not distinguish between reads and writes. As a result, a "timeout" error on a write request may be retried, even though a timeout observed by the client does not indicate that the write was not persisted.
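
      The distinction described above can be sketched roughly as follows. This is an illustration only, not the carry patch or its eventual fix: the names opKind, unavailableErr, and shouldRetry are hypothetical, the reason strings are assumed to match the messages etcd returns, and treating a leader change as retriable for writes simply follows the stated intent of the patch.

```go
package main

import "errors"

// opKind says whether a request only reads from etcd or may mutate it.
// Hypothetical type introduced for this sketch.
type opKind int

const (
	opRead opKind = iota
	opWrite
)

// Reason strings assumed for this sketch.
const (
	reasonLeaderChanged = "etcdserver: leader changed"
	reasonTimeout       = "etcdserver: request timed out"
)

// unavailableErr is a hypothetical stand-in for an etcd error that carries
// the code "Unavailable" together with a more specific reason.
type unavailableErr struct{ reason string }

func (e unavailableErr) Error() string { return e.reason }

// shouldRetry reports whether a failed request may be retried.
// It looks at the reason behind "Unavailable" instead of the code alone,
// and it refuses to retry a timed-out write, because the write may already
// have been persisted.
func shouldRetry(err error, op opKind) bool {
	var ue unavailableErr
	if !errors.As(err, &ue) {
		return false // not an "Unavailable" error at all
	}
	switch ue.reason {
	case reasonLeaderChanged:
		// Assumed retriable for reads and writes, per the patch's intent.
		return true
	case reasonTimeout:
		// Outcome of a write is unknown after a timeout; only reads are retried.
		return op == opRead
	default:
		return false
	}
}
```

      With this shape, a call such as shouldRetry(err, opWrite) returns false for a timeout, so the caller surfaces the error instead of possibly applying the mutation twice.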



            Ke Wang added a comment - - edited

            I checked the logs of CI job periodic-ci-openshift-release-master-ci-4.19-e2e-azure-ovn-serial and found the following entry:

            2025-02-09 13:36:45 I0209 13:36:45.047661       1 retry_etcdclient.go:203] etcd retry - counter: 1, lastErrLabel: LeaderChanged lastError: etcdserver: leader changed, error: <nil>
            Fields
            app	openshift-oauth-apiserver
            container	oauth-apiserver
            detected_level	error
            host	ci-op-9hz3qq32-6593d-wbdpc-master-1
            invoker	openshift-internal-ci/periodic-ci-openshift-release-master-ci-4.19-e2e-azure-ovn-serial/1888565704416825344
            namespace	openshift-oauth-apiserver
            pod	apiserver-7874d654d8-qjnh2
            type	pod
            

            When the oauth-apiserver client retried, the log showed the specific reason for the error (LeaderChanged), not only the etcd error code "Unavailable". This works as expected, so I am moving the bug to VERIFIED.


            Ke Wang added a comment -

            The associated PR has been included starting from 4.18.0-0.nightly-2025-02-06-003812


            OpenShift Jira Bot added a comment -

            Hi bluddy,

            Bugs should not be moved to Verified without first providing a Release Note Type ("Bug Fix" or "No Doc Update"), and for type "Bug Fix" the Release Note Text must also be provided. Please populate the necessary fields before moving the Bug to Verified.

              bluddy Ben Luddy
              bluddy Ben Luddy
              Ke Wang Ke Wang