OpenShift Bugs / OCPBUGS-50507

Intermittent authentication issues when accessing OpenShift registry

    • Important

      Description of problem:

          Trying to pull image-registry.openshift-image-registry.svc:5000/ci-op-x1cji03c/pipeline@sha256:164133ec776adf612b6d3aab55c6f33809b02d4235c9cbc722be8b83d865e474...
      error: error creating buildah builder: initializing source docker://image-registry.openshift-image-registry.svc:5000/ci-op-x1cji03c/pipeline@sha256:164133ec776adf612b6d3aab55c6f33809b02d4235c9cbc722be8b83d865e474: unable to retrieve auth token: invalid username/password: authentication required

       

      The `invalid username/password: authentication required` error occurs in multiple places, for example when a native OpenShift build pushes an image to the internal registry, or when a scheduler inspects an image before scheduling a pod.

      Investigation showed that, starting with OpenShift 4.16, the tokens are rotated automatically after some time. Watching the secret suggests that a faulty rotation mechanism is causing the issue.

       

      When the secret is updated, the old token stops working.

      Example token rotation:

      Old token:
        "exp": 1739197039, // February 10, 2025 9:17:19 AM GMT-05:00
        "iat": 1739193439, // February 10, 2025 8:17:19 AM GMT-05:00
        "nbf": 1739193439, // February 10, 2025 8:17:19 AM GMT-05:00
      
      New token:
        "exp": 1739200698, // February 10, 2025 10:18:18 AM GMT-05:00
        "iat": 1739197098, // February 10, 2025  9:18:18 AM GMT-05:00
        "nbf": 1739197098, // February 10, 2025  9:18:18 AM GMT-05:00

      We also noticed that the secret was updated about a minute after the old token expired: the new token's `iat` is 59 seconds later than the old token's `exp`. That leaves a window of time during which no valid token exists at all.
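The size of that window can be checked directly from the claims quoted above. The following is a minimal sketch, not part of any OpenShift tooling; the function names `coverage_gap_seconds` and `as_utc` are hypothetical, and the only data used is the `exp`, `iat`, and `nbf` values from this report. The result is a lower bound, since the secret may be written some time after the new token is issued.

```python
from datetime import datetime, timezone

# Claims copied from the example rotation in this report.
OLD_TOKEN = {"exp": 1739197039, "iat": 1739193439, "nbf": 1739193439}
NEW_TOKEN = {"exp": 1739200698, "iat": 1739197098, "nbf": 1739197098}


def coverage_gap_seconds(old: dict, new: dict) -> int:
    """Seconds between the old token's expiry and the new token's nbf.

    A positive value means no token was valid for at least that long;
    zero or a negative value means the validity periods touch or overlap.
    """
    return new["nbf"] - old["exp"]


def as_utc(ts: int) -> str:
    """Render a Unix timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()


if __name__ == "__main__":
    print("old token expired:   ", as_utc(OLD_TOKEN["exp"]))
    print("new token valid from:", as_utc(NEW_TOKEN["nbf"]))
    print("gap in seconds:      ", coverage_gap_seconds(OLD_TOKEN, NEW_TOKEN))
```

Both tokens have a lifetime of 3600 seconds (`exp` minus `iat`), so a rotation that only starts after expiry leaves a gap on every cycle.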

       

      Version-Release number of selected component (if applicable):

          OCP 4.16 and after

      How reproducible:

          Watch the builder-dockercfg-XXXX secret rotation process

      Steps to Reproduce:

          1. Watch the auto-created builder-dockercfg-XXXX secret.
          2. After the secret is updated, the old token can no longer authenticate to the registry.
        
          

      Actual results:

          There is a window of time during which no token is valid.

      Expected results:

          The token should remain valid for its specified lifetime, with no gap between the expiry of the old token and the availability of the new one.

       

      Additional info:

      The following search tracks active jobs that are hitting this issue: https://search.dptools.openshift.org/?search=+unable+to+retrieve+auth+token%3A+invalid+username%2Fpassword%3A+authentication+required&maxAge=12h&context=1&type=build-log&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job

              lusanche@redhat.com Luis Sanchez
              nmoraiti Nikolaos Moraitis
              XiuJuan Wang XiuJuan Wang