OpenShift Bugs / OCPBUGS-5036

Cloud Controller Managers do not react to changes in configuration leading to assorted errors

Details

    • ?
    • CLOUD Sprint 231, CLOUD Sprint 232
    • 2
    • Rejected
    • False
    • Previously, the Cluster Cloud Controller Manager Operator (CCCMO) did not react properly to changes in Cloud Controller Manager (CCM) configuration or credentials, which led to outdated configurations and left CCMs unable to contact the underlying cloud. Now, CCCMO restarts its operands (the cloud controllers) when related configs or credentials change.
    • Bug Fix

    Description

      cloud-controller-manager does not react to changes to infrastructure secrets (in the OpenStack case: clouds.yaml).
      As a consequence, if credentials are rotated and the old ones are invalidated, load balancer creation and deletion no longer succeed. Restarting the controller fixes the issue on a live cluster.

      Logs show that it couldn't find the application credentials:

      Dec 19 12:58:58.909: INFO: At 2022-12-19 12:53:58 +0000 UTC - event for udp-lb-default-svc: {service-controller } EnsuringLoadBalancer: Ensuring load balancer
      Dec 19 12:58:58.909: INFO: At 2022-12-19 12:53:58 +0000 UTC - event for udp-lb-default-svc: {service-controller } SyncLoadBalancerFailed: Error syncing load balancer: failed to ensure load balancer: failed to get subnet to create load balancer for service e2e-test-openstack-q9jnk/udp-lb-default-svc: Unable to re-authenticate: Expected HTTP response code [200 204 300] when accessing [GET https://compute.rdo.mtl2.vexxhost.net/v2.1/0693e2bb538c42b79a49fe6d2e61b0fc/servers/fbeb21b8-05f0-4734-914e-926b6a6225f1/os-interface], but got 401 instead
      {"error": {"code": 401, "title": "Unauthorized", "message": "The request you have made requires authentication."}}: Resource not found: [POST https://identity.rdo.mtl2.vexxhost.net/v3/auth/tokens], error message: {"error":{"code":404,"message":"Could not find Application Credential: 1b78233956b34c6cbe5e1c95445972a4.","title":"Not Found"}}

      OpenStack CI has been instrumented to restart CCM after credentials rotation, so that we silence this particular issue and avoid masking any other. That workaround must be reverted once this bug is fixed.
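
      One common way for an operator to notice that a config or secret has changed is to fingerprint its contents and compare that fingerprint with the one recorded when the operand was started. The sketch below illustrates the idea only; it is not taken from the CCCMO source, and the names config_fingerprint and needs_restart, as well as the sample clouds.yaml contents, are hypothetical.

```python
import hashlib
from typing import Mapping


def config_fingerprint(data: Mapping[str, bytes]) -> str:
    """Return a stable SHA-256 hex digest over the keys and values of a secret or config."""
    digest = hashlib.sha256()
    # Sort keys so the digest does not depend on mapping order.
    for key in sorted(data):
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        for field in (key.encode("utf-8"), data[key]):
            digest.update(len(field).to_bytes(8, "big"))
            digest.update(field)
    return digest.hexdigest()


def needs_restart(running_fingerprint: str, current_data: Mapping[str, bytes]) -> bool:
    """True when the stored data differs from what the operand was started with."""
    return running_fingerprint != config_fingerprint(current_data)


if __name__ == "__main__":
    # Hypothetical secret contents before and after a credentials rotation.
    old = {"clouds.yaml": b"application_credential_id: old-id\n"}
    new = {"clouds.yaml": b"application_credential_id: new-id\n"}
    started_with = config_fingerprint(old)
    print(needs_restart(started_with, old))  # False: nothing changed
    print(needs_restart(started_with, new))  # True: credentials were rotated
```

      In a real operator the fingerprint would typically be stored somewhere that forces a rollout when it changes, for example in the operand's pod template; that placement is an assumption here, not something this report states.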

            People

              dmoiseev Denis Moiseev (Inactive)
              maandre@redhat.com Martin André
              Itay Matza Itay Matza