  1. Red Hat Advanced Cluster Security
  2. ROX-33084

TLS failure when doing local image scanning in co-located deployments during CA rotation

    • Quality / Stability / Reliability

      USER PROBLEM
      What is the user experiencing as a result of the bug? Include steps to reproduce.

      Sensor fails to scan OCP internal registry images with x509 certificate errors:
      transport: authentication handshake failed: verifying Scanner V4 Indexer certificate errors: [x509: certificate signed by unknown authority, x509: certificate signed by unknown authority]
      Errors appear in both Sensor and Admission Control logs when pods using image-registry.openshift-image-registry.svc:5000/... images are deployed.

      CONDITIONS
      What conditions need to exist for a user to be affected? Is it everyone? Is it only those with a specific integration? Is it specific to someone with particular database content? etc.

      • Central and SecuredCluster deployed in the same namespace
      • CA rotation has occurred (~3 years after initial deployment)
      • SecuredCluster has refreshed its certificates (getting the secondary CA)
      • User deploys pods with OCP internal registry images

      ROOT CAUSE
      What is the root cause of the bug?

      During CA rotation, Central creates a secondary CA. When Sensor refreshes its certs, it receives the secondary CA. However, Scanner V4 Indexer (a Central component) still uses the primary CA. In co-located deployments, Sensor connects directly to Central's Scanner V4 Indexer for local image scanning.

      The mTLS handshake fails because:

      • Sensor's certificate is signed by the secondary CA, but Scanner V4 Indexer trusts only the primary CA
      • Scanner V4 Indexer's certificate is signed by the primary CA, but Sensor trusts only the secondary CA

      The CA rotation design didn't properly account for Sensor ↔ Scanner V4 Indexer communication in co-located deployments.
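The trust mismatch described above can be reproduced in isolation with Go's standard library. This is a minimal sketch, not ACS code: the helpers newCA, newLeaf, verifyAgainst and isUnknownAuthority are hypothetical names, and the throwaway ECDSA certificates only stand in for the real primary and secondary CAs. It shows that a leaf signed by one CA fails verification against a pool holding only the other CA, and passes once both CAs are trusted.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// newCA creates a throwaway self-signed CA (hypothetical helper, illustration only).
func newCA(cn string) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// newLeaf issues a leaf certificate signed by the given CA.
func newLeaf(cn string, ca *x509.Certificate, caKey *ecdsa.PrivateKey) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: cn},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, ca, &key.PublicKey, caKey)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// verifyAgainst checks the leaf against a pool holding only the given CAs.
func verifyAgainst(leaf *x509.Certificate, trusted ...*x509.Certificate) error {
	pool := x509.NewCertPool()
	for _, ca := range trusted {
		pool.AddCert(ca)
	}
	_, err := leaf.Verify(x509.VerifyOptions{
		Roots:     pool,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageAny},
	})
	return err
}

// isUnknownAuthority reports whether err is the "unknown authority" failure.
func isUnknownAuthority(err error) bool {
	var ua x509.UnknownAuthorityError
	return errors.As(err, &ua)
}

func main() {
	primary, _, err := newCA("primary CA")
	if err != nil {
		panic(err)
	}
	secondary, secondaryKey, err := newCA("secondary CA")
	if err != nil {
		panic(err)
	}
	// Stand-in for Sensor's refreshed certificate, signed by the secondary CA.
	sensorLeaf, err := newLeaf("sensor", secondary, secondaryKey)
	if err != nil {
		panic(err)
	}
	fmt.Println("primary only:", verifyAgainst(sensorLeaf, primary))
	fmt.Println("both CAs:    ", verifyAgainst(sensorLeaf, primary, secondary))
}
```

The same asymmetry applies in the other direction: a certificate signed by the primary CA would fail against a pool that holds only the secondary CA.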

      FIX
      How was the bug fixed (this is more important if a workaround was implemented rather than an actual fix)?

      Proposed fix:
      Central operator: Write ca-secondary.pem to all Central component TLS secrets (not just central-tls), so that Scanner V4 Indexer trusts both CAs.
      Sensor: Sensor already learns Central's CAs via TLSChallenge, so add both the primary and the secondary CA to the trust pool when creating a Scanner V4 client connection.

              rh-ee-vbologa Vlad Bologa
              ACS Install