  1. Red Hat Advanced Cluster Security
  2. ROX-30781

After restoring from backup, the config-controller pod is in CrashLoopBackOff because additional-ca is not mounted automatically inside the pod

    • Rox Sprint 4.9H - Global, Rox Sprint 4.9I - Global, Rox Sprint 4.10B
    • Important

      USER PROBLEM
       The customer is running RHACS 4.8 with a custom CA and tried to restore Central from a backup, but the config-controller pod is in CrashLoopBackOff and reports the following error:

       

      pkg/client: 2025/09/01 02:08:53.433895 client.go:247: Warn: Initialization Error: could not exchange token: Failed to exchange token: rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"

      Reproduction Steps:

      1. Backup central-tls secret
      2. Create a database backup of Central using roxctl
      3. Uninstall the old Central instance
      4. Restore central-tls secret in the new namespace 
      5. Install the new Central instance
      6. Restore the Central database backup using roxctl
      7. Apply the custom CA using the command below:

      $ ./ca-setup.sh -f G4-All.pem
      W0901 11:18:10.543131 586060 helpers.go:692] --dry-run is deprecated and can be replaced with --dry-run=client.
      secret/additional-ca created
      secret/additional-ca labeled

      Two issues here:

      • The backup and restore process does not appear to preserve the custom CA, so the customer has to apply the custom CA manually in step 7.
      • The "authentication handshake failed: x509: certificate signed by unknown authority" error is reported in the config-controller pod.
         

      $  oc get pod scanner-69f7774b94-q6kwp -o yaml  | grep -i "additional-ca"
            name: additional-ca-volume
        - name: additional-ca-volume
            secretName: additional-ca

      $  oc get pod scanner-v4-indexer-6695d7dcfc-7ngpl -o yaml  | grep -i "additional-ca"
            name: additional-ca-volume
        - name: additional-ca-volume
            secretName: additional-ca

      $  oc get  pod scanner-v4-matcher-6686c5dd-mr2fc -o yaml  | grep -i "additional-ca"
            name: additional-ca-volume
        - name: additional-ca-volume
            secretName: additional-ca

      $  oc get pod central-7dfc67cc68-9xw46 -o yaml  | grep -i "additional-ca"
            name: additional-ca-volume
        - name: additional-ca-volume
            secretName: additional-ca

      $  oc get pod config-controller-57975f8584-xrssf -o yaml  | grep -i "additional-ca"   # <= reports nothing

      $  oc get pod config-controller-57975f8584-xrssf -o yaml  | grep -i "secret"
          - mountPath: /run/secrets/stackrox.io/certs/
          - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          secret:
            secretName: central-tls

      $ oc get pods central-7dfc67cc68-9xw46 -o yaml  | grep -i  -A1 "secret" | grep -i "central-tls"
            secretName: central-tls
            secretName: central-tls

      $ oc get pods central-7dfc67cc68-9xw46 -o yaml  | grep -i  -A1 "secret" | grep -i "central-default-tls"
            name: central-default-tls-cert-volume
        - name: central-default-tls-cert-volume
            secretName: central-default-tls-cert

       

      Based on the above output, the central, scanner, scanner-v4-indexer, and scanner-v4-matcher pods all mount the additional-ca secret, which holds the custom CA. The config-controller pod does not mount additional-ca, so it uses only the default StackRox CA. This prevents the config-controller pod from starting, and it reports "certificate signed by unknown authority".
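      The per-pod checks above could be repeated in one pass with a small helper. This is only a sketch for triage, not part of any fix: the function names `mounts_additional_ca` and `report_additional_ca` are made up for illustration, and the `stackrox` default namespace is an assumption because the ticket does not state which namespace was used. It only looks for a `secretName: additional-ca` line in the pod manifest, the same pattern the grep output above shows.

```shell
#!/bin/sh
# Sketch only: helper names and the default namespace are assumptions.

# Reads a pod manifest (YAML) on stdin and succeeds if a volume is backed
# by the additional-ca secret, i.e. a "secretName: additional-ca" line exists.
mounts_additional_ca() {
    grep -Eq '^[[:space:]]*secretName:[[:space:]]*additional-ca[[:space:]]*$'
}

# Prints one line per pod in the namespace, flagging pods whose manifest
# does not reference the additional-ca secret.
report_additional_ca() {
    namespace="${1:-stackrox}"   # assumed default; pass the real namespace
    for pod in $(oc get pods -n "$namespace" -o name); do
        if oc get "$pod" -n "$namespace" -o yaml | mounts_additional_ca; then
            echo "$pod: additional-ca mounted"
        else
            echo "$pod: additional-ca MISSING"
        fi
    done
}
```

      After sourcing the snippet, `report_additional_ca <namespace>` should list config-controller as MISSING in the affected cluster if the behaviour matches the output above.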

      KCS referred:
      https://access.redhat.com/solutions/6983933

      Slack discussion with the Engineering team:

      https://redhat-internal.slack.com/archives/C028JE84N59/p1756773367629339 

       

      CONDITIONS
      What conditions need to exist for a user to be affected? Is it everyone? Is it only those with a specific integration? Is it specific to someone with particular database content? etc.

      • pending

      ROOT CAUSE
      What is the root cause of the bug?

      • pending

      FIX
      How was the bug fixed (this is more important if a workaround was implemented rather than an actual fix)?

      • pending

              rh-ee-klape Kyle Lape
              sasakshi@redhat.com Sakshi sakshi
              ACS Core Workflows