Uploaded image for project: 'FlightPath'
  1. FlightPath
  2. FLPATH-2796

Cost Management Operator CA bundle not mounted - TLS cert verification fails

XMLWordPrintable

    • False
    • Hide

      None

      Show
      None
    • False

      Description

      The setup-cost-mgmt-tls.sh script creates a comprehensive CA bundle ConfigMap (combined-ca-bundle) with all necessary certificates for self-signed cert environments, but fails to mount this ConfigMap in the Cost Management Operator deployment. This causes TLS certificate verification failures when the operator attempts to authenticate with Keycloak.

      Root Cause

      The script extracts and bundles CA certificates correctly but does not patch the ClusterServiceVersion (CSV) to:
      1. Mount the combined-ca-bundle ConfigMap as a volume
      2. Set the SSL_CERT_FILE environment variable for Go's HTTP client

      As a result, the operator's HTTP client cannot verify Keycloak's self-signed certificate.

      Error Message

      tls: failed to verify certificate: x509: certificate signed by unknown authority
      Post "https://keycloak-rhsso.apps.insights.qe.lab.redhat.com/auth/realms/kubernetes/protocol/openid-connect/token": tls: failed to verify certificate: x509: certificate signed by unknown authority
      

      Impact

      • Cost Management Operator cannot obtain JWT tokens from Keycloak
      • Operator fails to authenticate and upload metrics data
      • Metrics collection to ROS ingress is blocked

      The Fix

      The CA bundle ConfigMap exists and contains all required certificates - it just needs to be mounted in the operator pod.

      Workaround Script: scripts/patch-cost-mgmt-csv.sh

      h1.!/bin/bash
      h1. Minimal script to patch Cost Management Operator CSV with CA bundle
      
      set -e
      
      NAMESPACE="${1:-costmanagement-metrics-operator}"
      
      h1. Get CSV name
      CSV_NAME=$(oc get csv -n "$NAMESPACE" --no-headers | grep costmanagement-metrics-operator | awk '{print $1}' | head -1)
      
      h1. Create temp file
      TEMP=$(mktemp)
      
      h1. Patch CSV
      oc get csv "$CSV_NAME" -n "$NAMESPACE" -o json | \
        jq '.spec.install.spec.deployments[0].spec.template.spec.volumes += [{"name":"ca-bundle","configMap":{"name":"combined-ca-bundle"}}]' | \
        jq '.spec.install.spec.deployments[0].spec.template.spec.containers[0].volumeMounts += [{"name":"ca-bundle","mountPath":"/etc/pki/tls/certs/combined-ca-bundle.crt","subPath":"ca-bundle.crt","readOnly":true}]' | \
        jq '.spec.install.spec.deployments[0].spec.template.spec.containers[0].env += [{"name":"SSL_CERT_FILE","value":"/etc/pki/tls/certs/combined-ca-bundle.crt"}]' \
        > "$TEMP"
      
      h1. Apply
      oc apply -f "$TEMP"
      
      h1. Wait for rollout
      sleep 10
      oc rollout status deployment/costmanagement-metrics-operator -n "$NAMESPACE" --timeout=300s
      
      h1. Cleanup
      rm "$TEMP"
      
      echo "Done. CSV patched and deployment rolled out."
      

      Permanent Fix Required

      The setup-cost-mgmt-tls.sh script needs a new function patch_csv_with_ca_bundle() that:
      1. Retrieves the Cost Management Operator CSV
      2. Adds the CA bundle volume configuration
      3. Adds the volume mount to the manager container
      4. Sets the SSL_CERT_FILE environment variable
      5. Applies the updated CSV
      6. Waits for OLM to reconcile and deployment to roll out

      This should be called after update_ca_certificates in the main execution flow.

      Steps to Reproduce

        1. Run ./scripts/setup-cost-mgmt-tls.sh on an OpenShift cluster with self-signed certificates
          2. Check operator logs: oc logs -n costmanagement-metrics-operator deployment/costmanagement-metrics-operator
          3. Observe TLS certificate verification errors when connecting to Keycloak

      Verification After Fix

      After applying the workaround script:

      h1. Verify CA bundle is mounted
      POD=$(oc get pods -n costmanagement-metrics-operator -l app=costmanagement-metrics-operator -o name | head -1)
      oc exec -n costmanagement-metrics-operator $POD -- ls -lh /etc/pki/tls/certs/combined-ca-bundle.crt
      
      h1. Verify environment variable
      oc exec -n costmanagement-metrics-operator $POD -- env | grep SSL_CERT_FILE
      
      h1. Check logs - should see successful Keycloak connections
      oc logs -n costmanagement-metrics-operator deployment/costmanagement-metrics-operator | grep -i keycloak
      

      Environment

              Unassigned Unassigned
              chadcrum Chad Crum
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

                Created:
                Updated:
                Resolved: