  1. OpenShift Bugs
  2. OCPBUGS-42204

Cannot create NNCP using NMstate operator: "Bad TLS certificate chain" error.

    • Critical

      Description of problem:

      Today I discovered I cannot create a new NodeNetworkConfigurationPolicy:
      
      Error from server (InternalError): error when creating "NNCP.yaml": Internal error occurred: failed calling webhook "nodenetworkconfigurationpolicies-mutate.nmstate.io": failed to call webhook: Post "https://nmstate-webhook.openshift-nmstate.svc:443/nodenetworkconfigurationpolicies-mutate?timeout=10s": tls: failed to verify certificate: x509: certificate signed by unknown authority
      
      
      The nmstate-cert-manager pod emits the following log lines in a tight loop:
      
      {"level":"info","ts":"2024-09-19T10:15:20.343Z","logger":"certificate/Manager","msg":"Bad TLS certificate chain, forcing rotation: failed verifying TLS secret openshift-nmstate/nmstate-webhook: CA bundle and CA secret certificate are different","webhookType":"Mutating","webhookName":"nmstate"}
      {"level":"info","ts":"2024-09-19T10:15:20.343Z","logger":"certificate/Manager","msg":"Bad TLS certificate chain, forcing rotation: failed verifying TLS secret openshift-nmstate/nmstate-webhook: CA bundle and CA secret certificate are different","webhookType":"Mutating","webhookName":"nmstate"}
      {"level":"info","ts":"2024-09-19T10:15:20.343Z","logger":"certificate/Manager","msg":"elapsedToRotateCAFromLastDeadline {now: 2024-09-19 10:15:20.343431226 +0000 UTC m=+379.773562336, deadline: 2024-09-19 10:15:20.343431166 +0000 UTC m=+379.773562276, elapsedToRotate: -60ns}","webhookType":"Mutating","webhookName":"nmstate"}
      {"level":"info","ts":"2024-09-19T10:15:20.343Z","logger":"certificate/Manager","msg":"Certificate expiration is 2025-03-20 22:15:20 +0000 UTC, totalDuration is 1.5768001e+16, rotation deadline is 2025-03-19 22:15:20 +0000 UTC","webhookType":"Mutating","webhookName":"nmstate"}
      {"level":"info","ts":"2024-09-19T10:15:20.343Z","logger":"certificate/Manager","msg":"elapsedToRotateServicesFromLastDeadline{now: 2024-09-19 10:15:20.343622976 +0000 UTC m=+379.773754086, deadline: 2025-03-19 22:15:20 +0000 UTC, elapsedToRotate: 4355h59m59.656377024s}","webhookType":"Mutating","webhookName":"nmstate"}
      {"level":"info","ts":"2024-09-19T10:15:20.343Z","logger":"certificate/Manager.earliestElapsedForCACertsCleanup","msg":"{now: 2024-09-19 10:15:20.343654385 +0000 UTC m=+379.773785485, deadline: 2026-11-14 05:53:01 +0000 UTC, elapsedForCleanup: 18859h37m40.656345615s}","webhookType":"Mutating","webhookName":"nmstate"}
      {"level":"info","ts":"2024-09-19T10:15:20.343Z","logger":"certificate/Manager.earliestElapsedForServiceCertsCleanup","msg":"{now: 2024-09-19 10:15:20.343695882 +0000 UTC m=+379.773826983, deadline: 2025-03-20 22:15:20 +0000 UTC, elapsedForCleanup: 4379h59m59.656304118s}","webhookType":"Mutating","webhookName":"nmstate","service":{"name":"nmstate-webhook","namespace":"openshift-nmstate"}}
      {"level":"info","ts":"2024-09-19T10:15:20.343Z","logger":"certificate/Manager","msg":"Calculating RequeueAfter","webhookType":"Mutating","webhookName":"nmstate","elapsedToRotateCA":-0.00000006,"elapsedToRotateServices":15681599.656377023,"elapsedForCABundleCleanup":67894660.65634562,"elapsedForServiceCertsCleanup":15767999.656304117}
      {"level":"info","ts":"2024-09-19T10:15:20.343Z","logger":"certificate/Manager","msg":"Certificates will be Reconcile on 2024-09-19 10:15:20.34371045 +0000 UTC m=+379.773841560","webhookType":"Mutating","webhookName":"nmstate"}
      {"level":"info","ts":"2024-09-19T10:15:20.343Z","logger":"certificate/Manager.Reconcile","msg":"Reconciling Certificates","webhookType":"Mutating","webhookName":"nmstate","Request.Namespace":"openshift-nmstate","Request.Name":"nmstate-ca"}
      
      
      The nmstate-ca secret is being recreated several times per second, causing high CPU usage.
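
The "Calculating RequeueAfter" entry above hints at why the loop never settles: elapsedToRotateCA is already negative. The sketch below parses that entry and reports the smallest elapsed value. It assumes the manager requeues at the earliest of these four values; that is inferred from the log fields, not confirmed against the operator source. The names ELAPSED_KEYS and earliest_elapsed are hypothetical.

```python
import json

# Fields of the "Calculating RequeueAfter" entry that hold elapsed times in seconds.
ELAPSED_KEYS = (
    "elapsedToRotateCA",
    "elapsedToRotateServices",
    "elapsedForCABundleCleanup",
    "elapsedForServiceCertsCleanup",
)


def earliest_elapsed(log_line):
    """Return (field, seconds) for the smallest elapsed value in one log entry."""
    entry = json.loads(log_line)
    values = {key: float(entry[key]) for key in ELAPSED_KEYS if key in entry}
    if not values:
        raise ValueError("no elapsed fields in log entry")
    field = min(values, key=values.get)
    return field, values[field]


# The entry copied verbatim from the nmstate-cert-manager log in this report.
SAMPLE = '{"level":"info","ts":"2024-09-19T10:15:20.343Z","logger":"certificate/Manager","msg":"Calculating RequeueAfter","webhookType":"Mutating","webhookName":"nmstate","elapsedToRotateCA":-0.00000006,"elapsedToRotateServices":15681599.656377023,"elapsedForCABundleCleanup":67894660.65634562,"elapsedForServiceCertsCleanup":15767999.656304117}'

field, seconds = earliest_elapsed(SAMPLE)
# A value at or below zero means the deadline has already passed.
print(f"{field}: {seconds} s, already due: {seconds <= 0}")
```

If that assumption holds, a CA rotation deadline that is always in the past would explain both the immediate "Certificates will be Reconcile on" timestamp and the constant recreation of the nmstate-ca secret.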

      Version-Release number of selected component (if applicable):

      OCP 4.16.11
      Kubernetes nmstate operator 4.16.0-202409111235
      
      nmstate-cert-manager image:
      
      registry.redhat.io/openshift4/ose-kubernetes-nmstate-handler-rhel9@sha256:a33cc00576fc6a7b30d36c6b91d6524eef53700f8b4706e2fd5b83d61791511e
      nmstate-webhook image:
      
      registry.redhat.io/openshift4/ose-kubernetes-nmstate-handler-rhel9@sha256:00eb91c1ff12cbf5c1cf0dfbc5476ff2fc78ad24c62e1fb3f1352d9bf51cc980 

      How reproducible:

      Always, in my cluster.

      Steps to Reproduce:

       1. Try to create an NNCP.
       2. Check the logs of nmstate-cert-manager pod.
          

      Actual results:

      - Cannot create NNCP.
      - nmstate-ca secret being recreated at a high rate.
      
      

      Expected results:

      The NNCP is created and NMstate uses the correct CA certificate.
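
The log message "CA bundle and CA secret certificate are different" suggests a manual check: compare the certificate held in the openshift-nmstate/nmstate-ca secret with the caBundle of the "nmstate" mutating webhook configuration. A minimal sketch of that comparison follows, assuming both values have already been fetched and base64-decoded to PEM text. The helpers pem_certificates, ca_in_bundle and fake_pem are hypothetical, and the operator's own check may differ.

```python
import base64
import re
from typing import List

_PEM_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----", re.DOTALL
)


def pem_certificates(pem_text: str) -> List[bytes]:
    """Return the decoded bytes of every certificate block found in pem_text."""
    return [
        base64.b64decode("".join(body.split()))
        for body in _PEM_RE.findall(pem_text)
    ]


def ca_in_bundle(ca_pem: str, bundle_pem: str) -> bool:
    """True if ca_pem holds at least one certificate and all of them are in bundle_pem."""
    ca_certs = pem_certificates(ca_pem)
    bundle = set(pem_certificates(bundle_pem))
    return bool(ca_certs) and all(cert in bundle for cert in ca_certs)


def fake_pem(payload: bytes) -> str:
    """Wrap arbitrary bytes in PEM armor; a stand-in for real certificates in this demo."""
    body = base64.b64encode(payload).decode("ascii")
    return f"-----BEGIN CERTIFICATE-----\n{body}\n-----END CERTIFICATE-----\n"


# Placeholder data only: a bundle with two entries and a CA that is not among them.
bundle = fake_pem(b"old-ca") + fake_pem(b"current-ca")
print(ca_in_bundle(fake_pem(b"current-ca"), bundle))
print(ca_in_bundle(fake_pem(b"recreated-ca"), bundle))
```

The comparison is done on decoded bytes rather than on the PEM text so that differences in line wrapping or trailing whitespace do not produce a false mismatch.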

      Additional info:

      To fix the problem, I had to uninstall the operator, delete the namespace, and reinstall the operator.
      
      After that, the nmstate-cert-manager pod no longer exists.

      Upstream bug: https://github.com/nmstate/kubernetes-nmstate/issues/1264

            mkowalsk@redhat.com Mat Kowalski
            rhn-support-jortialc Juan Orti
            Qiong Wang Qiong Wang