
Cluster logging operator could not create SCC logging-scc

      Before this update, a potential race condition related to the creation of the SecurityContextConstraint (SCC) caused the Cluster Logging Operator to fail in creating the logging-scc. This issue did not always reproduce and could occur during the initialization of the Collector and LogFileMetricExporter components.
      With this update, the use of a non-cached client resolves the issue by ensuring the actual state of the object is retrieved directly from the API, bypassing any cached objects. This change eliminates unpredictable behavior, allowing the Cluster Logging Operator to reliably create the logging-scc.
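As a rough illustration of why reading through a cache can misreport an object, the following Python sketch models a cache-backed reader next to a direct reader. The names `FakeApiServer`, `CachedReader`, and `DirectReader` are hypothetical; they do not correspond to the operator's Go code or to any Kubernetes client library. The sketch only shows the general idea that a direct read reflects the actual state, while a cached read can lag behind it.

```python
class FakeApiServer:
    """Stand-in for the API server: the single source of truth (hypothetical)."""

    def __init__(self):
        self.objects = {}

    def create(self, name, spec):
        self.objects[name] = dict(spec)

    def get(self, name):
        return self.objects.get(name)


class CachedReader:
    """Reads from a local snapshot that is refreshed only when sync() is called."""

    def __init__(self, server):
        self._server = server
        self._snapshot = {}

    def sync(self):
        self._snapshot = {k: dict(v) for k, v in self._server.objects.items()}

    def get(self, name):
        return self._snapshot.get(name)


class DirectReader:
    """Bypasses the snapshot and always asks the server."""

    def __init__(self, server):
        self._server = server

    def get(self, name):
        return self._server.get(name)


def demo():
    server = FakeApiServer()
    cached = CachedReader(server)
    direct = DirectReader(server)
    cached.sync()  # snapshot is taken before the object exists
    server.create("logging-scc", {"kind": "SecurityContextConstraints"})
    # The cached reader has not synced since the create, so it misses the object.
    return cached.get("logging-scc"), direct.get("logging-scc")
```

In this model, `demo()` returns nothing for the cached read and the created object for the direct read, which is the kind of discrepancy the release note attributes to cached objects.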
    • Bug Fix
    • Log Collection - Sprint 263, Log Collection - Sprint 264, Log Collection - Sprint 265
    • Important

      Description of problem:

      The operator is deployed through an automated process that creates the openshift-logging namespace, the OperatorGroup, and the Subscription. The operator pod starts up fine and gets reconciled.

      After creating the ClusterLogForwarder (CLF), collector pods are not deployed and the cluster-logging-operator logs the following errors in a loop:

      2024-11-18T11:21:51.500044647Z {"_ts":"2024-11-18T11:21:51.499998528Z","_level":"0","_component":"cluster-logging-operator","_message":"reconcile.SecurityContextConstraints","_error":{"msg":"failed to get /logging-scc SecurityContextConstraints: failed to get restmapping: no matches for kind \"SecurityContextConstraints\" in group \"security.openshift.io\""}}
      2024-11-18T11:21:51.500044647Z {"_ts":"2024-11-18T11:21:51.500032891Z","_level":"0","_component":"cluster-logging-operator_controller.observability","_message":"reconcile error","_error":{"msg":"failed to get /logging-scc SecurityContextConstraints: failed to get restmapping: no matches for kind \"SecurityContextConstraints\" in group \"security.openshift.io\""}}
      2024-11-18T11:21:51.506128772Z {"_ts":"2024-11-18T11:21:51.506070113Z","_level":"0","_component":"cluster-logging-operator","_message":"Reconciler error","ClusterLogForwarder":{"name":"instance","namespace":"openshift-logging"},"_error":{"msg":"failed to get /logging-scc SecurityContextConstraints: failed to get restmapping: no matches for kind \"SecurityContextConstraints\" in group \"security.openshift.io\""},"controller":"clusterlogforwarder","controllerGroup":"observability.openshift.io","controllerKind":"ClusterLogForwarder","name":"instance","namespace":"openshift-logging","reconcileID":"acc50de9-c7d4-4b03-a497-6948152a5853"}
      2024-11-18T11:21:51.511723237Z {"_ts":"2024-11-18T11:21:51.511689402Z","_level":"0","_component":"cluster-logging-operator","_message":"reconcile.SecurityContextConstraints","_error":{"msg":"failed to get /logging-scc SecurityContextConstraints: failed to get restmapping: no matches for kind \"SecurityContextConstraints\" in group \"security.openshift.io\""}} 

      The SCC logging-scc is not created. Restarting the CLO pod works around the problem.
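To check a captured operator log for this specific failure, the lines can be filtered by their `_error.msg` field. The sketch below is a minimal Python example that assumes each line has the `<timestamp> <json>` shape shown above; the function names (`parse_line`, `is_restmapping_error`, `scan`) are illustrative only, and `SAMPLE` is one of the log lines from this report.

```python
import json

# One log line copied from this report (raw string keeps the JSON escapes intact).
SAMPLE = r'2024-11-18T11:21:51.500044647Z {"_ts":"2024-11-18T11:21:51.499998528Z","_level":"0","_component":"cluster-logging-operator","_message":"reconcile.SecurityContextConstraints","_error":{"msg":"failed to get /logging-scc SecurityContextConstraints: failed to get restmapping: no matches for kind \"SecurityContextConstraints\" in group \"security.openshift.io\""}}'


def parse_line(line):
    """Split '<timestamp> <json>' and return (timestamp, record), or None if malformed."""
    ts, sep, payload = line.strip().partition(" ")
    if not sep:
        return None
    try:
        record = json.loads(payload)
    except json.JSONDecodeError:
        return None
    return (ts, record) if isinstance(record, dict) else None


def is_restmapping_error(record):
    """True when the record carries the SCC restmapping failure seen in this report."""
    err = record.get("_error")
    msg = err.get("msg", "") if isinstance(err, dict) else ""
    return "failed to get restmapping" in msg and "SecurityContextConstraints" in msg


def scan(lines):
    """Return (timestamp, component) for every line that matches the failure."""
    hits = []
    for line in lines:
        parsed = parse_line(line)
        if parsed and is_restmapping_error(parsed[1]):
            hits.append((parsed[0], parsed[1].get("_component")))
    return hits
```

Lines that are not valid JSON after the timestamp, including truncated ones, are skipped rather than raising an error.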

      Version-Release number of selected component (if applicable):

      Red Hat OpenShift Logging 6.1

      Red Hat OpenShift Container Platform 4.16.15

      How reproducible:

      Intermittent; the issue does not reproduce every time.

      Steps to Reproduce:

      NA

      Actual results:

      After creating the ClusterLogForwarder, collector pods are not deployed and the cluster-logging-operator pod logs the following errors:

      2024-11-18T11:29:05.093714682Z {"_ts":"2024-11-18T11:29:05.093704479Z","_level":"0","_component":"cluster-logging-operator_controller.observability","_message":"reconcile error","_error":{"msg":"failed to get /logging-scc SecurityContextConstraints: failed to get restmapping: no matches for kind \"SecurityContextConstraints\" in group \"security.openshift.io\""}}
      2024-11-18T11:29:05.099206192Z {"_ts":"2024-11-18T11:29:05.099174475Z","_level":"0","_component":"cluster-logging-operator","_message":"Reconciler error","ClusterLogForwarder":{"name":"instance","namespace":"openshift-logging"},"_error":{"msg":"failed to get /logging-scc SecurityContextConstraints: failed to get restmapping: no matches for kind \"SecurityContextConstraints\" in group \"security.openshift.io\""},"controller":"clusterlogforwarder","controllerGroup":"observability.openshift.io","controllerKind":"ClusterLogForwarder","name":"instance","namespace":"openshift-logging","reconcileID":"b453b18d-a2b3-4f64-9372-6e61daf48b2e" 

      Expected results:

      The operator should create the SCC automatically and keep checking whether it exists. If it does not exist, the operator should create the logging-scc SCC resource.

      Additional info:

      Restarting the CLO pod fixes the creation of SCC logging-scc.

      Important note from the comments below: this has been reproduced on two customer clusters so far, and both are single-node OpenShift (SNO).

            [LOG-6418] Cluster logging operator could not create SCC logging-scc

            Errata Tool added a comment -

            Since the problem described in this issue should be resolved in a recent advisory, it has been closed.

            For information on the advisory (Logging for Red Hat OpenShift - 6.1.2), and where to find the updated files, follow the link below.

            If the solution does not work for you, open a new bug report.
            https://access.redhat.com/errata/RHBA-2025:1229


            Anping Li added a comment -

            Verified using cluster-logging-rhel9-operator/images/v6.1.2-5. I can reproduce this issue on a single-node cluster with multiple CLFs deployed. With this fix, you may still see the message "failed to get /logging-scc SecurityContextConstraints:" after the node is restarted. After a while, CLO reconciles the CLF and the error disappears.


            GitLab CEE Bot added a comment -

            CPaaS Service Account mentioned this issue in merge request !4885 of openshift-logging / Log Collection Midstream on branch openshift-logging-6.2-rhel-9_upstream_ec981c5bca05d0ea3a7e0d9773535d33:

            Updated US source to: 05f73c7 enhance sysctl test to handle multiple expected outputs

            GitLab CEE Bot added a comment -

            CPaaS Service Account mentioned this issue in merge request !4877 of openshift-logging / Log Collection Midstream on branch openshift-logging-6.2-rhel-9_upstream_bda3d7ecb81de41d0c4608d5037d8fed:

            Updated US source to: 850346b LOG-6508: Emit stream labels following OTel Semantic Conventions as a forward compatibility measure

            GitLab CEE Bot added a comment -

            CPaaS Service Account mentioned this issue in merge request !4857 of openshift-logging / Log Collection Midstream on branch openshift-logging-6.2-rhel-9_upstream_cdc12e5c66db49a5ff524c338ced0a66:

            Updated US source to: b8085fb Use uncached reader for reconciliation SCC to avoid unexpected behavior

            GitLab CEE Bot added a comment -

            CPaaS Service Account mentioned this issue in merge request !4853 of openshift-logging / Log Collection Midstream on branch openshift-logging-6.1-rhel-9_upstream_366814df828080c62c2637ff29759aa7:

            Updated US source to: 6e959d6 Use uncached reader for reconciliation SCC to avoid unexpected behavior

            Casey Hartman added a comment -

            Setting appropriate fix version to 6.2.z. This can be included in the 6.2.0 release once verified.

            Jeffrey Cantrill added a comment -

            This issue requires Release Notes Text. Please modify the Release Note Text or set the Release Note Type to "Release Note Not Required".

            Casey Hartman added a comment -

            Including notes and findings from the linked Slack conversation:

            The restmapper is responsible for internal requests to the api-server for resources:

            • In the controller, the logging-scc is only re-created during reconcile if a "not found" error occurs. The restmapping error is not of that type, which is why the controller is not re-creating it; the SCC already exists.
            • Modifying the error handling should allow us to resolve the issue, but we should also investigate and acknowledge the root cause.

            From my limited knowledge, this error is a result of request caching. The controller creates the logging-scc, then loops again and runs reconcile. During that reconcile, the request uses a cached copy of the client in which the logging-scc has not been mapped, so it throws an error of type "unknown" rather than "not found".
            I'm not sure how to fix the caching, but it seems there is a resolution approach here as well.
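The distinction drawn in this comment, between a "not found" error that triggers creation and any other error that does not, can be sketched as follows. This is a simplified Python model and not the operator's Go code: the exception classes `NotFoundError` and `NoMatchError`, the `reconcile_scc` function, and the returned action strings are all hypothetical names chosen for illustration.

```python
class NotFoundError(Exception):
    """The API answered that the object does not exist (hypothetical type)."""


class NoMatchError(Exception):
    """The kind could not be mapped to a resource (hypothetical type)."""


def reconcile_scc(get, create, name="logging-scc"):
    """Run one reconcile pass and return the action taken.

    get(name) returns the object or raises; create(name) creates it.
    """
    try:
        get(name)
        return "exists"
    except NotFoundError:
        # Only this error type leads to creation.
        create(name)
        return "created"
    except NoMatchError:
        # Not a "not found" error: creation is skipped and the pass must be retried.
        return "requeue"


def demo():
    store = set()

    def get_ok(name):
        if name not in store:
            raise NotFoundError(name)
        return name

    def get_unmapped(name):
        raise NoMatchError('no matches for kind "SecurityContextConstraints"')

    first = reconcile_scc(get_ok, store.add)         # object missing
    second = reconcile_scc(get_ok, store.add)        # object now present
    third = reconcile_scc(get_unmapped, store.add)   # mapping failure
    return first, second, third
```

In this model the mapping failure is surfaced for retry instead of being treated as a missing object. In the report, the equivalent error recurred on every reconcile until the operator pod was restarted.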

              vparfono Vitalii Parfonov
              rhn-support-dgautam Dhruv Gautam
              Anping Li Anping Li