Bug
Resolution: Unresolved
Minor
1.3.1
Important
After creating a new MonitoringStack, a race condition can occur in which the ClusterRoleBinding/RoleBinding is not applied in time and the Prometheus pod fails to start:
create Pod prometheus-devops-monitoring-0 in StatefulSet prometheus-devops-monitoring failed error: pods "prometheus-devops-monitoring-0" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider "pipelines-scc": Forbidden: not usable by user or serviceaccount, provider "splunk-otel-collector": Forbidden: not usable by user or serviceaccount, provider restricted-v2: .spec.securityContext.fsGroup: Invalid value: []int64{65534}: 65534 is not an allowed group, provider restricted-v2: .initContainers[0].runAsUser: Invalid value: 65534: must be in the ranges: [1004790000, 1004799999], provider restricted-v2: .containers[0].runAsUser: Invalid value: 65534: must be in the ranges: [1004790000, 1004799999], provider restricted-v2: .containers[1].runAsUser: Invalid value: 65534: must be in the ranges: [1004790000, 1004799999], provider restricted-v2: .containers[2].runAsUser: Invalid value: 65534: must be in the ranges: [1004790000, 1004799999], provider restricted-v2: .containers[3].runAsUser: Invalid value: 65534: must be in the ranges: [1004790000, 1004799999], provider "restricted": Forbidden: not usable by user or serviceaccount, provider "nonroot-v2": Forbidden: not usable by user or serviceaccount, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid-extimg-importer": Forbidden: not usable by user or serviceaccount, provider "elasticsearch-scc": Forbidden: not usable by user or serviceaccount, provider "logging-scc": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork-v2": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]
In the source code, the Prometheus CR is created at line 63, but the ClusterRoleBinding/RoleBinding that Prometheus requires is created only afterwards, at lines 74/75, as part of the Alertmanager deployment.
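One possible way to avoid the ordering problem is sketched below in Python. This is only an illustration of the idea, not the operator's actual code: the `Resource` type, the `APPLY_PRIORITY` table, the `order_for_apply` helper, and the resource names are all hypothetical. The sketch assumes the reconciler applies a flat list of resources in sequence, and reorders that list so RBAC bindings come before the Prometheus CR.

```python
from dataclasses import dataclass

# Hypothetical priorities (an assumption for this sketch): lower values are applied first.
APPLY_PRIORITY = {
    "ServiceAccount": 0,
    "ClusterRole": 1,
    "Role": 1,
    "ClusterRoleBinding": 2,
    "RoleBinding": 2,
}
DEFAULT_PRIORITY = 10  # everything else, e.g. the Prometheus and Alertmanager CRs


@dataclass(frozen=True)
class Resource:
    kind: str
    name: str


def order_for_apply(resources):
    """Return the resources with RBAC objects first.

    sorted() is stable, so resources with equal priority keep their input order.
    """
    return sorted(resources, key=lambda r: APPLY_PRIORITY.get(r.kind, DEFAULT_PRIORITY))


# Order resembling the one described in the report: Prometheus CR before its bindings.
# All names here are illustrative placeholders.
reported_order = [
    Resource("Prometheus", "devops-monitoring"),
    Resource("Alertmanager", "devops-monitoring"),
    Resource("ClusterRoleBinding", "example-prometheus-binding"),
    Resource("RoleBinding", "example-prometheus-binding"),
]

fixed_order = order_for_apply(reported_order)
print([r.kind for r in fixed_order])
```

With this ordering the bindings exist before the StatefulSet controller first tries to create the pod, which should close the window in which the SCC validation fails. Whether reordering alone is sufficient depends on how the operator applies and waits on each resource, which the report does not describe.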
A similar issue was reported in COO-1266. The problem resolves itself within a few seconds, but the resulting failure events trigger alerts for customers.