Bug
Resolution: Done
Normal
Logging 5.3.5
False
False
NEW
VERIFIED
Logging (Core) - Sprint 216, Logging (Core) - Sprint 217
Description of problem:
When only CLO is deployed, Prometheus keeps reporting the errors below:
ts=2022-03-01T05:30:18.294Z caller=log.go:168 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:449: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"openshift-logging\""
ts=2022-03-01T05:30:18.296Z caller=log.go:168 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:447: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"openshift-logging\""
ts=2022-03-01T05:30:18.301Z caller=log.go:168 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:448: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"openshift-logging\""
I checked the roles and clusterroles created when deploying CLO; none of them grants these permissions to system:serviceaccount:openshift-monitoring:prometheus-k8s.
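The missing permission can also be confirmed directly against the API server instead of reading through every role. The following is a minimal sketch, assuming `oc` is logged in as a user allowed to impersonate service accounts; the function name `check_prometheus_access` is hypothetical and only used for illustration.

```shell
# Ask the API server whether prometheus-k8s may list the three resources
# named in the errors. The namespace defaults to openshift-logging.
check_prometheus_access() {
  cpa_ns="${1:-openshift-logging}"
  cpa_sa="system:serviceaccount:openshift-monitoring:prometheus-k8s"
  for cpa_resource in pods endpoints services; do
    # `oc auth can-i` prints "yes" or "no" and exits non-zero on "no",
    # so the exit status is ignored and only the printed answer is kept.
    cpa_answer="$(oc auth can-i list "$cpa_resource" -n "$cpa_ns" --as="$cpa_sa" 2>/dev/null)" || true
    echo "${cpa_resource}: ${cpa_answer:-unknown}"
  done
}
```

Calling `check_prometheus_access openshift-logging` prints one line per resource. With only CLO deployed, each answer would be expected to be "no", matching the errors above.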
After the EFK pods were deployed, the prometheus-k8s pod stopped reporting the errors, and the targets serviceMonitor/openshift-logging/collector and serviceMonitor/openshift-logging/monitor-elasticsearch-cluster appeared in the prometheus-k8s console. However, the target serviceMonitor/openshift-logging/cluster-logging-operator-metrics-monitor still did not appear.
Version-Release number of selected component (if applicable):
cluster-logging.5.3.5-21
How reproducible:
Always
Steps to Reproduce:
1. Deploy CLO.
2. Check the pod logs of openshift-monitoring/prometheus-k8s-0.
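The log check in step 2 can be narrowed to the relevant errors. This is a sketch only: the container name `prometheus` is an assumption about the prometheus-k8s-0 pod, and the function name `show_forbidden_errors` is hypothetical.

```shell
# Count the "forbidden" watch failures per resource type in the
# prometheus-k8s-0 logs (container name "prometheus" is assumed).
show_forbidden_errors() {
  oc logs -n openshift-monitoring prometheus-k8s-0 -c prometheus \
    | grep -F 'is forbidden' \
    | grep -o 'Failed to watch \*v1\.[A-Za-z]*' \
    | sort | uniq -c
}
```

Running `show_forbidden_errors` prints one count per resource type (Pod, Endpoints, Service) while the problem is present, and nothing once the errors have stopped.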
Actual results:
Expected results:
The above errors should not appear when CLO is deployed, and the target serviceMonitor/openshift-logging/cluster-logging-operator-metrics-monitor should be listed in the prometheus-k8s console.
Additional info:
I checked the resources created when deploying EO: role/prometheus and rolebinding/prometheus are created in the openshift-operators-redhat project, and they grant the following permissions to system:serviceaccount:openshift-monitoring:prometheus-k8s:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  annotations:
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
  creationTimestamp: "2022-03-01T05:29:57Z"
  labels:
    name: elasticsearch-operator
  name: prometheus
  namespace: openshift-operators-redhat
  resourceVersion: "89682"
  uid: 5e4521ac-6060-4d58-be7e-74e327f53e07
rules:
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch

$ oc get rolebinding -n openshift-operators-redhat prometheus -oyaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  annotations:
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
  creationTimestamp: "2022-03-01T05:29:57Z"
  labels:
    name: elasticsearch-operator
  name: prometheus
  namespace: openshift-operators-redhat
  resourceVersion: "89665"
  uid: 63ea32c5-9bad-40b9-af08-e4acb15f4df9
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: openshift-monitoring
In addition, clusterrole/elasticsearch-metrics and clusterrolebinding/elasticsearch-metrics were created after the elasticsearch pods were deployed, and the prometheus-k8s pod then stopped reporting the above errors.
I also tried deploying only the collector pods, but in that case clusterrole/elasticsearch-metrics was not created.
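To test whether the missing RBAC objects alone explain the errors, an equivalent role and rolebinding could be created by hand in openshift-logging. This is a sketch of a manual experiment, not the operator's fix: the object name `prometheus` simply mirrors the EO objects shown above, and the function name `grant_prometheus_access` is hypothetical.

```shell
# Manually grant prometheus-k8s get/list/watch on services, endpoints and
# pods in the given namespace, mirroring the EO role/rolebinding.
# The object name "prometheus" is an assumption for this experiment.
grant_prometheus_access() {
  gpa_ns="${1:-openshift-logging}"
  oc create role prometheus -n "$gpa_ns" \
    --verb=get,list,watch \
    --resource=services,endpoints,pods
  oc create rolebinding prometheus -n "$gpa_ns" \
    --role=prometheus \
    --serviceaccount=openshift-monitoring:prometheus-k8s
}
```

After `grant_prometheus_access openshift-logging`, the prometheus-k8s-0 logs can be rechecked. If the errors stop, that would suggest CLO needs to ship equivalent objects itself.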