-
Bug
-
Resolution: Won't Do
-
Normal
-
None
-
4.10.z
-
Quality / Stability / Reliability
-
False
-
-
None
-
Moderate
Description of problem:
- The customer's jobs started by the openshift-compliance operator failed to start. On investigation, the customer noticed that the `rerunner` service account used by the jobs was missing.
- The audit logs show that this service account was deleted by the `generic-garbage-collector` service account, which was running inside the OpenShift cluster:
~~~
{
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "Metadata",
"auditID": "7a16f65d-6b43-4902-bba4-b44f577023d1",
"stage": "ResponseComplete",
"requestURI": "/api/v1/namespaces/openshift-compliance/serviceaccounts/rerunner", <===============================
"verb": "delete",
"user": {
"username": "system:serviceaccount:kube-system:generic-garbage-collector", <========================================
"uid": "3341d976-5b73-476b-a2f8-80eda006477f",
"groups": [
"system:serviceaccounts",
"system:serviceaccounts:kube-system",
"system:authenticated"
]
},
"sourceIPs": [
"xx.xx.xx.xx"
],
"userAgent": "kube-controller-manager/v1.23.5+8471591 (linux/amd64) kubernetes/3c28e7a/system:serviceaccount:kube-system:generic-garbage-collector", <===========================================================
"objectRef": {
"resource": "serviceaccounts",
"namespace": "openshift-compliance",
"name": "rerunner",
"apiVersion": "v1"
},
"responseStatus": {
"code": 200
},
"requestReceivedTimestamp": "2022-11-02T14:56:55.678220Z",
"stageTimestamp": "2022-11-02T14:56:55.722259Z",
"annotations": {
"authorization.k8s.io/decision": "allow",
"authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"system:controller:generic-garbage-collector\" of ClusterRole \"system:controller:generic-garbage-collector\" to ServiceAccount \"generic-garbage-collector/kube-system\""
},
"k8s_audit_level": "Metadata",
"message": null,
"hostname": "ip-xx-xx-xx-xx.eu-west-1.compute.internal",
"pipeline_metadata": {
"collector": {
"ipaddr4": "xx.xx.xx.xx",
"inputname": "fluent-plugin-systemd",
"name": "fluentd",
"received_at": "2022-11-02T14:56:55.727973+00:00",
"version": "1.14.5 1.6.0"
}
},
"openshift": {
"sequence": 4530002
},
"@timestamp": "2022-11-02T14:56:55.678220+00:00",
"viaq_msg_id": "MDYxYWY5ZTMtY2E3My00MWYyLWFkOTctZTc1ZDIyYWFhY2Zh",
"log_type": "audit",
"fluentd_tag": "k8s-audit.log",
"fluentd_time": 1667401015.67822
}
~~~
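To look for further occurrences of the same pattern, the audit log can be filtered for `delete` requests against the service account issued by the garbage collector. The following is a minimal sketch, not a supported tool: it assumes the audit log is available as one JSON event per line, and the names `find_deletions`, `summarize`, and `SAMPLE_LINES` are hypothetical. The first sample line is a trimmed copy of the event above; the second is an invented non-matching event included only to show the filter.

```python
import json
from typing import Iterable, Iterator

# Username taken from the audit event in this report.
GC_USER = "system:serviceaccount:kube-system:generic-garbage-collector"


def find_deletions(lines: Iterable[str], resource: str, namespace: str,
                   name: str, username: str = GC_USER) -> Iterator[dict]:
    """Yield audit events where `username` deleted the given object.

    Assumes one JSON audit event per line; unparseable lines are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        ref = event.get("objectRef") or {}
        user = event.get("user") or {}
        if (event.get("verb") == "delete"
                and ref.get("resource") == resource
                and ref.get("namespace") == namespace
                and ref.get("name") == name
                and user.get("username") == username):
            yield event


def summarize(event: dict) -> str:
    """Format one matching event as a single line for a report."""
    ref = event.get("objectRef") or {}
    status = event.get("responseStatus") or {}
    return "{} delete {}/{} by {} -> HTTP {}".format(
        event.get("requestReceivedTimestamp"),
        ref.get("namespace"),
        ref.get("name"),
        (event.get("user") or {}).get("username"),
        status.get("code"),
    )


SAMPLE_LINES = [
    # Trimmed copy of the event from this report.
    json.dumps({
        "auditID": "7a16f65d-6b43-4902-bba4-b44f577023d1",
        "verb": "delete",
        "user": {"username": GC_USER},
        "objectRef": {"resource": "serviceaccounts",
                      "namespace": "openshift-compliance",
                      "name": "rerunner", "apiVersion": "v1"},
        "responseStatus": {"code": 200},
        "requestReceivedTimestamp": "2022-11-02T14:56:55.678220Z",
    }),
    # Hypothetical non-matching event (a read, not a delete).
    json.dumps({
        "verb": "get",
        "user": {"username": "system:admin"},
        "objectRef": {"resource": "serviceaccounts",
                      "namespace": "openshift-compliance",
                      "name": "rerunner", "apiVersion": "v1"},
    }),
    "not a json line",
]

if __name__ == "__main__":
    for ev in find_deletions(SAMPLE_LINES, "serviceaccounts",
                             "openshift-compliance", "rerunner"):
        print(summarize(ev))
```

Against a real log, `SAMPLE_LINES` would be replaced by an open file handle on the exported audit log; matching on `objectRef` rather than `requestURI` keeps the filter independent of URL formatting.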
Version-Release number of selected component (if applicable):
How reproducible:
Not reproducible; it happened only once.
Actual results:
`generic-garbage-collector` deleted the `rerunner` service account, which the jobs depend on.
Expected results:
`generic-garbage-collector` should not delete this service account.
Additional info:
I found some older, similar bugs related to the garbage collector:
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1679309
[2] https://github.com/kubernetes/kubernetes/issues/98471