-
Bug
-
Resolution: Won't Do
-
Normal
-
None
-
4.10.z
-
Moderate
-
None
-
False
-
Description of problem:
- A customer's jobs started by the openshift-compliance operator failed to start. After investigating, they noticed that the `rerunner` service account used by the jobs was missing.
- The audit logs show that this service account was deleted by the `generic-garbage-collector` service account, which runs inside the OpenShift cluster:
~~~
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "7a16f65d-6b43-4902-bba4-b44f577023d1",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/openshift-compliance/serviceaccounts/rerunner",   <=====
  "verb": "delete",
  "user": {
    "username": "system:serviceaccount:kube-system:generic-garbage-collector",   <=====
    "uid": "3341d976-5b73-476b-a2f8-80eda006477f",
    "groups": [
      "system:serviceaccounts",
      "system:serviceaccounts:kube-system",
      "system:authenticated"
    ]
  },
  "sourceIPs": [
    "xx.xx.xx.xx"
  ],
  "userAgent": "kube-controller-manager/v1.23.5+8471591 (linux/amd64) kubernetes/3c28e7a/system:serviceaccount:kube-system:generic-garbage-collector",   <=====
  "objectRef": {
    "resource": "serviceaccounts",
    "namespace": "openshift-compliance",
    "name": "rerunner",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "code": 200
  },
  "requestReceivedTimestamp": "2022-11-02T14:56:55.678220Z",
  "stageTimestamp": "2022-11-02T14:56:55.722259Z",
  "annotations": {
    "authorization.k8s.io/decision": "allow",
    "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"system:controller:generic-garbage-collector\" of ClusterRole \"system:controller:generic-garbage-collector\" to ServiceAccount \"generic-garbage-collector/kube-system\""
  },
  "k8s_audit_level": "Metadata",
  "message": null,
  "hostname": "ip-xx-xx-xx-xx.eu-west-1.compute.internal",
  "pipeline_metadata": {
    "collector": {
      "ipaddr4": "xx.xx.xx.xx",
      "inputname": "fluent-plugin-systemd",
      "name": "fluentd",
      "received_at": "2022-11-02T14:56:55.727973+00:00",
      "version": "1.14.5 1.6.0"
    }
  },
  "openshift": {
    "sequence": 4530002
  },
  "@timestamp": "2022-11-02T14:56:55.678220+00:00",
  "viaq_msg_id": "MDYxYWY5ZTMtY2E3My00MWYyLWFkOTctZTc1ZDIyYWFhY2Zh",
  "log_type": "audit",
  "fluentd_tag": "k8s-audit.log",
  "fluentd_time": 1667401015.67822
}
~~~
Version-Release number of selected component (if applicable):
How reproducible:
Not reproducible; it happened only once.
Actual results:
`generic-garbage-collector` deleted the `rerunner` service account, which the compliance jobs require.
Expected results:
`generic-garbage-collector` should not delete the `rerunner` service account.
Additional info:
I found some older, similar bugs related to the garbage collector:
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1679309
[2] https://github.com/kubernetes/kubernetes/issues/98471