- Bug
- Resolution: Unresolved
- Major
- None
- 4.13.z
- None
- Moderate
- None
- False
Description of problem:
On one cluster, the customer is seeing that the "aide-openshift-file-integrity-daemon" Pod is constantly force-initializing the AIDE db:
2024-07-19T07:11:24Z: force-initializing AIDE db
2024-07-19T07:11:49Z: initialization finished
2024-07-19T07:16:32Z: force-initializing AIDE db
2024-07-19T07:16:57Z: initialization finished
2024-07-19T07:21:00Z: force-initializing AIDE db
2024-07-19T07:21:25Z: initialization finished
2024-07-19T07:21:39Z: force-initializing AIDE db
2024-07-19T07:22:03Z: initialization finished
2024-07-19T07:26:45Z: force-initializing AIDE db
2024-07-19T07:27:09Z: initialization finished
2024-07-19T07:31:50Z: force-initializing AIDE db
2024-07-19T07:32:15Z: initialization finished
2024-07-19T07:36:19Z: running aide check
2024-07-19T07:36:44Z: aide check returned status 0
2024-07-19T07:36:56Z: force-initializing AIDE db
2024-07-19T07:37:21Z: initialization finished
2024-07-19T07:37:40Z: force-initializing AIDE db
2024-07-19T07:38:05Z: initialization finished
2024-07-19T07:42:04Z: force-initializing AIDE db
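For triage, the interval between consecutive force-inits can be computed from a saved copy of the pod log. A minimal sketch using gawk's mktime (the log file name is a placeholder):

$ grep 'force-initializing AIDE db' aide-pod.log \
    | gawk -F': ' '{ ts = $1; gsub(/[-TZ:]/, " ", ts); t = mktime(ts);
                     if (p) print t - p " seconds since previous re-init"; p = t }'

With the timestamps above, this yields gaps of roughly 40 to 310 seconds, i.e. a re-init every few minutes.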
In the Operator Pod, similar logs to the following can be seen:
{"level":"info","ts":"2024-07-19T07:11:21Z","logger":"controller_node","msg":"Added Reinit annotation to FileIntegrity for node","Node.Name":"example-master-0","node":"gp-b2c-we-bm7bg-master-0","fi":"openshift-file-integrity"} {"level":"info","ts":"2024-07-19T07:11:21Z","logger":"controller_status","msg":"reconciling FileIntegrityStatus","Request.Namespace":"openshift-file-integrity","Request.Name":"openshift-file-integrity"} {"level":"info","ts":"2024-07-19T07:11:21Z","logger":"controller_fileintegrity","msg":"reconciling FileIntegrity","Request.Namespace":"openshift-file-integrity","Request.Name":"openshift-file-integrity"} {"level":"info","ts":"2024-07-19T07:11:21Z","logger":"controller_fileintegrity","msg":"re-init daemonSet created, triggered by demand or nodes","Request.Namespace":"openshift-file-integrity","Request.Name":"openshift-file-integrity","nodes":"example-master-0"} {"level":"info","ts":"2024-07-19T07:11:21Z","logger":"controller_fileintegrity","msg":"Annotating AIDE config to be updated.","Request.Namespace":"openshift-file-integrity","Request.Name":"openshift-file-integrity"} [..] {"level":"info","ts":"2024-07-19T07:11:21Z","logger":"controller_fileintegrity","msg":"Re-init annotation failed to be removed, re-queueing","Request.Namespace":"openshift-file-integrity","Request.Name":"openshift-file-integrity"}
Reviewing the operator log, some messages are far more common than others:
$ cat file-integrity-operator-658c566ff7-49mn4-file-integrity-operator.log | jq '.msg' | sort | uniq -c | sort -n
      2 "FileIntegrity daemon configuration changed - pods restarted."
      2 "FileIntegrity needed DaemonSet command-line arguments update"
      9 "error updating FileIntegrity status"
     12 "Reconciler error"
    401 "Added Reinit annotation to FileIntegrity for node"
    401 "Removed Holdoff annotation from FileIntegrity for node"
    522 "Re-init annotation failed to be removed, re-queueing"
    797 "re-init daemonSet created, triggered by demand or nodes"
    811 "Updating status"
   1325 "Removing re-init annotation."
   1328 "Annotating AIDE config to be updated."
   2402 "Will reconcile FI because its config changed"
   2650 "reconciling FileIntegrity"
   4126 "reconciling re-init"
   7438 "Reconciling ConfigMap"
   7653 "Node is up-to-date. Degraded: false"
   7659 "Reconciling Node"
  16645 "reconciling FileIntegrityStatus"
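The same log can be narrowed to a per-node timeline to confirm the add/remove annotation cycle; the field names (.ts, .node, .msg) are as they appear in the entries above:

$ jq -r 'select(.node != null) | "\(.ts) \(.node) \(.msg)"' \
    file-integrity-operator-658c566ff7-49mn4-file-integrity-operator.log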
The FileIntegrity object is configured like this:
spec:
  config:
    gracePeriod: 900
    initialDelay: 60
    maxBackups: 5
  tolerations:
  - operator: Exists
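For reference, a complete manifest with this spec; the metadata matches the object name and namespace seen in the operator logs, and everything under spec is verbatim from above:

$ oc apply -f - <<'EOF'
apiVersion: fileintegrity.openshift.io/v1alpha1
kind: FileIntegrity
metadata:
  name: openshift-file-integrity
  namespace: openshift-file-integrity
spec:
  config:
    gracePeriod: 900
    initialDelay: 60
    maxBackups: 5
  tolerations:
  - operator: Exists
EOF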
GitOps and RHACM have been disabled for all FIO objects, so they should not be interfering; a way to double-check this is sketched below.
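One way to verify that nothing else is writing to the object is to inspect its managed fields for unexpected field managers:

$ oc -n openshift-file-integrity get fileintegrity openshift-file-integrity \
    -o yaml --show-managed-fields | grep 'manager:'

Only the operator (and whatever client originally applied the object) should show up as a manager.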
Version-Release number of selected component (if applicable):
OpenShift Container Platform 4.13
File Integrity Operator Version 1.3.4
How reproducible:
So far only on one cluster on the customer side
Steps to Reproduce:
1. Set up File Integrity Operator 1.3.4
2. Configure it with the FileIntegrity object shown above
Actual results:
AIDE db is re-initialized every few minutes
Expected results:
AIDE db is re-initialized less than once per hour
Additional info:
- "oc adm inspect" is available in Support Case 03876622 (comment #13)
- Operator Logs and AIDE Pod Logs are available in Support Case 03876622 (comment #8, #9)
- Internal Slack thread: https://redhat-internal.slack.com/archives/C02CD2UPKQE/p1721054924456849