-
Bug
-
Resolution: Unresolved
-
Minor
-
None
-
4.5.5
-
None
-
False
-
-
False
-
-
-
-
-
Rox Sprint 4.10D
USER PROBLEM
The RHACS collector pods are stuck in the CrashLoopBackOff state.
CONDITIONS
[1] RHACS collector pods:
$ oc get pods -A | grep collector
rhacs-operator   collector-479q2   2/3   CrashLoopBackOff   4025 (4m41s ago)   14d
rhacs-operator   collector-gqv7d   2/3   CrashLoopBackOff   4021 (85s ago)     14d
rhacs-operator   collector-kxh6z   2/3   CrashLoopBackOff   4039 (3m28s ago)   14d
rhacs-operator   collector-rm7vm   2/3   CrashLoopBackOff   4026 (118s ago)    14d
[2] Namespace Events:
$ oc get events -n rhacs-operator
LAST SEEN   TYPE      REASON    OBJECT                MESSAGE
71s         Normal    Pulling   pod/collector-479q2   Pulling image "registry.redhat.io/advanced-cluster-security/rhacs-collector-rhel8@sha256:f36004ced010b8fbb1e3241a4b09e158c74a70ddc74db4b0aa70ffc96df78182"
2s          Warning   BackOff   pod/collector-479q2   Back-off restarting failed container collector in pod collector-479q2_rhacs-operator(40cd5ac5-cae2-4efd-91d8-737294af4e7b)
120m        Normal    Pulled    pod/collector-479q2   (combined from similar events): Successfully pulled image "registry.redhat.io/advanced-cluster-security/rhacs-collector-rhel8@sha256:f36004ced010b8fbb1e3241a4b09e158c74a70ddc74db4b0aa70ffc96df78182" in 717ms (717ms including waiting). Image size: 128314781 bytes.
3m3s        Normal    Pulling   pod/collector-gqv7d   Pulling image "registry.redhat.io/advanced-cluster-security/rhacs-collector-rhel8@sha256:f36004ced010b8fbb1e3241a4b09e158c74a70ddc74db4b0aa70ffc96df78182"
12s         Warning   BackOff   pod/collector-gqv7d   Back-off restarting failed container collector in pod collector-gqv7d_rhacs-operator(17b83d63-665b-45ef-98c6-716355974bfb)
65m         Normal    Pulled    pod/collector-gqv7d   (combined from similar events): Successfully pulled image "registry.redhat.io/advanced-cluster-security/rhacs-collector-rhel8@sha256:f36004ced010b8fbb1e3241a4b09e158c74a70ddc74db4b0aa70ffc96df78182" in 551ms (551ms including waiting). Image size: 128314781 bytes.
5m5s        Normal    Pulling   pod/collector-kxh6z   Pulling image "registry.redhat.io/advanced-cluster-security/rhacs-collector-rhel8@sha256:f36004ced010b8fbb1e3241a4b09e158c74a70ddc74db4b0aa70ffc96df78182"
4m12s       Warning   BackOff   pod/collector-kxh6z   Back-off restarting failed container collector in pod collector-kxh6z_rhacs-operator(ecca9670-837f-4df3-b460-7a8a9f223c7b)
134m        Normal    Pulled    pod/collector-kxh6z   (combined from similar events): Successfully pulled image "registry.redhat.io/advanced-cluster-security/rhacs-collector-rhel8@sha256:f36004ced010b8fbb1e3241a4b09e158c74a70ddc74db4b0aa70ffc96df78182" in 596ms (596ms including waiting). Image size: 128314781 bytes.
3m35s       Normal    Pulling   pod/collector-rm7vm   Pulling image "registry.redhat.io/advanced-cluster-security/rhacs-collector-rhel8@sha256:f36004ced010b8fbb1e3241a4b09e158c74a70ddc74db4b0aa70ffc96df78182"
98s         Warning   BackOff   pod/collector-rm7vm   Back-off restarting failed container collector in pod collector-rm7vm_rhacs-operator(cb196cca-9681-4179-ae68-80456ce23c81)
96m         Normal    Pulled    pod/collector-rm7vm   (combined from similar events): Successfully pulled image "registry.redhat.io/advanced-cluster-security/rhacs-collector-rhel8@sha256:f36004ced010b8fbb1e3241a4b09e158c74a70ddc74db4b0aa70ffc96df78182" in 357ms (357ms including waiting). Image size: 128314781 bytes.
[3] Pod Logs:
$ oc -n rhacs-operator logs collector-xxx
[WARNING 2025/11/07 03:39:08] libbpf: prog 'sys_exit': -- BEGIN PROG LOAD LOG --
processed 270 insns (limit 1000000) max_states_per_insn 1 total_states 26 peak_states 26 mark_read 6
-- END PROG LOAD LOG --
[WARNING 2025/11/07 03:39:08] libbpf: prog 'sys_exit': failed to load: -22
[WARNING 2025/11/07 03:39:08] libbpf: failed to load object 'bpf_probe'
[WARNING 2025/11/07 03:39:08] libbpf: failed to load BPF skeleton 'bpf_probe': -22
[ERROR 2025/11/07 03:39:08] libpman: failed to load BPF object (errno: 22 | message: Invalid argument)
terminate called after throwing an instance of 'sinsp_exception'
  what():  Initialization issues during scap_init
collector AbortHandler 0x8ae4d5 + 53
/lib64/libc.so.6 (null) 0x7f6d172295b0 + 0
/lib64/libc.so.6 gsignal 0x7f6d1722952f + 271
/lib64/libc.so.6 abort 0x7f6d171fce65 + 295
/lib64/libstdc++.so.6 (null) 0x7f6d17bdb09b + 0
/lib64/libstdc++.so.6 (null) 0x7f6d17be154c + 0
/lib64/libstdc++.so.6 (null) 0x7f6d17be15a7 + 0
/lib64/libstdc++.so.6 (null) 0x7f6d17be1808 + 0
collector collector::KernelDriverCOREEBPF::Setup(collector::CollectorConfig const&, sinsp&) 0x926984 + 1668
collector collector::system_inspector::Service::InitKernel(collector::CollectorConfig const&) 0x923a08 + 72
collector collector::SetupKernelDriver(collector::CollectorService&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, collector::CollectorConfig const&) 0x8c0e3f + 1103
collector main 0x898095 + 517
/lib64/libc.so.6 __libc_start_main 0x7f6d172157e5 + 229
collector _start 0x8aa0ae + 46
Caught signal 6 (SIGABRT): Aborted
/bootstrap.sh: line 85:     5 Aborted (core dumped) eval exec "$@"
Based on the current collector pod logs, we can confirm that the collector is hitting a known issue related to recent kernel changes. This bug was addressed in RHACS Operator versions 4.5.6 and 4.6.1. Since my customer is already on version 4.8, which includes the fix, I would like to verify whether this issue is recurring in RHACS 4.8.
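To double-check which version is actually running on the secured cluster, the operator CSV and the deployed collector image can be inspected. A minimal sketch, assuming the collector runs as a DaemonSet named collector in the rhacs-operator namespace seen in the outputs above:
$ oc -n rhacs-operator get csv
$ oc -n rhacs-operator get daemonset collector \
    -o jsonpath='{.spec.template.spec.containers[*].image}'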
It is related to CVE-2024-50063 [1].
As a workaround, the collection method can be changed:
The method for system-level data collection is configured per node in the SecuredCluster CR. The default value is CORE_BPF, and Red Hat recommends using CORE_BPF for data collection. The available options are CORE_BPF and NoCollection; if NoCollection is selected, Collector does not report any information about network activity or process executions. To stop the collector pods from crash looping, set the collection method to NoCollection (see [2]); a minimal patch command is sketched after the references below.
[1] - https://access.redhat.com/security/cve/cve-2024-50063
[2] - https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_security_for_kubernetes/4.5/html/installing/installing-rhacs-on-red-hat-openshift#per-node-settings_install-secured-cluster-config-options-ocp
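A minimal sketch of applying the workaround through the SecuredCluster CR, assuming the CR is named stackrox-secured-cluster-services and lives in the rhacs-operator namespace (both the name and the namespace depend on the installation), and assuming the per-node collector field layout described in [2]:
$ oc -n rhacs-operator patch securedcluster stackrox-secured-cluster-services \
    --type=merge -p '{"spec":{"perNode":{"collector":{"collection":"NoCollection"}}}}'
After the patch, the operator is expected to reconcile the collector DaemonSet and roll the pods out with collection disabled.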
We changed the collection method in the SecuredCluster CR to NoCollection. However, the collector pods are still in CrashLoopBackOff; the workaround does not work.
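For completeness, a sketch of how the workaround can be verified end to end, assuming the CR name used above and that the collector DaemonSet and container are both named collector (as the events suggest); the collection value in the CR and the environment of the collector container should both reflect the change:
$ oc -n rhacs-operator get securedcluster stackrox-secured-cluster-services \
    -o jsonpath='{.spec.perNode.collector.collection}'
$ oc -n rhacs-operator get daemonset collector \
    -o jsonpath='{.spec.template.spec.containers[?(@.name=="collector")].env}'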
ROOT CAUSE
FIX