-
Bug
-
Resolution: Done-Errata
-
Undefined
-
rhel-9.3.0
-
rasdaemon-0.6.7-11.el9
-
None
-
None
-
3
-
rhel-sst-kernel-ft
-
ssg_core_kernel
-
13
-
17
-
2
-
QE ack, Dev ack
-
False
-
-
None
-
CK-June-2024, CK-July-2024, CK-August-2024
-
-
Unspecified
-
None
Description of problem:
rasdaemon floods the journal with diskerror_eventstore messages that are nearly impossible to interpret, roughly one every two seconds.
Version-Release number of selected component (if applicable):
rasdaemon-0.6.4-6.el9.x86_64
How reproducible:
I'm not sure.
Steps to Reproduce:
1. Boot the system and run 'journalctl -b -u rasdaemon'
Actual results:
Mar 15 16:13:23 ti26 rasdaemon[2399]: <idle>-0 [043] 0.000002: block_rq_complete: 2023-03-15 16:13:23 -0400
Mar 15 16:13:24 ti26 rasdaemon[2399]: rasdaemon: diskerror_eventstore: 0x5628c0e93318
Mar 15 16:13:24 ti26 rasdaemon[2399]: rasdaemon: register inserted at db
Mar 15 16:13:24 ti26 rasdaemon[2399]: <idle>-0 [037] 0.000002: block_rq_complete: 2023-03-15 16:13:24 -0400
Mar 15 16:13:26 ti26 rasdaemon[2399]: rasdaemon: diskerror_eventstore: 0x5628c0e93318
Mar 15 16:13:26 ti26 rasdaemon[2399]: rasdaemon: register inserted at db
Mar 15 16:13:26 ti26 rasdaemon[2399]: <idle>-0 [037] 0.000003: block_rq_complete: 2023-03-15 16:13:26 -0400
Mar 15 16:13:28 ti26 rasdaemon[2399]: rasdaemon: diskerror_eventstore: 0x5628c0e93318
Mar 15 16:13:28 ti26 rasdaemon[2399]: rasdaemon: register inserted at db
Mar 15 16:13:28 ti26 rasdaemon[2399]: <idle>-0 [037] 0.000003: block_rq_complete: 2023-03-15 16:13:28 -0400
Mar 15 16:13:30 ti26 rasdaemon[2399]: rasdaemon: diskerror_eventstore: 0x5628c0e93318
Mar 15 16:13:30 ti26 rasdaemon[2399]: rasdaemon: register inserted at db
Mar 15 16:13:30 ti26 rasdaemon[2399]: <idle>-0 [037] 0.000003: block_rq_complete: 2023-03-15 16:13:30 -0400
Mar 15 16:13:32 ti26 rasdaemon[2399]: rasdaemon: diskerror_eventstore: 0x5628c0e93318
Mar 15 16:13:32 ti26 rasdaemon[2399]: rasdaemon: register inserted at db
...
Expected results:
A message that explains what happened. As logged, the message is very unhelpful:
I have no clue what it signifies. Do I have a problem with one or more disks?
Additional info:
bash-5.1$ ras-mc-ctl --errors | tail
9255 2023-03-15 16:18:39 -0400 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
9256 2023-03-15 16:18:41 -0400 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
9257 2023-03-15 16:18:43 -0400 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
9258 2023-03-15 16:18:45 -0400 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
9259 2023-03-15 16:18:47 -0400 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
9260 2023-03-15 16:18:49 -0400 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
9261 2023-03-15 16:18:51 -0400 error: dev=0:2816, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
No MCE errors.
What's this trying to tell me? I thought rasdaemon's job was to detect memory problems.
Also, the "--errors" option of ras-mc-ctl is not documented in its man page, and some other options are missing as well.
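To make the "ras-mc-ctl --errors" output above easier to triage, the following is a minimal sketch that groups the disk error records by their dev= value. It only assumes the line layout shown in this report; the helper names (parse_disk_errors, count_by_device) are hypothetical and not part of rasdaemon, and the sketch does not try to explain what the errors mean or which physical device a dev= value refers to.

```python
import re
from collections import Counter

# Regex for the disk error lines exactly as they appear in this report.
# Assumption: other rasdaemon versions may print a different layout.
LINE_RE = re.compile(
    r"^(?P<id>\d+)\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4})\s+error: "
    r"dev=(?P<major>\d+):(?P<minor>\d+), sector=(?P<sector>-?\d+), "
    r"nr_sector=(?P<nr_sector>\d+), error='(?P<error>[^']*)', "
    r"rwbs='(?P<rwbs>[^']*)', cmd='(?P<cmd>[^']*)'"
)


def parse_disk_errors(text):
    """Return one dict per disk error line; lines that do not match are skipped."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            records.append(match.groupdict())
    return records


def count_by_device(records):
    """Count records per 'major:minor' string as printed in the dev= field."""
    return Counter(f"{r['major']}:{r['minor']}" for r in records)


# Sample input copied from the report above.
SAMPLE = """\
9259 2023-03-15 16:18:47 -0400 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
9260 2023-03-15 16:18:49 -0400 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
9261 2023-03-15 16:18:51 -0400 error: dev=0:2816, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
No MCE errors.
"""

if __name__ == "__main__":
    records = parse_disk_errors(SAMPLE)
    # Print each dev= value with the number of records that mention it.
    for device, count in count_by_device(records).most_common():
        print(device, count)
```

In practice the text could come from piping the real command output into the script instead of using SAMPLE. Counting per dev= value at least shows whether the records concentrate on one device or are spread across several.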
- external trackers
- links to
-
RHBA-2024:138015 rasdaemon bug fix and enhancement update