Bug
Resolution: Duplicate
Major
4.12.z
Quality / Stability / Reliability
Moderate
Description of problem:
Following the steps to trigger a kernel crash does not generate the expected kdump files. The cluster is a single-node OpenShift (SNO) deployment installed through ZTP with a Telco profile applied.
Version-Release number of selected component (if applicable):
Issue seen in 4.12.25 and 4.12.26. ztp-site-generate image: http://registry.redhat.io/openshift4/ztp-site-generate-rhel8:4.12.3; also tested with http://registry.redhat.io/openshift4/ztp-site-generate-rhel8:4.12.1
How reproducible:
100%
Steps to Reproduce:
1. Install an SNO cluster, version 4.12.25
2. Ensure sysrq is configured with value=1: echo 1 > /proc/sys/kernel/sysrq
3. echo c > /proc/sysrq-trigger
4. Wait for the node to recover
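The crash-trigger steps above can be wrapped in a guarded helper so they are not run by accident. This is only a sketch: the function name `trigger_crash` and the `CONFIRM_CRASH`, `SYSRQ_ENABLE`, and `SYSRQ_TRIGGER` variables are hypothetical and not part of the original report. The two paths default to the real sysrq files from steps 2 and 3.

```shell
#!/bin/sh
# Paths are overridable so the sketch can be dry-run against ordinary files.
SYSRQ_ENABLE="${SYSRQ_ENABLE:-/proc/sys/kernel/sysrq}"
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

# Hypothetical helper: enable sysrq, then request an immediate kernel crash.
trigger_crash() {
    # Refuse to run unless the caller explicitly confirms.
    if [ "${CONFIRM_CRASH:-no}" != "yes" ]; then
        echo "refusing to crash: set CONFIRM_CRASH=yes" >&2
        return 1
    fi
    # Step 2: allow all sysrq functions.
    echo 1 > "$SYSRQ_ENABLE" || return 1
    # Flush filesystem buffers before the crash.
    sync
    # Step 3: 'c' asks the kernel to crash, which should hand over to kdump.
    echo c > "$SYSRQ_TRIGGER"
}
```

On a real node this would be run as root, for example `CONFIRM_CRASH=yes trigger_crash`, after which the node reboots (step 4).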
Actual results:
The /var/crash directory is empty.
Expected results:
The /var/crash directory contains core dump files such as vmcore-dmesg.txt, vmcore, and kexec-dmesg.log.
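After the node recovers, the comparison between actual and expected results can be scripted. The sketch below reports which of the three expected files are absent. The function name `missing_kdump_files` is hypothetical, and searching subdirectories of /var/crash is an assumption about how the dumps are laid out.

```shell
#!/bin/sh
# Report which expected kdump artifacts are missing under a crash directory.
# File names come from the expected results; the recursive search is an assumption.
missing_kdump_files() {
    crash_dir="${1:-/var/crash}"
    missing=""
    for name in vmcore vmcore-dmesg.txt kexec-dmesg.log; do
        # Look for a regular file with this exact name anywhere below crash_dir.
        if [ -z "$(find "$crash_dir" -type f -name "$name" 2>/dev/null | head -n 1)" ]; then
            missing="$missing $name"
        fi
    done
    # Print the missing names and fail when at least one is absent.
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all expected files present"
}
```

Called with no argument it inspects /var/crash; a non-zero exit status corresponds to the failure described in this report.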
Additional info:
System impact: If the platform fails, the crash data needed for diagnosis cannot be recovered. Otherwise the node operates normally.
The node has the correct MachineConfigs after the policies are applied:
```
oc get -o yaml machineconfig/06-kdump-enable-master
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  annotations:
    ran.openshift.io/ztp-gitops-generated: '{}'
  creationTimestamp: "2023-07-26T07:51:30Z"
  generation: 1
  labels:
    machineconfiguration.openshift.io/role: master
  name: 06-kdump-enable-master
  resourceVersion: "1595"
  uid: 1eeaf075-4d1b-4540-92f3-d1db167fe1d5
spec:
  config:
    ignition:
      version: 3.2.0
    systemd:
      units:
      - enabled: true
        name: kdump.service
  kernelArguments:
  - crashkernel=512M

oc get -o yaml machineconfig/06-kdump-enable-worker
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  annotations:
    ran.openshift.io/ztp-gitops-generated: '{}'
  creationTimestamp: "2023-07-26T07:51:30Z"
  generation: 1
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 06-kdump-enable-worker
  resourceVersion: "1596"
  uid: ddad84a5-3ad3-48dc-9096-9346795f228d
spec:
  config:
    ignition:
      version: 3.2.0
    systemd:
      units:
      - enabled: true
        name: kdump.service
  kernelArguments:
  - crashkernel=512M
```
Kdump service is running:
```
[core@cloudransno-site3 ~]$ sudo systemctl status kdump.service
● kdump.service - Crash recovery kernel arming
   Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: disabled)
   Active: active (exited) since Wed 2023-07-26 09:19:03 UTC; 23min ago
 Main PID: 5425 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 818202)
   Memory: 0B
      CPU: 0
   CGroup: /system.slice/kdump.service

Jul 26 09:19:03 cloudransno-site3 kdumpctl[5428]: kdump: kexec: loaded kdump kernel
Jul 26 09:19:03 cloudransno-site3 kdumpctl[5428]: kdump: Starting kdump: [OK]
Jul 26 09:19:02 cloudransno-site3 systemd[1]: Starting Crash recovery kernel arming...
Jul 26 09:19:03 cloudransno-site3 systemd[1]: Started Crash recovery kernel arming.
```
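Beyond the service status, it may help to confirm that the `crashkernel=512M` argument from the MachineConfig reached the running kernel and that memory was actually reserved. The following is a minimal sketch, not output from the affected node. The function names `crashkernel_arg` and `kdump_state` are hypothetical, while `/proc/cmdline`, `/sys/kernel/kexec_crash_loaded`, and `/sys/kernel/kexec_crash_size` are standard kernel interfaces.

```shell
#!/bin/sh
# Extract the crashkernel= value from a kernel command line file.
crashkernel_arg() {
    cmdline_file="${1:-/proc/cmdline}"
    # Split arguments onto separate lines, then print the first crashkernel value.
    tr ' ' '\n' < "$cmdline_file" | sed -n 's/^crashkernel=//p' | head -n 1
}

# Summarize the kdump-related kernel state; paths are overridable for dry runs.
kdump_state() {
    loaded_file="${1:-/sys/kernel/kexec_crash_loaded}"
    size_file="${2:-/sys/kernel/kexec_crash_size}"
    cmdline_file="${3:-/proc/cmdline}"
    echo "crashkernel arg: $(crashkernel_arg "$cmdline_file")"
    # 1 means a crash kernel is loaded, 0 means none is loaded.
    echo "crash kernel loaded: $(cat "$loaded_file")"
    # Size in bytes of the memory reserved for the crash kernel.
    echo "reserved bytes: $(cat "$size_file")"
}
```

An empty crashkernel value or a reserved size of 0 would point at the reservation rather than at the kdump service itself.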