  1. Product Technical Learning
  2. PTL-15321

Kernel OOM Killer kills OSDs/MGRs/MONs and we need to reset labs in order to continue the class


    • Bug
    • Resolution: Unresolved
    • Major
    • CL260
    • en-US (English)
    • Critical
    • Customer Facing, Customer Reported

      Please fill in the following information:


      URL: .
      Reporter RHNID: pperaltads
      Section Title: -

      Issue description

      Today we hit an issue in the CL260 VM lab environment. After some hours of inactivity, the cluster begins to behave erratically: the kernel OOM killer terminates OSDs and other Ceph cluster components (MGRs, MONs) on serverc, serverd, and servere.

      Steps to reproduce:

      1. Spawn a fresh CL260 ROL lab.
      2. Create some pools and check the cluster state:
        $ ssh admin@clienta
        $ sudo cephadm shell
        # ceph osd pool create testpool1 32 32 replicated
        # ceph osd pool create testpool2 32 32 replicated
        # ceph status
        # ceph health
      3. Let it sit for 2 days without shutting down the cluster or powering off the VMs.
      4. Check the Ceph cluster health on clienta:
        # ceph health
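
      To confirm that the kernel OOM killer is what terminates the daemons, the kernel log on serverc, serverd, and servere can be summarized per killed process. The following is a minimal sketch and not part of the original report: the function name `list_oom_victims` is hypothetical, and it assumes the "Out of memory: Killed process <pid> (<name>)" log line format used by recent kernels.

```shell
#!/usr/bin/env bash
# Sketch: summarize which processes the kernel OOM killer terminated.
# Reads kernel log text on stdin and prints "<count> <process name>",
# one line per distinct victim.
# Assumption: log lines follow the format used by recent kernels:
#   Out of memory: Killed process <pid> (<name>) ...
# (the cgroup variant "Memory cgroup out of memory: ..." is matched as well).
list_oom_victims() {
    # Extract the process name between parentheses, then count duplicates.
    sed -n 's/.*[Oo]ut of memory: Killed process [0-9]* (\([^)]*\)).*/\1/p' \
        | sort \
        | uniq -c \
        | sed 's/^ *//'
}

# Example use on a lab node (run as root; hypothetical invocation):
#   journalctl -k --no-pager | list_oom_victims
```

      If ceph-osd, ceph-mgr, or ceph-mon show up in the output, the result can be compared against `ceph health` on clienta to see whether the affected daemons match the ones reported as down.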

      Workaround:

      Shut down the VMs at the end of the class and start them up at the beginning of the class. If the OOM killer has already crashed the cluster, the only solution is to delete and recreate the labs.

      Expected result:

      As in other Red Hat courses, the VMs should not need to be rebooted, deleted, or stopped in any way during the class.

        1. ceph health.txt
          59 kB
          Pablo David Peralta Vendeuvre
        2. image-2025-08-28-10-21-31-548.png
          149 kB
          Patrick Gomez
        3. image-2025-08-28-10-22-14-238.png
          185 kB
          Patrick Gomez
        4. image-2025-08-28-10-23-07-911.png
          159 kB
          Patrick Gomez
        5. image-2025-08-28-13-41-01-624.png
          176 kB
          Patrick Gomez
        6. image-2025-08-28-13-42-22-202.png
          154 kB
          Patrick Gomez
        7. image-2025-08-28-13-42-50-117.png
          152 kB
          Patrick Gomez
        8. image-2025-08-28-16-03-50-357.png
          358 kB
          Pablo David Peralta Vendeuvre
        9. image-2025-08-28-16-12-28-200.png
          103 kB
          Pablo David Peralta Vendeuvre
        10. Screenshot 2025-08-28 at 09.51.27.png
          115 kB
          Ashley D’Andrea
        11. Screenshot 2025-08-28 at 13.46.30.png
          132 kB
          Ashley D’Andrea

              rht-pagomez Patrick Gomez
              cellosofia1@gmail.com Pablo David Peralta Vendeuvre (Inactive)