Data Foundation Bugs / DFBUGS-173

[2303490] Ceph reports PG_DEGRADED after recovering the active MDS node from a node drain; Ceph health never returns to OK


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Affects Version: odf-4.17
    • Component: ceph/CephFS/x86

      Description of problem (please be as detailed as possible and provide log snippets):

      I observed the warning below in Ceph health immediately after draining the node on which the active MDS was running.

      Degraded data redundancy: 1263285/8784987 objects degraded (14.380%), 1 pg degraded, 1 pg undersized

      ceph status:
      sh-5.1$ ceph status
      cluster:
      id: 994259aa-5177-4411-bb6d-5f41e6d2bde0
      health: HEALTH_WARN
      Degraded data redundancy: 1263285/8784987 objects degraded (14.380%), 1 pg degraded, 1 pg undersized

      services:
      mon: 3 daemons, quorum a,b,c (age 36m)
      mgr: a(active, since 37m), standbys: b
      mds: 1/1 daemons up, 1 hot standby
      osd: 3 osds: 3 up (since 36m), 3 in (since 5h); 1 remapped pgs

      data:
      volumes: 1/1 healthy
      pools: 4 pools, 4 pgs
      objects: 2.93M objects, 4.2 GiB
      usage: 50 GiB used, 250 GiB / 300 GiB avail
      pgs: 1263285/8784987 objects degraded (14.380%)
      3 active+clean
      1 active+undersized+degraded+remapped+backfilling

      io:
      client: 1.8 KiB/s rd, 107 KiB/s wr, 2 op/s rd, 109 op/s wr
      recovery: 2.7 KiB/s, 147 objects/s
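
      For reference, a minimal sketch of how this state can be inspected, assuming the standard Rook/Ceph toolbox pod (label app=rook-ceph-tools) in the openshift-storage namespace; names may differ in other deployments:

      # Open a shell in the toolbox pod
      oc -n openshift-storage rsh $(oc -n openshift-storage get pod -l app=rook-ceph-tools -o name)

      # Inside the toolbox: see which health checks are firing and which PG is stuck
      ceph health detail
      ceph pg ls degraded
      ceph osd tree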

      -------------------------------------------------------------------------------
      ceph -w (cluster log):
      ---------------

      2024-08-07T15:31:58.474864+0000 mon.a [INF] osd.0 marked itself down and dead
      2024-08-07T15:31:59.446878+0000 mon.a [WRN] Health check failed: 1 osds down (OSD_DOWN)
      2024-08-07T15:31:59.446909+0000 mon.a [WRN] Health check failed: 1 host (1 osds) down (OSD_HOST_DOWN)
      2024-08-07T15:31:59.446917+0000 mon.a [WRN] Health check failed: 1 zone (1 osds) down (OSD_ZONE_DOWN)
      2024-08-07T15:32:08.455715+0000 mon.c [INF] mon.c calling monitor election
      2024-08-07T15:32:08.463852+0000 mon.a [INF] mon.a calling monitor election
      2024-08-07T15:32:13.472954+0000 mon.a [INF] mon.a is new leader, mons a,c in quorum (ranks 0,2)
      2024-08-07T15:32:13.504953+0000 mon.a [WRN] Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)
      2024-08-07T15:32:13.507412+0000 mon.a [INF] osd.0 failed (root=default,region=us-south,zone=us-south-2,host=ocs-deviceset-1-data-0gbnf7) (connection refused reported by osd.2)
      2024-08-07T15:32:13.507697+0000 mon.a [INF] Active manager daemon a restarted
      2024-08-07T15:32:13.508133+0000 mon.a [WRN] Health check failed: 1 osds down (OSD_DOWN)
      2024-08-07T15:32:13.508154+0000 mon.a [WRN] Health check failed: 1 OSDs or CRUSH {nodes, device-classes} have {NOUP,NODOWN,NOIN,NOOUT} flags set (OSD_FLAGS)
      2024-08-07T15:32:13.508161+0000 mon.a [WRN] Health check failed: 1 host (1 osds) down (OSD_HOST_DOWN)
      2024-08-07T15:32:13.508172+0000 mon.a [WRN] Health check failed: 1 zone (1 osds) down (OSD_ZONE_DOWN)
      2024-08-07T15:32:13.508438+0000 mon.a [INF] Activating manager daemon a
      2024-08-07T15:32:13.524015+0000 mon.a [WRN] Health detail: HEALTH_WARN 1 filesystem is degraded; insufficient standby MDS daemons available; 1/3 mons down, quorum a,c
      2024-08-07T15:32:13.524031+0000 mon.a [WRN] [WRN] FS_DEGRADED: 1 filesystem is degraded
      2024-08-07T15:32:13.524037+0000 mon.a [WRN] fs ocs-storagecluster-cephfilesystem is degraded
      2024-08-07T15:32:13.524041+0000 mon.a [WRN] [WRN] MDS_INSUFFICIENT_STANDBY: insufficient standby MDS daemons available
      2024-08-07T15:32:13.524046+0000 mon.a [WRN] have 0; want 1 more
      2024-08-07T15:32:13.524050+0000 mon.a [WRN] [WRN] MON_DOWN: 1/3 mons down, quorum a,c
      2024-08-07T15:32:13.524056+0000 mon.a [WRN] mon.b (rank 1) addr v2:172.30.111.99:3300/0 is down (out of quorum)
      2024-08-07T15:32:13.546974+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:13.571798+0000 mon.a [INF] Manager daemon a is now available
      2024-08-07T15:32:14.500214+0000 mon.a [INF] Health check cleared: MDS_INSUFFICIENT_STANDBY (was: insufficient standby MDS daemons available)
      2024-08-07T15:32:15.139718+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:15.602506+0000 mon.a [WRN] Health check failed: Degraded data redundancy: 2122080/6366240 objects degraded (33.333%), 4 pgs degraded (PG_DEGRADED)
      2024-08-07T15:32:15.725995+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:16.713176+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:17.720229+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:18.754681+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:19.759228+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:20.821000+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:21.829013+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:21.846112+0000 mon.a [WRN] Health check update: Degraded data redundancy: 2129714/6389142 objects degraded (33.333%), 4 pgs degraded (PG_DEGRADED)
      2024-08-07T15:32:22.854516+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:23.858571+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:24.913407+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:25.919793+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:26.969597+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:27.970493+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-b restarted
      2024-08-07T15:32:30.343628+0000 mon.a [WRN] Health check update: Degraded data redundancy: 2129618/6388854 objects degraded (33.333%), 4 pgs degraded (PG_DEGRADED)
      2024-08-07T15:32:34.242144+0000 mon.a [INF] daemon mds.ocs-storagecluster-cephfilesystem-a is now active in filesystem ocs-storagecluster-cephfilesystem as rank 0
      2024-08-07T15:32:34.622939+0000 mon.a [INF] Health check cleared: FS_DEGRADED (was: 1 filesystem is degraded)
      2024-08-07T15:32:35.347915+0000 mon.a [WRN] Health check update: Degraded data redundancy: 2129053/6387159 objects degraded (33.333%), 4 pgs degraded (PG_DEGRADED)
      2024-08-07T15:32:40.350852+0000 mon.a [WRN] Health check update: Degraded data redundancy: 2128471/6385413 objects degraded (33.333%), 4 pgs degraded (PG_DEGRADED)
      2024-08-07T15:32:42.287909+0000 mon.b [INF] mon.b calling monitor election
      2024-08-07T15:32:42.293112+0000 mon.a [INF] mon.a calling monitor election
      2024-08-07T15:32:42.301939+0000 mon.a [INF] mon.a is new leader, mons a,b,c in quorum (ranks 0,1,2)
      2024-08-07T15:32:42.317047+0000 mon.a [INF] Health check cleared: MON_DOWN (was: 1/3 mons down, quorum a,c)
      2024-08-07T15:32:42.328443+0000 mon.a [WRN] Health detail: HEALTH_WARN 1 osds down; 1 OSDs or CRUSH {nodes, device-classes} have {NOUP,NODOWN,NOIN,NOOUT} flags set; 1 host (1 osds) down; 1 zone (1 osds) down; Degraded data redundancy: 2128399/6385197 objects degraded (33.333%), 4 pgs degraded
      2024-08-07T15:32:42.328475+0000 mon.a [WRN] [WRN] OSD_DOWN: 1 osds down
      2024-08-07T15:32:42.328483+0000 mon.a [WRN] osd.0 (root=default,region=us-south,zone=us-south-2,host=ocs-deviceset-1-data-0gbnf7) is down
      2024-08-07T15:32:42.328488+0000 mon.a [WRN] [WRN] OSD_FLAGS: 1 OSDs or CRUSH {nodes, device-classes} have {NOUP,NODOWN,NOIN,NOOUT} flags set
      2024-08-07T15:32:42.328495+0000 mon.a [WRN] zone us-south-2 has flags noout
      2024-08-07T15:32:42.328505+0000 mon.a [WRN] [WRN] OSD_HOST_DOWN: 1 host (1 osds) down
      2024-08-07T15:32:42.328511+0000 mon.a [WRN] host ocs-deviceset-1-data-0gbnf7 (root=default,region=us-south,zone=us-south-2) (1 osds) is down
      2024-08-07T15:32:42.328525+0000 mon.a [WRN] [WRN] OSD_ZONE_DOWN: 1 zone (1 osds) down
      2024-08-07T15:32:42.328539+0000 mon.a [WRN] zone us-south-2 (root=default,region=us-south) (1 osds) is down
      2024-08-07T15:32:42.328554+0000 mon.a [WRN] [WRN] PG_DEGRADED: Degraded data redundancy: 2128399/6385197 objects degraded (33.333%), 4 pgs degraded
      2024-08-07T15:32:42.328575+0000 mon.a [WRN] pg 1.0 is active+undersized+degraded, acting [2,1]
      2024-08-07T15:32:42.328584+0000 mon.a [WRN] pg 2.0 is active+undersized+degraded, acting [1,2]
      2024-08-07T15:32:42.328590+0000 mon.a [WRN] pg 3.0 is active+undersized+degraded, acting [2,1]
      2024-08-07T15:32:42.328615+0000 mon.a [WRN] pg 4.0 is active+undersized+degraded, acting [1,2]
      2024-08-07T15:32:46.365040+0000 mon.a [WRN] Health check update: Degraded data redundancy: 2128694/6386082 objects degraded (33.333%), 4 pgs degraded (PG_DEGRADED)
      2024-08-07T15:32:51.582806+0000 mon.a [WRN] Health check update: Degraded data redundancy: 2133105/6399315 objects degraded (33.333%), 4 pgs degraded (PG_DEGRADED)
      2024-08-07T15:33:00.365860+0000 mon.a [WRN] Health check update: Degraded data redundancy: 2132468/6397404 objects degraded (33.333%), 4 pgs degraded (PG_DEGRADED)
      2024-08-07T15:33:05.369169+0000 mon.a [WRN] Health check update: Degraded data redundancy: 2131774/6395322 objects degraded (33.333%), 4 pgs degraded (PG_DEGRADED)
      2024-08-07T15:33:07.745128+0000 mon.a [INF] Health check cleared: OSD_DOWN (was: 1 osds down)
      2024-08-07T15:33:07.745153+0000 mon.a [INF] Health check cleared: OSD_HOST_DOWN (was: 1 host (1 osds) down)
      2024-08-07T15:33:07.745179+0000 mon.a [INF] Health check cleared: OSD_ZONE_DOWN (was: 1 zone (1 osds) down)
      2024-08-07T15:33:07.788895+0000 mon.a [INF] osd.0 [v2:10.131.0.39:6800/2123740462,v1:10.131.0.39:6801/2123740462] boot
      2024-08-07T15:33:07.002817+0000 osd.0 [WRN] OSD bench result of 29015.942307 IOPS exceeded the threshold limit of 500.000000 IOPS for osd.0. IOPS capacity is unchanged at 315.000000 IOPS. The recommendation is to establish the osd's IOPS capacity using other benchmark tools (e.g. Fio) and then override osd_mclock_max_capacity_iops_[hdd|ssd].
      2024-08-07T15:33:10.375207+0000 mon.a [WRN] Health check update: Degraded data redundancy: 2144800/6434400 objects degraded (33.333%), 4 pgs degraded (PG_DEGRADED)
      2024-08-07T15:33:11.878638+0000 mon.a [INF] Health check cleared: PG_DEGRADED (was: Degraded data redundancy: 2144800/6434400 objects degraded (33.333%), 4 pgs degraded)
      2024-08-07T15:33:16.388746+0000 mon.a [WRN] Health check failed: Degraded data redundancy: 2147287/6445536 objects degraded (33.314%), 2 pgs degraded (PG_DEGRADED)
      2024-08-07T15:33:22.559904+0000 mon.a [WRN] Health check update: Degraded data redundancy: 2146906/6444624 objects degraded (33.313%), 2 pgs degraded (PG_DEGRADED)
      2024-08-07T15:33:30.388359+0000 mon.a [WRN] Health check update: Degraded data redundancy: 2155367/6470181 objects degraded (33.312%), 2 pgs degraded (PG_DEGRADED)
      2024-08-07T15:33:35.391672+0000 mon.a [WRN] Health check update: Degraded data redundancy: 2168686/6510342 objects degraded (33.311%), 1 pg degraded (PG_DEGRADED)
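
      As a hedged follow-up sketch (run from the same toolbox pod), these commands can show why a specific PG stays degraded; the PG id 1.0 is only an example taken from the health detail above:

      # List any PGs that are not active+clean
      ceph pg dump pgs_brief | grep -v 'active+clean'

      # Query one of the degraded PGs for its acting set and recovery state
      ceph pg 1.0 query

      # Check OSD placement and utilization after the drained node rejoins
      ceph osd df tree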

      Version of all relevant components (if applicable):

      ceph version 18.2.1-229.el9cp (ef652b206f2487adfc86613646a4cac946f6b4e0) reef (stable)
      ocp: 4.17.0-0.nightly-2024-08-06-235322
      odf: 4.17.0-65.stable

      Does this issue impact your ability to continue to work with the product
      (please explain in detail what the user impact is)?
      Yes. It impacts automation.

      Is there any workaround available to the best of your knowledge?

      Rate from 1 - 5 the complexity of the scenario you performed that caused this
      bug (1 - very simple, 5 - very complex)?
      1

      Is this issue reproducible?
      Yes

      Can this issue be reproduced from the UI?

      If this is a regression, please provide more details to justify this:

      Steps to Reproduce:
      1. Run IO on PVCs created using the Ceph filesystem.
      2. Once the IO has driven up memory usage on the active MDS, drain the node where that MDS is running (a command sketch follows this list).
      3. Ceph health reports a PG degraded warning and stays in that state indefinitely, even though the MDS comes back up and all pods are running fine.
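
      A minimal sketch of steps 2-3, assuming the MDS pods carry the default Rook label app=rook-ceph-mds and <node> is the worker hosting the active MDS; drain flags may need adjusting for your cluster:

      # Find the node running the active MDS pod
      oc -n openshift-storage get pods -l app=rook-ceph-mds -o wide

      # Drain the node, then bring it back
      oc adm drain <node> --ignore-daemonsets --delete-emptydir-data --force
      oc adm uncordon <node>

      # From the toolbox pod, watch the health state; the PG_DEGRADED warning persists here
      ceph health detail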

      Actual results:
      PG_DEGRADED warnings appear in Ceph health after the node drain and never clear.

      Expected results:
      Ceph health should return to OK once all pods are back up and running after the node drain.

      Additional info:

              Assignee: Venky Shankar (vshankar@redhat.com)
              Reporter: Nagendra Reddy (rhn-support-nagreddy)
              Participants: Nagendra Reddy, Venky Shankar
              QA Contact: Elad Ben Aharon