RHEL-12034

xfs_db sometimes segfaults when run against mounted filesystems

    • Bug
    • Resolution: Won't Do
    • Undefined
    • rhel-8.6.0.z
    • xfsprogs
    • Low
    • rhel-sst-filesystems
    • ssg_filesystems_storage_and_HA
    • False
    • Red Hat Enterprise Linux

      What were you trying to do that didn't work?

      When run in read-only mode against a mounted filesystem, some xfs_db commands can segfault.  The segfault is believed to be caused by the free space btrees changing on disk while xfs_db is reading them.

      This is seen in customer environments because the insights-client runs several xfs_db commands against the devices of mounted filesystems:

          xfs_db -r -c frag /dev/device

          xfs_db -r -c freesp /dev/device

       

      Please provide the package NVR for which bug is seen:

      xfsprogs-5.0.0-10.el8

      seen in RHEL 7 as well

      How reproducible:

      unknown, but seen in at least 3 customer cases thus far

      Steps to reproduce

      unknown

      Expected results

      no segfaults

      Actual results

      #0  __fswab16 (x=<optimized out>) at ../include/xfs_arch.h:145
      #1  process_inode (agf=0x559582375800, dip=0x600, agino=3092803) at frag.c:308
      #2  scanfunc_ino (block=0x55958237ce00, level=level@entry=0, agf=agf@entry=0x559582375800) at frag.c:513
      #3  0x00005595801e7d45 in scan_sbtree (agf=agf@entry=0x559582375800, root=3, nlevels=nlevels@entry=1, btype=TYP_INOBT, func=0x5595801e77d0 <scanfunc_ino at frag.c:461>) at frag.c:416
      #4  0x00005595801e786c in scanfunc_ino (block=0x55958237ae00, level=level@entry=1, agf=agf@entry=0x559582375800) at ../include/xfs_arch.h:158
      #5  0x00005595801e7d45 in scan_sbtree (agf=agf@entry=0x559582375800, root=1447206, nlevels=2, btype=TYP_INOBT, func=0x5595801e77d0 <scanfunc_ino at frag.c:461>) at frag.c:416
      #6  0x00005595801e7fcd in scan_ag (agno=0) at ../include/xfs_arch.h:158
      #7  frag_f (argc=<optimized out>, argv=<optimized out>) at frag.c:155
      #8  frag_f (argc=<optimized out>, argv=<optimized out>) at frag.c:145
      #9  0x00005595801d24ee in main (argc=<optimized out>, argv=<optimized out>) at init.c:195

      #1  process_inode (agf=0x559582375800, dip=0x600, agino=3092803) at frag.c:308
      308        switch (be16_to_cpu(dip->di_mode) & S_IFMT) {
      (gdb) p dip->di_mode
      Cannot access memory at address 0x602

      So 'dip' holds an invalid address (0x600), and the segfault comes from dereferencing it: reading di_mode touches 0x602, which is not mapped.

      (gdb) frame 2
      #2  scanfunc_ino (block=0x55958237ce00, level=level@entry=0, agf=agf@entry=0x559582375800) at frag.c:513
      (gdb) list
      508                    for (j = 0; j < inodes_per_buf; j++) {
      509                        if (XFS_INOBT_IS_FREE_DISK(&rp[i], ioff + j))
      510                            continue;
      511                        dip = (xfs_dinode_t *)((char *)iocur_top->data +
      512                            ((off + j) << mp->m_sb.sb_inodelog));
      513                        process_inode(agf, agino + ioff + j, dip);
      514                    }
      (gdb) p mp->m_sb.sb_inodelog
      $15 = 9 '\t'
      (gdb) p iocur_top->data
      $16 = (void *) 0x0
      (gdb) p off
      $17 = <optimized out>
      (gdb) p j
      $18 = 3

      With iocur_top->data at 0 and j at 3, 'off' would have to be 0 for dip to come out as 0x600:

      (gdb) p (xfs_dinode_t *)((char *)iocur_top->data + ((0 + j) << mp->m_sb.sb_inodelog))
      $21 = (xfs_dinode_t *) 0x600

      The open question is how iocur_top->data came to be NULL. In the code, there is a check just before the loop above that is meant to ensure iocur_top->data is NOT NULL:

      502                    if (iocur_top->data == NULL) {    <<<<<<<<<
      503                        dbprintf(_("can't read inode block %u/%u\n"),
      504                             seqno, agbno);
      505                        goto next_buf;
      506                    }
      507    
      508                    for (j = 0; j < inodes_per_buf; j++) {
      509                        if (XFS_INOBT_IS_FREE_DISK(&rp[i], ioff + j))
      510                            continue;
      511                        dip = (xfs_dinode_t *)((char *)iocur_top->data +
      512                            ((off + j) << mp->m_sb.sb_inodelog));
      513                        process_inode(agf, agino + ioff + j, dip);
      514                    }

       

      I'll attach a coredump with a prebuilt root environment/debuginfo tree.

              Eric Sandeen (esandeen@redhat.com)
              Frank Sorenson (rhn-support-fsorenso)
              Eric Sandeen
              Murphy Zhou