RHEL-132689

Apparent regression introduced by mesa 25.0.7 on RHEL 9.7 breaks GNOME on Xorg on Intel TigerLake GPU


    • Bug
    • Resolution: Unresolved
    • rhel-9.7
    • mesa
    • Yes
    • Low
    • rhel-gpuaccelerators-gpu

      What were you trying to do that didn't work?

      On some but not all machines updated from RHEL 9.6 to RHEL 9.7, logging into GNOME appears to freeze after displaying the top bar and before opening up the activities overview. The mouse pointer still moves, but the buttons displayed do nothing and the clock does not update.

      The issue only affects Xorg, not Wayland.

      Downgrading the Mesa packages back to the RHEL 9.6 ones (version 24.2.8-3) makes the problem go away.

      Setting DISPLAY and running glxinfo reveals that the hung session is using llvmpipe:

      OpenGL renderer string: llvmpipe (LLVM 20.1.8, 256 bits)

      Whereas after downgrading to older mesa it gives:

      OpenGL renderer string: Mesa Intel(R) Xe Graphics (TGL GT2)
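The renderer check above can be scripted when several machines need triage. The following is a minimal Python sketch, not part of the original report: the function names and the list of software renderer names are assumptions chosen for illustration, and the input is the text that glxinfo prints.

```python
import re

# Renderer name prefixes that indicate a Mesa software rasterizer.
# This list is an assumption for illustration; extend it as needed.
SOFTWARE_RENDERERS = ("llvmpipe", "softpipe")


def renderer_string(glxinfo_output):
    """Return the value of the 'OpenGL renderer string' line, or None if absent."""
    match = re.search(
        r"^\s*OpenGL renderer string:\s*(.+?)\s*$",
        glxinfo_output,
        re.MULTILINE,
    )
    return match.group(1) if match else None


def uses_software_rendering(glxinfo_output):
    """Return True if the reported renderer looks like a software rasterizer."""
    renderer = renderer_string(glxinfo_output)
    if renderer is None:
        raise ValueError("no 'OpenGL renderer string' line found")
    # str.startswith accepts a tuple of candidate prefixes.
    return renderer.lower().startswith(SOFTWARE_RENDERERS)
```

Fed the two renderer lines quoted above, `uses_software_rendering` returns True for the llvmpipe line and False for the Intel Xe Graphics line.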

      What is the impact of this issue to you?

      The customer would like to upgrade to RHEL 9.7 as soon as possible, but this is causing a delay. Some laptops may be physically far away, and if the updates make them inoperable it may not be possible to access them remotely to provide relief.

      Please provide the package NVR for which the bug is seen:

      • mesa-dri-drivers-25.0.7-3.el9_7.x86_64
      • mesa-filesystem-25.0.7-3.el9_7.x86_64
      • mesa-libEGL-25.0.7-3.el9_7.x86_64
      • mesa-libGL-25.0.7-3.el9_7.x86_64
      • mesa-libgbm-25.0.7-3.el9_7.x86_64
      • mesa-libxatracker-25.0.7-3.el9_7.x86_64
      • mesa-vulkan-drivers-25.0.7-3.el9_7.x86_64
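To check whether a host carries the builds listed above, the package identifiers can be split into name, version, release, and architecture. This is a rough Python sketch under stated assumptions: `PackageId`, `parse_nvra`, and `is_reported_build` are hypothetical helper names, and the input strings are assumed to have the `name-version-release.arch` shape shown in the list.

```python
from typing import NamedTuple


class PackageId(NamedTuple):
    name: str
    version: str
    release: str
    arch: str


def parse_nvra(nvra):
    """Split 'name-version-release.arch' into its four parts.

    The package name may itself contain dashes, so the string is split
    from the right. Malformed input raises ValueError.
    """
    nvr, _, arch = nvra.rpartition(".")
    name, version, release = nvr.rsplit("-", 2)
    return PackageId(name, version, release, arch)


def is_reported_build(nvra):
    """Return True for a Mesa package at the version-release listed in this report."""
    pkg = parse_nvra(nvra)
    return pkg.name.startswith("mesa-") and (pkg.version, pkg.release) == ("25.0.7", "3.el9_7")
```

For example, `parse_nvra("mesa-dri-drivers-25.0.7-3.el9_7.x86_64")` yields the name `mesa-dri-drivers`, version `25.0.7`, release `3.el9_7`, and arch `x86_64`.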

      How reproducible is this bug?:

      Always

      Steps to reproduce

      1. Install RHEL 9.7 on a machine with TigerLake-LP GT2
      2. Log in to a GNOME on Xorg session

      Expected results

      No freeze

      Actual results

      gnome-shell crashes

      Additional information

      This is the back-trace from the gnome-shell core dump:

       

      (gdb) bt
      #0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
      #1  0x00007f9e6d68d093 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
      #2  0x00007f9e6d63fb86 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
      #3  0x0000564deef1e242 in dump_gjs_stack_on_signal_handler (signo=6) at ../src/main.c:357
      #4  <signal handler called>
      #5  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
      #6  0x00007f9e6d68d093 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
      #7  0x00007f9e6d63fb86 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
      #8  0x00007f9e6d629873 in __GI_abort () at abort.c:79
      #9  0x00007f9e6d62979b in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x7f9e6d574e98 "!xcb_xlib_threads_sequence_lost", file=file@entry=0x7f9e6d574ae0 "xcb_io.c", line=line@entry=269, 
          function=function@entry=0x7f9e6d575478 <__PRETTY_FUNCTION__.7> "poll_for_event") at assert.c:123
      #10 0x00007f9e6d6388c6 in __assert_fail (assertion=assertion@entry=0x7f9e6d574e98 "!xcb_xlib_threads_sequence_lost", file=file@entry=0x7f9e6d574ae0 "xcb_io.c", line=line@entry=269, 
          function=function@entry=0x7f9e6d575478 <__PRETTY_FUNCTION__.7> "poll_for_event") at assert.c:132
      #11 0x00007f9e6d500b77 in poll_for_event (dpy=dpy@entry=0x564def784000, queued_only=0) at /usr/src/debug/libX11-1.7.0-11.el9.x86_64/src/xcb_io.c:269
      #12 0x00007f9e6d500c18 in poll_for_response (dpy=dpy@entry=0x564def784000) at /usr/src/debug/libX11-1.7.0-11.el9.x86_64/src/xcb_io.c:301
      #13 0x00007f9e6d503d02 in _XEventsQueued (mode=<optimized out>, dpy=0x564def784000) at /usr/src/debug/libX11-1.7.0-11.el9.x86_64/src/xcb_io.c:432
      #14 _XEventsQueued (dpy=0x564def784000, mode=mode@entry=1) at /usr/src/debug/libX11-1.7.0-11.el9.x86_64/src/xcb_io.c:414
      #15 0x00007f9e6d503d82 in _XFlush (dpy=<optimized out>) at /usr/src/debug/libX11-1.7.0-11.el9.x86_64/src/xcb_io.c:602
      #16 0x00007f9e6d503fed in _XGetRequest (dpy=0x564def784000, type=<optimized out>, len=4) at /usr/src/debug/libX11-1.7.0-11.el9.x86_64/src/XlibInt.c:1787
      #17 0x00007f9e6d4f86bd in XSync (dpy=0x564def784000, discard=discard@entry=0) at /usr/src/debug/libX11-1.7.0-11.el9.x86_64/src/Sync.c:43
      #18 0x00007f9e6dc5bc1a in cogl_onscreen_glx_bind (onscreen=<optimized out>) at ../cogl/cogl/winsys/cogl-onscreen-glx.c:335
      #19 0x00007f9e6dc25dd5 in cogl_onscreen_bind (onscreen=<optimized out>) at ../cogl/cogl/cogl-onscreen.c:300
      #20 cogl_gl_framebuffer_back_bind (gl_framebuffer=<optimized out>, target=36160) at ../cogl/cogl/driver/gl/cogl-gl-framebuffer-back.c:189
      #21 0x00007f9e6dc2706a in cogl_gl_framebuffer_bind (target=36160, gl_framebuffer=0x564defb3cd50) at ../cogl/cogl/driver/gl/cogl-framebuffer-gl.c:252
      #22 _cogl_driver_gl_flush_framebuffer_state (ctx=0x564defb1bbe0, draw_buffer=<optimized out>, read_buffer=<optimized out>, 
          state=(COGL_FRAMEBUFFER_STATE_BIND | COGL_FRAMEBUFFER_STATE_VIEWPORT | COGL_FRAMEBUFFER_STATE_CLIP | COGL_FRAMEBUFFER_STATE_DITHER | COGL_FRAMEBUFFER_STATE_MODELVIEW | COGL_FRAMEBUFFER_STATE_PROJECTION | COGL_FRAMEBUFFER_STATE_FRONT_FACE_WINDING | COGL_FRAMEBUFFER_STATE_DEPTH_WRITE | COGL_FRAMEBUFFER_STATE_STEREO_MODE)) at ../cogl/cogl/driver/gl/cogl-util-gl.c:251
      #23 0x00007f9e6dc5d19d in cogl_context_flush_framebuffer_state (
          state=(COGL_FRAMEBUFFER_STATE_BIND | COGL_FRAMEBUFFER_STATE_VIEWPORT | COGL_FRAMEBUFFER_STATE_CLIP | COGL_FRAMEBUFFER_STATE_DITHER | COGL_FRAMEBUFFER_STATE_MODELVIEW | COGL_FRAMEBUFFER_STATE_PROJECTION | COGL_FRAMEBUFFER_STATE_FRONT_FACE_WINDING | COGL_FRAMEBUFFER_STATE_DEPTH_WRITE | COGL_FRAMEBUFFER_STATE_STEREO_MODE), read_buffer=0x564defb4c1a0, draw_buffer=0x564defb4c1a0, ctx=0x564defb1bbe0) at ../cogl/cogl/cogl-framebuffer.c:1127
      #24 cogl_framebuffer_clear4f (framebuffer=0x564defb4c1a0, buffers=2, red=1, green=1, green@entry=3.0958887e-41, blue=1, blue@entry=-4.15291648e+30, alpha=1, alpha@entry=3.0958887e-41) at ../cogl/cogl/cogl-framebuffer.c:605
      #25 0x00007f9e6dc5edee in cogl_framebuffer_clear (framebuffer=<optimized out>, buffers=<optimized out>, color=color@entry=0x564df251ab84) at ../cogl/cogl/cogl-framebuffer.c:663
      #26 0x00007f9e6de47464 in clutter_root_node_pre_draw (node=0x564df251ab30, paint_context=<optimized out>) at ../clutter/clutter/clutter-paint-nodes.c:113
      #27 0x00007f9e6de4afe9 in clutter_paint_node_paint (node=0x564df251ab30, paint_context=0x564df1045df0) at ../clutter/clutter/clutter-paint-node.c:1076
      #28 0x00007f9e6de4b013 in clutter_paint_node_paint (node=0x564df1ea9800, paint_context=0x564df1045df0) at ../clutter/clutter/clutter-paint-node.c:1087
      #29 0x00007f9e6de0d26c in clutter_actor_paint_node (paint_context=0x564df1045df0, root=0x564df1ea9800, actor=0x564defb49980) at ../clutter/clutter/clutter-actor.c:3606
      #30 clutter_actor_continue_paint (self=0x564defb49980, paint_context=0x564df1045df0) at ../clutter/clutter/clutter-actor.c:3872
      #31 0x00007f9e6de4affa in clutter_paint_node_paint (node=0x564df1667760, paint_context=0x564df1045df0) at ../clutter/clutter/clutter-paint-node.c:1080
      #32 0x00007f9e6de4b013 in clutter_paint_node_paint (node=0x564df18f8430, paint_context=0x564df1045df0) at ../clutter/clutter/clutter-paint-node.c:1087
      #33 0x00007f9e6de0c713 in clutter_actor_paint (self=0x564defb49980, paint_context=0x564df1045df0) at ../clutter/clutter/clutter-actor.c:3816
      #34 0x00007f9e6de63905 in clutter_stage_do_paint_view (stage=0x564defb49980, view=0x564defb4e0f0, redraw_clip=0x564df2521170) at ../clutter/clutter/clutter-stage.c:490
      #35 0x00007f9e6da8837e in meta_stage_paint_view (stage=0x564defb49980, view=0x564defb4e0f0, redraw_clip=0x564df2521170) at ../src/backends/meta-stage.c:259
      #36 0x00007f9e6de8b95f in clutter_stage_paint_view (redraw_clip=0x564df2521170, view=0x564defb4e0f0, stage=0x564defb49980) at ../clutter/clutter/clutter-stage.c:513
      #37 paint_stage.isra.0 (view=view@entry=0x564defb4e0f0, redraw_clip=redraw_clip@entry=0x564df2521170, stage_cogl=<optimized out>, stage_cogl=<optimized out>) at ../clutter/clutter/cogl/clutter-stage-cogl.c:414
      #38 0x00007f9e6de84166 in clutter_stage_cogl_redraw_view_primary (frame=0x7fffba429710, view=0x564defb4e0f0, stage_cogl=0x564defb43990) at ../clutter/clutter/cogl/clutter-stage-cogl.c:620
      #39 clutter_stage_cogl_redraw_view (stage_window=<optimized out>, view=0x564defb4e0f0, frame=0x7fffba429710) at ../clutter/clutter/cogl/clutter-stage-cogl.c:741
      #40 0x00007f9e6da9b002 in meta_stage_x11_redraw_view (stage_window=<optimized out>, view=<optimized out>, frame=0x7fffba429710) at ../src/backends/x11/meta-stage-x11.c:498
      #41 0x00007f9e6de68526 in _clutter_stage_window_redraw_view (frame=0x7fffba429710, view=0x564defb4e0f0, window=0x564defb43990) at ../clutter/clutter/clutter-stage-window.c:113
      #42 handle_frame_clock_frame (frame_clock=<optimized out>, frame_count=<optimized out>, time_us=<optimized out>, user_data=0x564defb4e0f0) at ../clutter/clutter/clutter-stage-view.c:1188
      #43 0x00007f9e6de32eae in clutter_frame_clock_dispatch (time_us=3348046236, frame_clock=0x564defb42820) at ../clutter/clutter/clutter-frame-clock.c:530
      #44 frame_clock_source_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at ../clutter/clutter/clutter-frame-clock.c:570
      #45 0x00007f9e6e943f4f in g_main_dispatch (context=0x564def776180) at ../glib/gmain.c:3364
      #46 g_main_context_dispatch (context=0x564def776180) at ../glib/gmain.c:4079
      #47 0x00007f9e6e999268 in g_main_context_iterate.constprop.0 (context=0x564def776180, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4155
      #48 0x00007f9e6e9435a3 in g_main_loop_run (loop=0x564defb70400) at ../glib/gmain.c:4353
      #49 0x00007f9e6dac7400 in meta_run_main_loop () at ../src/core/main.c:937
      #50 0x00007f9e6dacf912 in meta_run () at ../src/core/main.c:952
      #51 0x0000564deef1dc90 in main (argc=<optimized out>, argv=<optimized out>) at ../src/main.c:512

       

      So it is dying due to an assertion failure (!xcb_xlib_threads_sequence_lost).

      The support engineer working on the customer case discussed the problem with the desktop developers, who consider it a symptom of a threading issue in Mesa: the assertion fires because threads should have been enabled earlier. Upgrading the libX11 packages was expected to solve the problem, since libX11 version 1.8 and later enables threads at load time (instead of requiring applications to enable them explicitly).
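For context, "explicitly enabling threads" with older libX11 means calling XInitThreads() before any other Xlib function. The sketch below illustrates only that call order, using Python's ctypes to reach the C library. It is not code from gnome-shell or Mesa, and the helper name `enable_xlib_threads` is an assumption for illustration.

```python
import ctypes
import ctypes.util


def enable_xlib_threads():
    """Call XInitThreads() and report whether it succeeded.

    Returns None when libX11 cannot be located or loaded on this system.
    XInitThreads() must run before any other Xlib call in the process.
    """
    path = ctypes.util.find_library("X11")
    if path is None:
        return None
    try:
        libx11 = ctypes.CDLL(path)
    except OSError:
        return None
    libx11.XInitThreads.argtypes = []
    libx11.XInitThreads.restype = ctypes.c_int  # Status: nonzero on success
    return libx11.XInitThreads() != 0
```

A program that uses a display connection from several threads would call `enable_xlib_threads()` once at startup, before opening the display. With libX11 1.8 and later the explicit call is no longer required, as noted above.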

      The engineer asked the customer to test with libX11-1.8.12-1.el9. With that update, gnome-shell no longer crashes but freezes instead. A core dump of gnome-shell obtained via gcore shows this back-trace:

       

      (gdb) thr 10
      [Switching to thread 10 (Thread 0x7f7933fff640 (LWP 113961))]
      #0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, 
          futex_word=0x55c34e9af228) at futex-internal.c:57
      57        return INTERNAL_SYSCALL_CANCEL (futex_time64, futex_word, op, expected,
      (gdb) bt
      #0  __futex_abstimed_wait_common64
          (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55c34e9af228)
          at futex-internal.c:57
      #1  __futex_abstimed_wait_common
          (futex_word=futex_word@entry=0x55c34e9af228, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0, cancel=cancel@entry=true) at futex-internal.c:87
      #2  0x00007f797d8883df in __GI___futex_abstimed_wait_cancelable64
          (futex_word=futex_word@entry=0x55c34e9af228, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0) at futex-internal.c:139
      #3  0x00007f797d88a8d2 in __pthread_cond_wait_common
          (abstime=0x0, clockid=0, mutex=0x55c34e9af1d0, cond=0x55c34e9af200) at pthread_cond_wait.c:427
      #4  ___pthread_cond_wait (cond=0x55c34e9af200, mutex=0x55c34e9af1d0) at pthread_cond_wait.c:459
      #5  0x00007f796070b22b in cnd_wait (mtx=0x55c34e9af1d0, cond=0x55c34e9af200)
          at ../src/c11/impl/threads_posix.c:111
      #6  util_queue_thread_func (input=input@entry=0x55c34eaf6040) at ../src/util/u_queue.c:275
      #7  0x00007f79606fb9bb in impl_thrd_routine (p=<optimized out>) at ../src/c11/impl/threads_posix.c:43
      #8  0x00007f797d88b2ea in start_thread (arg=<optimized out>) at pthread_create.c:443
      #9  0x00007f797d9103c0 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
      

      Comment from another support engineer who analyzed the core dump:

      It looks like there's some C11 POSIX code in /usr/src/debug/mesa-25.0.7-3.el9_7.x86_64/src/c11/impl/threads_posix.c. I wonder if this code was included in the last update to Mesa, because it appears from the rest of the backtrace that gnome-shell is in a self-deadlock.

       

              rhn-engineering-airlied David Airlie
              rhn-support-casantos Carlos Santos
              Ione Rabbit