Bug
Resolution: Won't Do
Normal
rhel-8.2.0
Moderate
rhel-gpuaccelerators-gpu
All
57,005
The user has seen mwm lock up at 100% CPU usage since RHEL 6.
We found it was stuck in an infinite loop in WmWinList.c:AddEntryToList().
The list is not supposed to be circular; each entry has a prevSibling and a nextSibling pointer.
The customer has been running the attached patch for a long time. The
problem has now happened again, and this time the patch caught it:
(gdb) bt
#0 0x00007fe08531370f in raise () from /lib64/libc.so.6
#1 0x00007fe0852fdb25 in abort () from /lib64/libc.so.6
#2 0x00005638d1871945 in WrapSetNextSibling (pEntry=<optimized out>,
nextSibling=<optimized out>) at WmWinList.c:81
#3 0x00005638d1897ec1 in WrapSetNextSibling (nextSibling=<optimized out>,
pEntry=0x5638d384b828) at WmWinList.c:76
#4 AddEntryToList (pWS=<optimized out>, pEntry=0x5638d384b828,
onTop=<optimized out>, pStackEntry=0x0) at WmWinList.c:317
#5 0x00005638d187dbac in Do_Raise (pCD=pCD@entry=0x5638d384b790,
pStackEntry=pStackEntry@entry=0x0, flags=flags@entry=0)
at WmFunction.c:4205
#6 0x00005638d18861fb in SetKeyboardFocus (pCD=pCD@entry=0x5638d384b790,
focusFlags=focusFlags@entry=2) at WmKeyFocus.c:304
#7 0x00005638d1876e65 in HandleCFocusIn (pCD=0x5638d384b790,
focusChangeEvent=focusChangeEvent@entry=0x7ffdf220d820) at WmCEvent.c:2112
#8 0x00005638d1877a5b in WmDispatchClientEvent (event=0x7ffdf220d820)
at WmCEvent.c:374
#9 0x00005638d1871a50 in main (argc=<optimized out>, argv=<optimized out>,
environ=0x7ffdf220da10) at WmMain.c:221
WrapSetNextSibling was added by the patch to detect when a circular list
is created. It calls abort() if that happens.
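For readers without the attachment, a guard of this kind could look roughly like the sketch below. This is not the attached patch: the Entry type, the WouldCreateCycle and GuardedSetNextSibling names, and the MAX_WALK bound are all made up for illustration, and only the two link fields of the real ClientListEntry are modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for mwm's ClientListEntry (assumption: only the
 * two link fields matter for this illustration). */
typedef struct Entry {
    struct Entry *prevSibling;
    struct Entry *nextSibling;
} Entry;

/* Hypothetical bound so the walk terminates even if the chain that
 * starts at nextSibling is already corrupt. */
#define MAX_WALK 100000

/* Returns 1 if pEntry is reachable from nextSibling by following
 * nextSibling links (so linking pEntry to it would close a loop),
 * or if the walk exceeds MAX_WALK. Returns 0 otherwise. */
static int WouldCreateCycle(const Entry *pEntry, const Entry *nextSibling)
{
    size_t steps = 0;
    const Entry *p;

    for (p = nextSibling; p != NULL; p = p->nextSibling) {
        if (p == pEntry || ++steps > MAX_WALK)
            return 1;
    }
    return 0;
}

/* Sets pEntry->nextSibling, aborting first if that would make the
 * list circular, so a core dump is produced at the point of corruption. */
static void GuardedSetNextSibling(Entry *pEntry, Entry *nextSibling)
{
    assert(pEntry != NULL);
    if (WouldCreateCycle(pEntry, nextSibling))
        abort();
    pEntry->nextSibling = nextSibling;
}
```

Aborting at the moment the loop would be closed gives a backtrace that points at the faulty insertion, instead of a process spinning at 100% CPU some time later.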
Checking frame 5 of the backtrace, we see:
(gdb) f 5
#5 0x00005638d187dbac in Do_Raise (pCD=pCD@entry=0x5638d384b790,
pStackEntry=pStackEntry@entry=0x0, flags=flags@entry=0)
at WmFunction.c:4205
4205 MoveEntryInList (pWS, &pcdLeader->clientEntry,
(gdb) list
4200 {
4201 if (ACTIVE_PSD->clientList != &pcdLeader->clientEntry)
4202
4208 }
4209 }
and checking MoveEntryInList we see:
(gdb) list MoveEntryInList
403 * pWS = (clientList, lastClient)
404 *
405 ************************************<->**********************************/
406
407 void MoveEntryInList (WmWorkspaceData *pWS, ClientListEntry *pEntry, Boolean onTop, ClientListEntry *pStackEntry)
408
/* END OF FUNCTION MoveEntryInList */
So the issue is probably in DeleteEntryFromList, which somehow did not
delete the entry, or a related problem:
(gdb) list DeleteEntryFromList
435 * pWS = (clientList, lastClient)
436 *
437 ************************************<->**********************************/
438
439 void DeleteEntryFromList (WmWorkspaceData *pWS, ClientListEntry *pListEntry)
440 {
441
442 if (pListEntry->prevSibling)
443
447 else
448
451
452 if (pListEntry->nextSibling)
453
456 else
457
460
461 } /* END OF FUNCTION DeleteEntryFromList */
From my understanding, the actual problem is that the nextSibling and
prevSibling fields of pListEntry are not set to NULL when the entry is removed. Then the
AddEntryToList call checks whether pEntry->prevSibling is not NULL and, if that is the case,
updates the prevSibling field.
A general fix could be to add the following in DeleteEntryFromList:
pListEntry->prevSibling = pListEntry->nextSibling = NULL;
at the end of the function. This way, when the entry is added back, it does not
create a circular list, and could prevent similar bugs.
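As a minimal sketch of that general fix: the listing of DeleteEntryFromList above has the branch bodies elided, so the unlink logic below is the usual doubly-linked-list removal and is an assumption, not a copy of the mwm source. The Entry and List types and the DeleteEntry name are simplified stand-ins; clientList and lastClient follow the names in the function header comment. It assumes the entry really is on the list when it is called.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for ClientListEntry and the (clientList,
 * lastClient) pair; both types are assumptions for illustration. */
typedef struct Entry {
    struct Entry *prevSibling;
    struct Entry *nextSibling;
} Entry;

typedef struct List {
    Entry *clientList;  /* head of the list */
    Entry *lastClient;  /* tail of the list */
} List;

/* Unlinks pListEntry from pList, then clears its own link fields so
 * that no stale pointers survive until the entry is added back. */
static void DeleteEntry(List *pList, Entry *pListEntry)
{
    assert(pList != NULL && pListEntry != NULL);

    if (pListEntry->prevSibling)
        pListEntry->prevSibling->nextSibling = pListEntry->nextSibling;
    else
        pList->clientList = pListEntry->nextSibling;

    if (pListEntry->nextSibling)
        pListEntry->nextSibling->prevSibling = pListEntry->prevSibling;
    else
        pList->lastClient = pListEntry->prevSibling;

    /* The proposed addition: reset the removed entry's links. */
    pListEntry->prevSibling = pListEntry->nextSibling = NULL;
}
```

With the reset in place, any later code that tests pEntry->prevSibling or pEntry->nextSibling on a removed entry sees NULL rather than a pointer into the list it was just taken out of.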
A specialized patch, just for the condition in the backtrace above, should be
in AddEntryToList():
314 if (pSD->clientList != pEntry)
315 {
316 /*pEntry->nextSibling = pSD->clientList;*/
317 WrapSetNextSibling(pEntry, pSD->clientList);
318 pEntry->prevSibling = NULL;
319 if (pSD->clientList)
changed to first set prevSibling to NULL, and only then set nextSibling.
The problem is an out-of-order list insertion/deletion, which creates a
circular list.
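For debugging builds, a whole-list check can complement the per-assignment guard. The sketch below uses Floyd's two-pointer cycle detection, a generic technique that is not part of mwm or of the attached patch. The Entry type and the HasCycle and AssertListIsLinear names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for ClientListEntry (assumption: only the link
 * fields are needed here). */
typedef struct Entry {
    struct Entry *prevSibling;
    struct Entry *nextSibling;
} Entry;

/* Floyd's cycle detection over nextSibling links: the fast pointer
 * advances two entries per step and the slow pointer one, so they can
 * only meet if the chain loops back on itself.
 * Returns 1 if a cycle is reachable from head, 0 otherwise. */
static int HasCycle(const Entry *head)
{
    const Entry *slow = head;
    const Entry *fast = head;

    while (fast != NULL && fast->nextSibling != NULL) {
        slow = slow->nextSibling;
        fast = fast->nextSibling->nextSibling;
        if (slow == fast)
            return 1;
    }
    return 0;
}

/* Debug helper: stop immediately if the list has become circular. */
static void AssertListIsLinear(const Entry *head)
{
    assert(!HasCycle(head));
}
```

Calling a helper like AssertListIsLinear on the list head after each insertion or deletion would catch the corruption close to where it is introduced, at the cost of one extra walk over the list per call.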