Bug
Resolution: Done
Critical
1.8
While using the Agroal connection pool, we discovered some rare deadlocks that cause 100% CPU on some threads. These deadlocks occur in the StampedCopyOnWriteArrayList class when more than one thread tries to remove the same object.
A simple reproducer in JUnit (fails nearly every time on my machine):
@Test
public void testThis() {
    ExecutorService service = Executors.newFixedThreadPool(10);
    StampedCopyOnWriteArrayList<Object> list = new StampedCopyOnWriteArrayList<>(Object.class);
    Object o = new Object();
    list.add(new Object());
    list.add(new Object());
    list.add(new Object());
    list.add(new Object());
    list.add(o);
    list.add(new Object());

    // Ten tasks that all try to remove the same element.
    List<Runnable> runnerList = new ArrayList<>(10);
    List<Future<?>> futureList = new ArrayList<>(10);
    for (int i = 0; i < 10; i++) {
        runnerList.add(new Runnable() {
            @Override
            public void run() {
                list.remove(o);
                System.out.println("Removed success!");
            }
        });
    }
    for (Runnable r : runnerList) {
        futureList.add(service.submit(r));
    }

    // A task that does not finish within 10 seconds is assumed to be stuck.
    for (Future<?> r : futureList) {
        try {
            r.get(10000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (ExecutionException e) {
            e.printStackTrace();
        } catch (TimeoutException e) {
            System.out.println("Seems like we have a deadlock!");
        }
    }
}
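For comparison, here is a minimal sketch of a copy-on-write list guarded by a java.util.concurrent.locks.StampedLock, where remove() cannot spin when several threads go after the same element. This is not Agroal's code: the class name StampedListSketch and its methods are hypothetical, and I am only assuming that the spin is related to upgrading a read lock while other readers are present. It follows the upgrade pattern from the StampedLock Javadoc: if tryConvertToWriteLock fails, release the read lock, block on writeLock(), and search again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.StampedLock;

// Hypothetical sketch, not the Agroal implementation.
public class StampedListSketch<T> {
    private final StampedLock lock = new StampedLock();
    private Object[] data = new Object[0];

    public void add(T element) {
        long stamp = lock.writeLock();
        try {
            Object[] copy = Arrays.copyOf(data, data.length + 1);
            copy[data.length] = element;
            data = copy;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    public int size() {
        long stamp = lock.readLock();
        try {
            return data.length;
        } finally {
            lock.unlockRead(stamp);
        }
    }

    public boolean remove(Object element) {
        long stamp = lock.readLock();
        try {
            while (true) {
                int index = indexOf(data, element);
                if (index < 0) {
                    return false; // another thread already removed it
                }
                long ws = lock.tryConvertToWriteLock(stamp);
                if (ws != 0L) {
                    // We hold the write lock, so the index is still valid.
                    stamp = ws;
                    Object[] copy = new Object[data.length - 1];
                    System.arraycopy(data, 0, copy, 0, index);
                    System.arraycopy(data, index + 1, copy, index, data.length - index - 1);
                    data = copy;
                    return true;
                }
                // Upgrade failed because other readers are present: give up the
                // read lock, block for the write lock, then search again.
                lock.unlockRead(stamp);
                stamp = lock.writeLock();
            }
        } finally {
            lock.unlock(stamp); // releases whichever mode we currently hold
        }
    }

    // Identity comparison, which is an assumption of this sketch.
    private static int indexOf(Object[] array, Object element) {
        for (int i = 0; i < array.length; i++) {
            if (array[i] == element) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws Exception {
        StampedListSketch<Object> list = new StampedListSketch<>();
        Object target = new Object();
        for (int i = 0; i < 5; i++) {
            list.add(new Object());
        }
        list.add(target);

        ExecutorService service = Executors.newFixedThreadPool(10);
        List<Callable<Boolean>> tasks = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            tasks.add(() -> list.remove(target));
        }
        int removed = 0;
        for (Future<Boolean> f : service.invokeAll(tasks, 10, TimeUnit.SECONDS)) {
            if (f.get()) {
                removed++;
            }
        }
        service.shutdownNow();
        System.out.println("successful removals: " + removed + ", size: " + list.size());
    }
}
```

With ten threads removing the same object, this should report exactly one successful removal and a remaining size of 5. The other nine calls return false instead of spinning.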
Originally, this deadlock seems to occur when Agroal tries to flush a connection because of the config parameter
<property name="hibernate.agroal.maxLifetime_m">60</property>
If, at the same time, another thread using this connection calls session.close, it is possible for the removal in ConnectionPool.class to be called twice. The parameter goes through the following path:
The parallel session.close call does not find a checked_out connection and tries to flush it instead, so two threads end up in the deadlock situation:
Kind regards,
Rene