Uploaded image for project: 'AMQ Broker'
  1. AMQ Broker
  2. ENTMQBR-4801

OpenWire protocol should use ConcurrentHashMap instead of HashMap to avoid potential performance degradation

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Won't Do
    • Icon: Major Major
    • None
    • AMQ 7.8.0.GA
    • openwire-protocol
    • None
    • False
    • False
    • Undefined

      The OpenWireProtocolManager class uses HashMap for maintaining all active connections relating to clientId:

      104    private final Map<String, AMQConnectionContext> clientIdSet = new HashMap<>(); 

      both "removeConnection" method and "addConnection" method are in synchronized block in order to provide thread-safe read/write to this HashMap.

      172    public void removeConnection(ConnectionInfo info, Throwable error) throws InvalidClientIDException {
      173       synchronized (clientIdSet) {
      174          String clientId = info.getClientId();
      175          if (clientId != null) {
      176             AMQConnectionContext context = this.clientIdSet.get(clientId);
      177             if (context != null && context.decRefCount() == 0) {
      178                //connection is still there and need to close
      179                context.getConnection().disconnect(error != null);
      180                this.connections.remove(context.getConnection());
      181                this.clientIdSet.remove(clientId);
      182             }
      183          } else {
      184             throw new InvalidClientIDException("No clientID specified for connection disconnect request");
      185          }
      186       }
      187    } 

      and 

      295    public void addConnection(OpenWireConnection connection, ConnectionInfo info) throws Exception {
      ...
      ...
      313       synchronized (clientIdSet) {
      314          AMQConnectionContext context;
      315          context = clientIdSet.get(clientId);
      316          if (context != null) {
      317             if (info.isFailoverReconnect()) {
      318                OpenWireConnection oldConnection = context.getConnection();
      319                oldConnection.disconnect(true);
      320                connections.remove(oldConnection);
      321                connection.reconnect(context, info);
      322             } else {
      323                throw new InvalidClientIDException("Broker: " + getBrokerName() + " - Client: " + clientId + " already connected from " + context.getConnection().getRemoteAddress());
      324             }
      325          } else {
      326             //new connection
      327             context = connection.initContext(info);
      328             clientIdSet.put(clientId, context);
      329          }
      ... 

      In some cases, for instance, under heavy load + relatively poor system disk IO performance, it can cause significant performance issue. One thread might hold this lock while busy carrying out paging related operations and all other 40 or 50 threads were waiting for the lock in order to go ahead with  OpenWireProtocolManager.removeConnection() or OpenWireProtocolManager.addConnection() API call. The reason is that the removeConnection() method needs to disconnect the old connection before removing it from this HashMap. It might cause a lot of unnecessary invocations in the synchronized block. As a result, the broker might appear to be "hanging" because the broker can not remove old connections or add new connections. Obviously, this is not a real broker "hung" condition, but it still causes significant performance issue. For more details, please see attached thread dumps.

      It might be a good idea to use synchronized map instead of the HashMap. For instance:

      104.   private final Map<String, AMQConnectionContext> clientIdSet = Collections.synchronizedMap(new HashMap<String, AMQConnectionContext>());  

      Then we could get rid of the synchronized block on both "removeConnection" method and "addConnection" method to avoid such condition.

              csuconic@redhat.com Clebert Suconic
              rhn-support-qluo Joe Luo
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Created:
                Updated:
                Resolved: