The issue was reliably reproduced on a cluster of 64 nodes with default clustered.xml configuration. The nodes receive a view and hang. Coordinator's log is cluttered with CacheNotFoundResponse & JGroups NPE's (included below). Other nodes just receive replication timeout while getting rebalancing status.
Thread dump (coordinator): https://gist.github.com/mcimbora/b65914bcd8141427fbf4
Exceptions (coordinator): https://gist.github.com/mcimbora/0dd9a5b53344c1cdb18b
Exceptions(other nodes): https://gist.github.com/mcimbora/49d8b9f1b983d17ca5b1