- Bug
- Resolution: Cannot Reproduce
- Major
- None
- JBoss A-MQ 6.3
- None
While testing a disaster recovery scenario, a third-party process "steals" disk space from under the broker, resulting in a "No space left on device" failure (as expected). In one case out of approximately 50 attempts, journal corruption was detected on the subsequent restart.
The test case was also using preallocationStrategy="os_kernel_copy"; a minimal sketch of the corresponding configuration is shown below.
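For context, this is a minimal sketch of the kind of kahaDB persistence adapter configuration involved, not the exact broker config from the test: the directory value is an assumed placeholder, and journalMaxFileLength="50mb" is inferred from the 52428800-byte db-*.log files in the listing at the end of this report.

<persistenceAdapter>
    <!-- Sketch only: directory is an assumed placeholder;
         journalMaxFileLength is inferred from the 50 MB (52428800-byte)
         journal files in the directory listing below. -->
    <kahaDB directory="${activemq.data}/kahadb"
            journalMaxFileLength="50mb"
            preallocationStrategy="os_kernel_copy"/>
</persistenceAdapter>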
The "No space left on device" exception (as expected):
2017-03-08 16:02:44,884 | ERROR | heckpoint Worker | MessageDatabase | MessageDatabase$CheckpointRunner 421 | 163 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-630250 | Checkpoint failed
java.io.IOException: No space left on device
    at java.io.RandomAccessFile.writeBytes0(Native Method)[:1.7.0_60]
    at java.io.RandomAccessFile.writeBytes(RandomAccessFile.java:520)[:1.7.0_60]
    at java.io.RandomAccessFile.write(RandomAccessFile.java:550)[:1.7.0_60]
    at org.apache.activemq.util.RecoverableRandomAccessFile.write(RecoverableRandomAccessFile.java:245)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.disk.page.PageFile.writeBatch(PageFile.java:1092)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.disk.page.PageFile.flush(PageFile.java:516)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1674)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.MessageDatabase$17.execute(MessageDatabase.java:1643)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.MessageDatabase$17.execute(MessageDatabase.java:1640)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.disk.page.Transaction.execute(Transaction.java:802)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1640)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:1043)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.MessageDatabase$CheckpointRunner.run(MessageDatabase.java:416)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)[:1.7.0_60]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)[:1.7.0_60]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)[:1.7.0_60]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)[:1.7.0_60]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)[:1.7.0_60]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)[:1.7.0_60]
    at java.lang.Thread.run(Thread.java:745)[:1.7.0_60]
The journal corruption exception detected on the subsequent restart:
2017-03-08 16:07:29,861 | ERROR | AMQ-1-thread-1 | ActiveMQServiceFactory | Factory$ClusteredConfiguration$1 503 | 177 - io.fabric8.mq.mq-fabric - 1.2.0.redhat-630250 | Exception on start: java.io.IOException: Detected corrupt journal files. [35:0 >= key < 35:7032878]
java.io.IOException: Detected corrupt journal files. [35:0 >= key < 35:7032878]
    at org.apache.activemq.store.kahadb.MessageDatabase.recoverIndex(MessageDatabase.java:955)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.MessageDatabase$5.execute(MessageDatabase.java:701)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.disk.page.Transaction.execute(Transaction.java:779)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.MessageDatabase.recover(MessageDatabase.java:698)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.MessageDatabase.open(MessageDatabase.java:454)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.MessageDatabase.load(MessageDatabase.java:472)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.MessageDatabase.doStart(MessageDatabase.java:289)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.KahaDBStore.doStart(KahaDBStore.java:206)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.store.kahadb.KahaDBPersistenceAdapter.doStart(KahaDBPersistenceAdapter.java:223)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.broker.BrokerService.doStartPersistenceAdapter(BrokerService.java:661)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.broker.BrokerService.startPersistenceAdapter(BrokerService.java:645)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at org.apache.activemq.broker.BrokerService.start(BrokerService.java:610)[163:org.apache.activemq.activemq-osgi:5.11.0.redhat-630250]
    at io.fabric8.mq.fabric.ActiveMQServiceFactory$ClusteredConfiguration.doStart(ActiveMQServiceFactory.java:549)[177:io.fabric8.mq.mq-fabric:1.2.0.redhat-630250]
    at io.fabric8.mq.fabric.ActiveMQServiceFactory$ClusteredConfiguration.access$400(ActiveMQServiceFactory.java:359)[177:io.fabric8.mq.mq-fabric:1.2.0.redhat-630250]
    at io.fabric8.mq.fabric.ActiveMQServiceFactory$ClusteredConfiguration$1.run(ActiveMQServiceFactory.java:490)[177:io.fabric8.mq.mq-fabric:1.2.0.redhat-630250]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)[:1.7.0_60]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)[:1.7.0_60]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)[:1.7.0_60]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)[:1.7.0_60]
    at java.lang.Thread.run(Thread.java:745)[:1.7.0_60]
I noticed that db-35.log was only approximately 7.5 MB, whereas the others were 50 MB (the corruption exception above also references journal file 35):
-rw-r--r--. 1 root root 52428800 Mar 8 15:52 db-32.log
-rw-r--r--. 1 root root 52428800 Mar 8 16:02 db-33.log
-rw-r--r--. 1 root root 52428800 Mar 8 16:02 db-34.log
-rw-r--r--. 1 root root  7634076 Mar 8 16:14 db-35.log