-
Bug
-
Resolution: Not a Bug
-
Major
-
None
-
1.0.0.OCP.GA
-
None
-
-
- Access to the kafka pods and rename manually those log files, the pods then started up correctly.
AMQ Streams crashed when deleting logs and unable to start cleanly. The Kafka pods ends with crash/loop backoff status.
we can see in the following exception in the log:
December 11th 2019, 09:02:07.274 kafka 2019-12-11 09:02:07,274 INFO [ReplicaAlterLogDirsManager on broker 0] Removed fetcher for partitions __consumer_offsets-22,__consumer_offsets-30,bridge-alfa-0,__consumer_offsets-8,__consumer_offsets-21,bulki-feedback-store-0,__consumer_offsets-4,event-notification-0,bridge-ca-0,__consumer_offsets-27,bulki-event-log-0,__consumer_offsets-7,bulki-user-access-0,__consumer_offsets-9,__consumer_offsets-46,event-notification-2-0,__consumer_offsets-25,__consumer_offsets-35,feed-data-0,grants-redemption-replay-0,bulki-access-bulki-user-access-store-changelog-0,__consumer_offsets-41,__consumer_offsets-33,__consumer_offsets-23,__consumer_offsets-49,__consumer_offsets-47,__consumer_offsets-16,__consumer_offsets-28,bulki-access-updates-0,__consumer_offsets-31,__consumer_offsets-36,__consumer_offsets-42,__consumer_offsets-3,__consumer_offsets-18,__consumer_offsets-37,feeds-audit-jobs-0,__consumer_offsets-15,__consumer_offsets-24,message-stats-0,grants-settlement-replay-0,command-log-0,__consumer_offsets-38,__consumer_offsets-17,__consumer_offsets-48,bulki-account-bulki-user-account-store-changelog-0,bulki-account-updates-0,bulki-chunk-updates-3,__consumer_offsets-19,__consumer_offsets-11,asset-notification-0,rsam-live-replay-0,feeds-event-log-0,__consumer_offsets-13,__consumer_offsets-2,__consumer_offsets-43,__consumer_offsets-6,__consumer_offsets-14,bulki-chunk-updates-1,__consumer_offsets-20,__consumer_offsets-0,__consumer_offsets-44,__consumer_offsets-39,bulki-file-data-0,__consumer_offsets-12,__consumer_offsets-45,__consumer_offsets-1,__consumer_offsets-5,__consumer_offsets-26,__consumer_offsets-29,__consumer_offsets-34,__consumer_offsets-10,__consumer_offsets-32,__consumer_offsets-40,bulki-chunk-updates-2 (kafka.server.ReplicaAlterLogDirsManager) [LogDirFailureHandler] December 11th 2019, 09:02:07.269 kafka at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262) December 11th 2019, 09:02:07.269 kafka at java.nio.file.Files.move(Files.java:1395) December 11th 2019, 09:02:07.269 kafka at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:786) December 11th 2019, 09:02:07.269 kafka at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:211) December 11th 2019, 09:02:07.269 kafka at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:488) December 11th 2019, 09:02:07.269 kafka at kafka.log.Log.$anonfun$replaceSegments$6(Log.scala:1814) December 11th 2019, 09:02:07.269 kafka at kafka.log.Log.$anonfun$replaceSegments$6$adapted(Log.scala:1814) December 11th 2019, 09:02:07.269 kafka at scala.collection.immutable.List.foreach(List.scala:389) December 11th 2019, 09:02:07.269 kafka at kafka.log.Log.replaceSegments(Log.scala:1814) December 11th 2019, 09:02:07.269 kafka at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:533) December 11th 2019, 09:02:07.269 kafka at kafka.log.Cleaner.$anonfun$doClean$6(LogCleaner.scala:465) December 11th 2019, 09:02:07.269 kafka at kafka.log.Cleaner.$anonfun$doClean$6$adapted(LogCleaner.scala:464) December 11th 2019, 09:02:07.269 kafka at scala.collection.immutable.List.foreach(List.scala:389) December 11th 2019, 09:02:07.269 kafka at kafka.log.Cleaner.doClean(LogCleaner.scala:464) December 11th 2019, 09:02:07.269 kafka at kafka.log.Cleaner.clean(LogCleaner.scala:442) December 11th 2019, 09:02:07.269 kafka at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:303) December 11th 2019, 09:02:07.269 kafka at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:289) December 11th 2019, 09:02:07.269 kafka at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82) December 11th 2019, 09:02:07.269 kafka Suppressed: java.nio.file.FileAlreadyExistsException: /var/lib/kafka/kafka-log0/__consumer_offsets-16/00000000000000000000.log.swap -> /var/lib/kafka/kafka-log0/__consumer_offsets-16/00000000000000000000.log December 11th 2019, 09:02:07.269 kafka at sun.nio.fs.UnixException.translateToIOException(UnixException.java:88) December 11th 2019, 09:02:07.269 kafka at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) December 11th 2019, 09:02:07.269 kafka at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396) December 11th 2019, 09:02:07.269 kafka at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262) December 11th 2019, 09:02:07.269 kafka at java.nio.file.Files.move(Files.java:1395) December 11th 2019, 09:02:07.269 kafka at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:783) December 11th 2019, 09:02:07.269 kafka ... 15 more December 11th 2019, 09:02:07.269 kafka 2019-12-11 09:02:07,269 INFO [ReplicaManager broker=0] Stopping serving replicas in dir /var/lib/kafka/kafka-log0 (kafka.server.ReplicaManager) [LogDirFailureHandler] December 11th 2019, 09:02:07.268 kafka java.nio.file.FileAlreadyExistsException: /var/lib/kafka/kafka-log0/__consumer_offsets-16/00000000000000000000.log.swap -> /var/lib/kafka/kafka-log0/__consumer_offsets-16/00000000000000000000.log December 11th 2019, 09:02:07.268 kafka at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) December 11th 2019, 09:02:07.268 kafka 2019-12-11 09:02:07,268 ERROR Failed to clean up log for __consumer_offsets-16 in dir /var/lib/kafka/kafka-log0 due to IOException (kafka.server.LogDirFailureChannel) [kafka-log-cleaner-thread-0] December 11th 2019, 09:02:07.268 kafka at sun.nio.fs.UnixException.translateToIOException(UnixException.java:88) December 11th 2019, 09:02:07.268 kafka at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:457) December 11th 2019, 09:02:07.095 kafka 2019-12-11 09:02:07,095 INFO Cleaner 0: Swapping in cleaned segment LogSegment(baseOffset=0, size=1612) for segment(s) List(LogSegment(baseOffset=0, size=1619), LogSegment(baseOffset=715905, size=16049337)) in log Log(/var/lib/kafka/kafka-log0/__consumer_offsets-16) (kafka.log.LogCleaner) [kafka-log-cleaner-thread-0] December 11th 2019, 09:02:06.986 kafka 2019-12-11 09:02:06,985 INFO Cleaner 0: Cleaning segment 0 in log __consumer_offsets-16 (largest timestamp Wed Dec 04 09:01:45 UTC 2019) into 0, retaining deletes. (kafka.log.LogCleaner) [kafka-log-cleaner-thread-0] December 11th 2019, 09:02:06.986 kafka 2019-12-11 09:02:06,986 INFO Cleaner 0: Cleaning segment 715905 in log __consumer_offsets-16 (largest timestamp Wed Dec 11 09:01:48 UTC 2019) into 0, retaining deletes. (kafka.log.LogCleaner) [kafka-log-cleaner-thread-0] December 11th 2019, 09:02:06.958 kafka 2019-12-11 09:02:06,958 INFO Cleaner 0: Cleaning log __consumer_offsets-16 (cleaning prior to Wed Dec 11 09:01:48 UTC 2019, discarding tombstones prior to Tue Dec 03 09:01:45 UTC 2019)... (kafka.log.LogCleaner) [kafka-log-cleaner-thread-0] December 11th 2019, 09:02:06.957 kafka 2019-12-11 09:02:06,957 INFO Cleaner 0: Offset map for log __consumer_offsets-16 complete. (kafka.log.LogCleaner) [kafka-log-cleaner-thread-0] December 11th 2019, 09:02:06.909 kafka 2019-12-11T09:02:06.865+0000: 57766.127: [GC pause (G1 Evacuation Pause) (young), 0.0431825 secs]
Environment:
- AMQ Streams 1.0.0
- OpenShift 3.11
- Storage: Gluster