Bug
Resolution: Cannot Reproduce
Major
3.1.0.Final
None
None
I am getting the following exception when using the Tika text extractor to extract the contents of an Excel document.
Exception in thread "modeshape-text-extractor-7-thread-1" java.lang.ExceptionInInitializerError
at org.apache.poi.openxml4j.opc.internal.unmarshallers.PackagePropertiesUnmarshaller.<clinit>(PackagePropertiesUnmarshaller.java:49)
at org.apache.poi.openxml4j.opc.OPCPackage.init(OPCPackage.java:154)
at org.apache.poi.openxml4j.opc.OPCPackage.<init>(OPCPackage.java:141)
at org.apache.poi.openxml4j.opc.Package.<init>(Package.java:54)
at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:99)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:207)
at org.apache.tika.parser.pkg.ZipContainerDetector.detectOfficeOpenXML(ZipContainerDetector.java:194)
at org.apache.tika.parser.pkg.ZipContainerDetector.detectZipFormat(ZipContainerDetector.java:134)
at org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:77)
at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
at org.modeshape.jcr.mimetype.TikaMimeTypeDetector.mimeTypeOf(TikaMimeTypeDetector.java:126)
at org.modeshape.jcr.mimetype.MimeTypeDetectors.mimeTypeOf(MimeTypeDetectors.java:74)
at org.modeshape.jcr.value.binary.AbstractBinaryStore.getMimeType(AbstractBinaryStore.java:161)
at org.modeshape.jcr.value.binary.StoredBinaryValue.getMimeType(StoredBinaryValue.java:69)
at org.modeshape.jcr.TextExtractors$Worker.run(TextExtractors.java:175)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.ClassCastException: org.dom4j.DocumentFactory cannot be cast to org.dom4j.DocumentFactory
at org.dom4j.DocumentFactory.getInstance(DocumentFactory.java:97)
at org.dom4j.tree.AbstractNode.<clinit>(AbstractNode.java:39)
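A message of the form "X cannot be cast to X" usually means the same class was defined by two different class loaders, so the JVM treats the two copies as unrelated types. Whether that is what happens here with dom4j is an assumption, since the issue could not be reproduced. The sketch below only illustrates the general mechanism with hypothetical names (`DuplicateClassDemo`, `Factory`, `IsolatingLoader`); it is not ModeShape, Tika, or dom4j code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class DuplicateClassDemo {

    /** Hypothetical stand-in for a library class such as a document factory. */
    public static class Factory {}

    /** Class loader that defines one named class itself instead of delegating to its parent. */
    static class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        IsolatingLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve); // normal parent-first delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    // Read the same class file the parent would use, but define it here.
                    String path = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(path)) {
                        if (in == null) {
                            throw new ClassNotFoundException(name);
                        }
                        ByteArrayOutputStream out = new ByteArrayOutputStream();
                        byte[] buf = new byte[4096];
                        int n;
                        while ((n = in.read(buf)) != -1) {
                            out.write(buf, 0, n);
                        }
                        byte[] bytes = out.toByteArray();
                        c = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    /** Creates a Factory instance whose class was defined by a second class loader. */
    public static Object newIsolatedFactory() throws Exception {
        String name = Factory.class.getName();
        ClassLoader parent = Factory.class.getClassLoader();
        Class<?> copy = new IsolatingLoader(parent, name).loadClass(name);
        return copy.getDeclaredConstructor().newInstance();
    }

    /** Returns true if casting the object to this loader's Factory throws ClassCastException. */
    public static boolean castFails(Object o) {
        try {
            Factory f = (Factory) o;
            return f == null; // only reached when the cast succeeds on a non-null object
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        Object foreign = newIsolatedFactory();
        System.out.println("same name:  " + foreign.getClass().getName().equals(Factory.class.getName()));
        System.out.println("same class: " + (foreign.getClass() == Factory.class));
        System.out.println("cast fails: " + castFails(foreign));
    }
}
```

If the cause here is similar, one thing to check is whether more than one dom4j jar is visible to the application (for example, one bundled with the web application and another provided by the container or a module).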
Steps to reproduce
1. Try to read/parse an Excel spreadsheet.
2. While the read/parse is in progress, try to save another Excel spreadsheet as an attachment into the JCR repository.