ModeShape / MODE-1300

Upgrade to Tika 1.0


      Tika 1.0 has been available since Nov 7, and we should upgrade to pick up the latest fixes. The upgrade may cause a few problems: Tika deprecated some APIs in 0.10, and because we are still on 0.9 we may not have seen those deprecations yet. See the Tika changes for details. The newer releases also improve the processing of RTF, Word and PDF files.

      We'll also want to be sure we exclude any of the libraries we don't actually use. The ones we do need are those that process PDF files (pdfbox, fontbox, jempbox, bcmail, bcprov) and MS Office files (poi, poi-ooxml, poi-ooxml-schemas, poi-scratchpad, commons-codec, xmlbeans, dom4j, geronimo-stax-api).
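
      One way to work out the exclusion list is to compare whatever the build resolves against the names above. The following is only a rough sketch: the class `TikaDependencyFilter`, its methods, and the sample input are hypothetical, and the sample artifact IDs are illustrative rather than a verified list of what Tika 1.0 pulls in. It also assumes that a resolved artifact ID may carry a "-" or "_" qualifier after the short name used in this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TikaDependencyFilter {

    // Short library names listed in this issue as the ones we need
    // (PDF support first, then MS Office support).
    static final List<String> NEEDED = Arrays.asList(
            "pdfbox", "fontbox", "jempbox", "bcmail", "bcprov",
            "poi", "poi-ooxml", "poi-ooxml-schemas", "poi-scratchpad",
            "commons-codec", "xmlbeans", "dom4j", "geronimo-stax-api");

    // Assumption: an artifact counts as needed if its ID equals a listed
    // name or extends it with a "-" or "_" qualifier.
    static boolean isNeeded(String artifactId) {
        for (String name : NEEDED) {
            if (artifactId.equals(name)
                    || artifactId.startsWith(name + "-")
                    || artifactId.startsWith(name + "_")) {
                return true;
            }
        }
        return false;
    }

    // Returns the artifact IDs that are candidates for exclusion,
    // preserving the order of the input.
    static List<String> toExclude(List<String> resolvedArtifactIds) {
        List<String> excluded = new ArrayList<>();
        for (String id : resolvedArtifactIds) {
            if (!isNeeded(id)) {
                excluded.add(id);
            }
        }
        return excluded;
    }

    public static void main(String[] args) {
        // Illustrative input only; in practice this would come from the
        // build's resolved dependency tree.
        List<String> resolved = Arrays.asList(
                "pdfbox", "bcprov-jdk15", "poi-ooxml",
                "netcdf", "tagsoup", "geronimo-stax-api_1.0_spec");
        System.out.println(toExclude(resolved)); // prints [netcdf, tagsoup]
    }
}
```

      The prefix rule is deliberately loose, so anything it reports should still be checked by hand before it is added to the exclusions.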

              hchiorean Horia Chiorean (Inactive)
              rhauch Randall Hauch (Inactive)