Type: Feature Request
Resolution: Done
Priority: Major
2.8.2.Final
None
The TikaTextExtractor uses the default value for its ContentHandler. This default is limited to 100,000 characters, which is far too low to extract the words from even mid-size documents (2.5 MB). Please increase the default limit or make it configurable in the repository configuration file.
Also, please use the logging facility to report any parser problems.
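
To illustrate the request, here is a minimal sketch of a SAX content handler with a configurable character limit that logs when it cuts text off. It uses only JDK classes. The class name `LimitedTextHandler`, its methods, and the choice to truncate silently with a warning are assumptions made for illustration, not the actual TikaTextExtractor code. As far as I know, Tika's own `BodyContentHandler(int writeLimit)` accepts a limit in the same way, with `-1` disabling it, but it raises a `SAXException` when the limit is reached instead of truncating.

```java
import java.util.logging.Logger;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects character data up to a configurable limit.
class LimitedTextHandler extends DefaultHandler {
    private static final Logger LOG =
            Logger.getLogger(LimitedTextHandler.class.getName());

    // Mirrors the 100,000-character default described in this issue.
    static final int DEFAULT_WRITE_LIMIT = 100_000;

    private final StringBuilder text = new StringBuilder();
    private final int writeLimit; // a negative value means "no limit"
    private boolean truncated;

    LimitedTextHandler(int writeLimit) {
        this.writeLimit = writeLimit;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (truncated) {
            return; // limit already reached, ignore further text
        }
        if (writeLimit < 0) {
            text.append(ch, start, length);
            return;
        }
        int remaining = writeLimit - text.length();
        if (length <= remaining) {
            text.append(ch, start, length);
        } else {
            // Keep what still fits, then report the truncation once.
            text.append(ch, start, remaining);
            truncated = true;
            LOG.warning("Text extraction stopped at the write limit of "
                    + writeLimit + " characters");
        }
    }

    String getText() {
        return text.toString();
    }

    boolean isTruncated() {
        return truncated;
    }
}
```

With something like this, the limit could be read from the repository configuration and passed to the constructor, and any truncation would show up in the log instead of going unnoticed.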
Is related to: MODE-1778 Upgrade to Tika 1.3 (or latest version) (Closed)