@dependabot dependabot bot commented on behalf of github Jun 19, 2021

Bumps tika-core from 1.6 to 1.22.

Changelog

Sourced from tika-core's changelog.

Release 2.0.0 - ???

  • Cleanup of fetcher integration with tika-server.

Release 2.0.0-BETA - 05/19/2021

  • Refactor pipes module for resilience

  • Add transcribe capability (TIKA-94).

Release 2.0.0-ALPHA - 01/13/2021

BREAKING CHANGES in 2.0.0

  • General

    • OCR is now triggered automatically for PDFs if tesseract is on the user's path. See https://cwiki.apache.org/confluence/display/TIKA/TikaOCR#TikaOCR-disable-ocr for how to disable OCR.
    • We upgraded from log4j to log4j2 in tika-app, tika-server and anywhere else we used to use log4j.
    • By default, when rendering a page for OCR, the PDFParser does not render glyphs/text.
    • Removed deprecated Metadata keys/properties (TIKA-1974).
    • Removed deprecated PDFPreflightParser (TIKA-3437).
    • Removed dangerous calls that read an InputStream or convert to bytes without specifying a charset.
    • Parsers can be configured via tika-config.xml on instantiation. We have moved away from configuration via .properties files because of confusion among users. This affects the PDFParser, TesseractOCRParser and the StringsParser.
  • tika-parsers

    • The parser modules have been broken into three main modules: tika-parsers-standard, tika-parsers-extended and tika-parsers-ml. Users may now need to add tika-parsers-extended to tika-app and tika-server to include parsers that used to be included by default (for example: envi, gdal, grib, isatab, netcdf).
    • ChmParser was moved to org.apache.tika.parser.microsoft.chm
    • RTFParser was moved to org.apache.tika.parser.microsoft.rtf
    • We are now using non-shaded versions of xmpcore with namespaces com.adobe.internal.* vs com.adobe.*.
    • We switched the underlying MP4 parser to Drew Noakes' metadata-extractor's MP4 parser from sannies' isoparser.
  • tika-app

  • tika-server

    • tika-server now by default forks a process to isolate the parsing in the forked process (this was called the -spawnChild option in tika-1.x). Clients must now expect that tika-server will restart on OOM, timeouts, crashes or after parsing a large number of files. When this happens tika-server will restart and not

... (truncated)
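
For reviewers weighing the charset item in the changelog above: the sketch below shows what reading a stream with an explicit charset can look like in calling code. It uses only the JDK; the class name `CharsetReadSketch` and the helper `readAll` are hypothetical and are not part of Tika's API, so treat it as an illustration of the pattern rather than the change Tika itself made.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only (not a Tika class).
public class CharsetReadSketch {

    // Reads the whole stream and decodes it with the caller-supplied charset,
    // so the result never depends on the platform default encoding.
    public static String readAll(InputStream in, Charset charset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        try {
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            // Rethrow unchecked to keep the sketch's call sites short.
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), charset);
    }

    public static void main(String[] args) {
        // Encode and decode with the same explicit charset.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        String decoded = readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        System.out.println(decoded.length()); // prints 4
    }
}
```

Code in this repository that relies on `new String(bytes)` or `String.getBytes()` without a charset argument may be worth a quick search before a later move to the 2.x line, since those calls depend on the platform default.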

Commits
  • aa2a385 [maven-release-plugin] prepare release 1.22-rc4
  • de0fca9 roll back for rc#4...update date
  • 4db132e roll back for rc#4
  • c5daaf4 Merge remote-tracking branch 'origin/branch_1x' into branch_1x
  • 357c163 include opennlp lang model in tika-eval during assembly
  • 0f3790e [maven-release-plugin] prepare for next development iteration
  • c23f47e [maven-release-plugin] prepare release 1.23-rc3
  • c25b81d Merge remote-tracking branch 'origin/branch_1x' into branch_1x
  • fd40040 roll back for rc#3, again...
  • 950ee35 [maven-release-plugin] prepare for next development iteration
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
  • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
  • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
  • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the Security Alerts page.

Bumps [tika-core](https://github.com/apache/tika) from 1.6 to 1.22.
- [Release notes](https://github.com/apache/tika/releases)
- [Changelog](https://github.com/apache/tika/blob/main/CHANGES.txt)
- [Commits](apache/tika@1.6...1.22)

---
updated-dependencies:
- dependency-name: org.apache.tika:tika-core
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot added the dependencies Pull requests that update a dependency file label Jun 19, 2021
