Bump marker-pdf from 1.7.5 to 1.8.1 #306

Closed
dependabot[bot] wants to merge 1 commit into main from dependabot/pip/marker-pdf-1.8.1

dependabot bot commented on behalf of github Jul 7, 2025

Bumps marker-pdf from 1.7.5 to 1.8.1.

Release notes

Sourced from marker-pdf's releases.

Misc bugfixes

  • Document block correction prompt
  • Fix config issues

What's Changed

Full Changelog: datalab-to/marker@v1.8.0...v1.8.1

Chunk output; custom prompts; structured extraction improvements

Marker 1.8.0

  • Marker will now output a flat list of blocks with associated html, which is useful for RAG
  • Structured extraction beta is significantly improved, with better performance/accuracy
  • New LLM sectionheader processor will correctly label section header levels
  • You can pass a prompt to marker in LLM mode to adjust the output
  • Marker batch conversion script has somewhat better performance, closer to our inference container. Email us at hi@datalab.to if you want to get set up with our inference container (used on prem at top AI research orgs)
  • Add an option to filter out blank white page images from output
  • Enable keeping pageheader/pagefooter.

Chunking/RAG improvements

  • Add chunk output format which is a flat list of chunks with full html in each
  • Add an llm sectionheader processor that will redo all the header levels against each other properly

Use the sectionheaderprocessor by setting --use_llm, and the chunk output by setting --output_format chunks.
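
As a rough illustration of the two flags above, the following sketch builds a command line for marker's single-file CLI. It assumes the `marker_single` entry point and the `--output_dir` flag from marker's README; the file names and the helper `build_marker_command` are hypothetical, and the command is only printed here rather than executed.

```python
import shlex


def build_marker_command(pdf_path: str, output_dir: str) -> list[str]:
    """Return an argv list that requests LLM mode and chunk output."""
    return [
        "marker_single",              # assumed CLI entry point installed by marker-pdf
        pdf_path,
        "--use_llm",                  # turns on LLM processors (section headers included)
        "--output_format", "chunks",  # flat list of chunks, per the release notes
        "--output_dir", output_dir,   # assumed flag; check `marker_single --help`
    ]


if __name__ == "__main__":
    cmd = build_marker_command("paper.pdf", "out")
    print(shlex.join(cmd))
    # To run it for real (needs marker-pdf >= 1.8.0 and LLM credentials):
    # import subprocess; subprocess.run(cmd, check=True)
```

Building the argument list separately from running it keeps the flags easy to inspect before a potentially slow conversion starts.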

Structured extraction

  • Fix structured extraction, so it works much better than before (requires llm)
  • Improve structured extraction test app

You can try it with the Streamlit app by running python extraction_app.py.

Promptability/customization!

  • Add promptability via block_correction_prompt, which can be used to create custom behavior (requires llm)

Try it by setting the block_correction_prompt config key to a specific prompt.
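
One possible way to set that key is through a JSON config file. The sketch below writes such a file; it assumes the config is supplied to marker as JSON (for example via a `--config_json` flag, which should be checked against the installed version), and the helper name, file name, and prompt text are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_marker_config(prompt: str, directory: str) -> Path:
    """Write a JSON config containing block_correction_prompt and return its path."""
    config = {
        "use_llm": True,                    # the release notes say the prompt requires llm
        "block_correction_prompt": prompt,  # config key named in the release notes
    }
    path = Path(directory) / "marker_config.json"  # hypothetical file name
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg_path = write_marker_config("Keep table captions verbatim.", tmp)
        print(cfg_path.read_text(encoding="utf-8"))
        # Assumed usage: marker_single paper.pdf --config_json <cfg_path>
```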

Misc

  • Get the marker script to perform a bit closer to our inference container by default (inference container gets 10-25 pages/s on H100). Will auto-configure worker count to available VRAM.
  • Fix where marker would output blank pages as images
  • Enable keeping pageheader/pagefooter in the output
  • Adjust llm services to enable text-only input

... (truncated)

Commits
  • 34cb2bb Merge pull request #790 from datalab-to/dev
  • eeb6ca4 Merge pull request #784 from runarmod/fix-llm-retry-logic-azure-openai
  • e2fe0ee Fix misc bugs
  • f2a8c4f Merge pull request #786 from datalab-to/no-block-config
  • 453fa0a fix: handles when block config is none
  • e78100a fix query retry logic for azure openai
  • 4455640 only update block metadata if block is truthy
  • edbcb8c Merge pull request #774 from datalab-to/dev
  • 17d8c63 Fix schema issues
  • bd7add2 Fix attributes
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

dependabot bot added the dependencies and python labels Jul 7, 2025
github-actions bot enabled auto-merge July 7, 2025 18:51
dependabot bot force-pushed the dependabot/pip/marker-pdf-1.8.1 branch from ad86764 to 631ed0b July 14, 2025 20:37
dependabot bot force-pushed the dependabot/pip/marker-pdf-1.8.1 branch from 631ed0b to 3681ad3 July 14, 2025 20:54
dependabot bot force-pushed the dependabot/pip/marker-pdf-1.8.1 branch from 3681ad3 to dcd2c97 July 14, 2025 21:10
dependabot bot force-pushed the dependabot/pip/marker-pdf-1.8.1 branch from dcd2c97 to 5090929 July 14, 2025 21:27
Bumps [marker-pdf](https://github.com/VikParuchuri/marker) from 1.7.5 to 1.8.1.
- [Release notes](https://github.com/VikParuchuri/marker/releases)
- [Commits](datalab-to/marker@v1.7.5...v1.8.1)

---
updated-dependencies:
- dependency-name: marker-pdf
  dependency-version: 1.8.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot bot force-pushed the dependabot/pip/marker-pdf-1.8.1 branch from 5090929 to 3734ad9 July 14, 2025 21:38

dependabot bot commented on behalf of github Jul 28, 2025

Superseded by #314.

dependabot bot closed this Jul 28, 2025
auto-merge was automatically disabled July 28, 2025 18:20

Pull request was closed

dependabot bot deleted the dependabot/pip/marker-pdf-1.8.1 branch July 28, 2025 18:20

Labels: dependencies, python
