Legacy AdaParse/pdfwf #30

Draft
7shoe wants to merge 27 commits into ramanathanlab:main from 7shoe:legacy

@7shoe 7shoe commented Oct 23, 2025

Description

An artifact of the fork of AdaParsev1. This version and pdfwf are outdated and not Aurora-ready.
Improvements in AdaParsev2 include better text-accuracy prediction, automatic fill-in for MISSING pages (Nougat), and a more robust pre-processing pipeline. I'll de-fork the two.

Fixes

  • Fixes #XX
  • Fixes #XX

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Refactoring (internal implementation changes)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update (no changes to the code)
  • CI change (changes to CI workflows, packages, templates, etc.)
  • Version changes (changes to the package or dependency versions)

Testing

N/A

Pull Request Checklist

Please confirm the PR meets the following requirements.

  • Relevant tags are added (breaking, bug, dependencies, documentation, enhancement, refactor).
  • Code changes pass pre-commit (e.g., ruff, mypy, etc.).
  • Tests have been added to show the fix is effective or that the new feature works.
  • New and existing unit tests pass locally with the changes.
  • Docs have been updated and reviewed if relevant.

@7shoe 7shoe self-assigned this Oct 23, 2025

3 participants