Releases: Extralit/extralit

v0.6.1: Integrated PDF processing workflow with extralit-hf-space and incremental Dataset building with imports

29 Aug 07:30

This release delivers major upgrades to document processing and import workflows, and exposes additional dataset-building functionality in the UI. Highlights include OCRmyPDF-powered PDF processing via Redis jobs, a workspace selector in the breadcrumb, and incremental import with dataset mapping.

What's Changed

  • [FEAT] integrate OCRmyPDF and on document upload in Redis Queue jobs by @priyankeshh and @JonnyTran in #115
  • [FIX] Import Files Flow by @JonnyTran in #120
  • [FEAT] Workspace Pinia Store and Dataset Breadcrumb Selector in AppHeader by @JonnyTran in #121
  • [FIX] Import File Parsing and Matching Flow and Refactoring by @JonnyTran in #122
  • [FIX] DocumentAPI to query by params and return multiple documents & fix PDF file fetching by @JonnyTran in #123
  • [FEAT] minio presigned url for pdf by @JonnyTran in #124
  • [FIX] Import Analysis and Batch Refactoring, File Matching algorithm, Document Panel by @JonnyTran in #130
  • [FIX] Consolidating linting configuration by @JonnyTran in #133
  • [FEAT] Document workflows with rq jobs by @JonnyTran in #136
  • [FEAT] Import dataset mapping by @JonnyTran in #140
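To illustrate the kind of file matching an import flow has to do (pairing uploaded PDFs with reference entries), here is a minimal sketch using only the Python standard library. It is not the algorithm from #122 or #130: the function names, the 0.8 threshold, and the idea of matching on normalized filename stems are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from pathlib import Path


def normalize(name):
    """Lowercase and keep only alphanumerics, so 'Smith_2020' becomes 'smith2020'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def match_files(filenames, reference_keys, threshold=0.8):
    """Map each filename to its best-scoring reference key, or None below the threshold.

    Hypothetical helper: the threshold and scoring are illustrative choices.
    """
    matches = {}
    for filename in filenames:
        stem = normalize(Path(filename).stem)
        best_key, best_score = None, 0.0
        for key in reference_keys:
            score = SequenceMatcher(None, stem, normalize(key)).ratio()
            if score > best_score:
                best_key, best_score = key, score
        matches[filename] = best_key if best_score >= threshold else None
    return matches


# Example data is made up for this sketch.
result = match_files(
    ["Smith_2020.pdf", "unrelated-notes.pdf"],
    ["smith2020", "doe2019"],
)
```

Files that score below the threshold map to None, so a UI could flag them for manual matching rather than guessing.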

Contributors

Many thanks to @priyankeshh for the work on the https://github.com/Extralit/extralit-hf-space repo for PyMuPDF integration.
Welcome @Mr-Youssef-Sherif!

Full Changelog: v0.6.0...v0.6.1

v0.6.0: PDF Importer feature with BibTeX support and namespace refactoring

06 Aug 23:54

What's Changed

  • [FEATURE] Papers Library BibTeX Importer by @JonnyTran in #107
  • Update Frontend Dependencies: Migrate deprecated Babel plugins and refresh Vue 2 tooling by @Copilot in #113
  • feat(import-history): sidebar integration by @JonnyTran in #116
  • [FEAT] Rename question in Dataset Configuration by @JonnyTran in #117
  • Complete Python namespace refactor: argilla → extralit with directory restructure by @Copilot in #118
  • [RELEASE] v0.6.0 by @JonnyTran in #119

Full Changelog: v0.5.0...v0.6.0

v0.5.0: Latest Argilla Upgrade and Repo Restructuring

20 Jun 21:58

This release synchronizes with the latest changes from the upstream Argilla project, improves the CI pipelines, restructures the project from Argilla to Extralit, and introduces support for legacy migrations.

What's New

  • Upstream Argilla Sync (v2.6.0-v2.8.0):
    We've merged the latest changes from Argilla, bringing in new features and bug fixes. Key highlights include:

    • Similarity Search with Scores: The API now returns similarity scores when performing similarity searches, providing more context for your results.
    • Predefined IDs for Users & Workspaces: You can now create users and workspaces with predefined IDs, simplifying integration and migration workflows.
  • Legacy Migration Support:

    • To support users migrating from older versions, the extralit_v1 package has been added under argilla-v1/src/extralit_v1.
  • Project Refinements:

    • We have completed the project-wide refactoring from argilla to extralit, ensuring consistency across module paths and configurations like the cache directory (~/.extralit/).
    • The upload_file function and document listing process have been streamlined for a better user experience.

Upgrade Notes

  • To upgrade to the latest version, run:
    pip install --upgrade extralit

Contributors

A big thank you to our community for the continuous support, contributions, and feedback that made this release possible!

Full Changelog

extralit/CHANGELOG.md
v0.4.1...v0.5.0

v0.4.0: Argilla v2 API and CLI Rebuild

14 May 23:59

New Features

  • Major CLI Overhaul PR #57:
    The extralit CLI has been rebuilt and now includes comprehensive commands for workspace, file, document, and schema management. This makes it easier than ever to interact with Extralit from the command line, automate workflows, and integrate with other tools.
  • Workspace API Improvements:
    The Workspace API now supports more robust operations, improved error handling, and better logging for easier debugging and development.
  • Enhanced CLI error messages and user feedback.
  • Improved file and schema management commands.
  • Refactored codebase for better maintainability and developer experience.
  • Updated developer documentation and issue templates.

Upgrade Notes

  • Python 3.9+ required.
  • Upgrade with:
    pip install --upgrade extralit
  • To get started with the CLI:
    extralit --help
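
Before or after upgrading, the Python 3.9+ requirement and the installed package version can be checked with a short script. This is a minimal sketch using only the standard library; the helper names meets_minimum and installed_version are made up for illustration and are not part of the Extralit package.

```python
import sys
from importlib import metadata

MINIMUM_PYTHON = (3, 9)  # taken from the upgrade notes above


def meets_minimum(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the given (or current) interpreter version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def installed_version(package="extralit"):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", meets_minimum())
    print("extralit version:", installed_version() or "not installed")
```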

Contributors

Special thanks to the contributors of PR #57, a major milestone for the project.

And thanks to everyone else (@ArthrowAbstract, @SanjayUG, @Nakshatra05) who contributed to this release through code, reviews, and feedback!

Full Changelog

argilla/CHANGELOG.md
v0.3.0...v0.4.0

v0.3.0: TableField and TableQuestion types

01 Dec 06:03

This release focuses on introducing table support for fields and questions in feedback datasets, along with infrastructure improvements.

New Features

  • Schema & Fields:
    • Added support for TableField and es_field_for_record_field for table fields
    • Added TableQuestion and TableQuestionSetting to support table questions

Infrastructure Improvements

  • DevOps:
    • Added redis service to the Tilt k8s deployment for argilla-server
    • Improved argilla-server and extralit-server dockerfile multi-stage build
    • Changed envvars in Tilt k8s deployment at argilla-server-deployment.yaml

Bug Fixes

  • Fixed elasticsearch reindexing errors with dynamic schema
  • Fixed certain extralit-specific changes when loading Dataset

Full Changelog

v0.2.2...v0.3.0

v0.2.1: devcontainers and unit test for files and documents

17 Sep 00:55

This release focuses on enhancing the continuous integration, testing, and DevOps setup, ensuring a more robust and efficient development workflow.

New Features

  • Development Environment:

    • Added singleton schema support in SchemaStructure.
    • Added docs site for the Extralit project at argilla/docs/.
    • Added pytest-xdist for parallel testing.
    • Pytest and Python environment setup in the "PostgreSQL & Elasticsearch for Docker-Compose" GitHub Codespaces devcontainer.
    • Added .devcontainer for "Docker, Tilt, and K8s" local development on GitHub Codespaces.
  • Testing:

    • Added tests for:
      • Response: update duration.
      • Files: get, put, list, delete.
      • Models: get, post, put, delete.
      • Records: include response_suggestions.

Changes

  • Dependencies:

    • Updated Elasticsearch to 8.15.0.
  • Database:

    • Reverted Suggestion table's unique constraint to only "record_id" and "question_id", fixing the test suites.
  • API:

    • Disabled adding LIST_DATASET_RECORDS_DEFAULT_SORT_BY when there's no sort-by on GET records.
    • Changed the /api/v1/documents POST endpoint to use UploadFile.
  • DevOps:

    • Changed K8s Elasticsearch deployment from Helm to docker.elastic.co/elasticsearch/elasticsearch to fix PVC restarting issues.
    • Refactored Extralit Dockerfile and Docker Hub images to extralit/argilla-server and extralit/argilla-quickstart.
    • Changed the deployment of develop-branch changes in argilla/docs to https://docs.extralit.ai/latest instead of dev.
    • Changed examples/deployments/k8s/extralit-configs.yaml for configuring the Extralit service and secrets in a K8s cluster.
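
The Suggestion unique constraint noted under Database means that at most one suggestion can exist per (record_id, question_id) pair. The sketch below demonstrates that behavior with an in-memory SQLite table. It is not the project's actual model or migration: the table name, the value column, and the insert_suggestion helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE suggestions (
        id INTEGER PRIMARY KEY,
        record_id TEXT NOT NULL,
        question_id TEXT NOT NULL,
        value TEXT,
        UNIQUE (record_id, question_id)  -- one suggestion per record/question pair
    )
    """
)


def insert_suggestion(record_id, question_id, value):
    """Insert a row; return False if the unique constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO suggestions (record_id, question_id, value) VALUES (?, ?, ?)",
                (record_id, question_id, value),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = insert_suggestion("rec-1", "q-1", "A")           # accepted
duplicate = insert_suggestion("rec-1", "q-1", "B")       # same pair, rejected
other_question = insert_suggestion("rec-1", "q-2", "C")  # different question, accepted
row_count = conn.execute("SELECT COUNT(*) FROM suggestions").fetchone()[0]
```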

Bug Fixes

  • Fixed Tiltfile and k8s manifests for mono-repo setup.
  • Fixed creating a new Weaviate collection with Weaviate client v4.
  • Fixed an error with checking Weaviate collection existence when one doesn't exist.
  • Fixed an issue with reindexing Elasticsearch by handling exceptions on failed datasets.
  • Added a Workspace relationship to Document to enable cascade delete.

Security

  • Allow admin role for workspace creation.

Full Changelog: v0.2.0...v0.2.1

v0.2.0: Extralit CLI workspace management and GitHub Actions CI workflows

31 Jul 01:02

This release, which follows Argilla v1.29.1, brings significant improvements to the Extralit CLI and workspace management, along with various bug fixes and enhancements for a smoother user experience.

New Features

  • Workspace Management:

    • Added workspace schema and file management to the Extralit CLI.
    • Refined workspace schema and file management in the Extralit CLI.
    • Updated rg.Workspace with update_schemas and get_schemas methods.
    • Enabled _ID reference IDs in schemas.
    • Added inserted_at and updated_at fields to Suggestion.
  • CLI Enhancements:

    • Introduced the Extralit CLI for improved command-line interactions.
  • User Interface:

    • Updated status filter options in StatusFilter.vue and RecordRepository.ts.
    • Added tooltip in LabelSelection.
  • Translation and Localization:

    • Updated translation for "Use Table" option.
    • Added use_table option to QuestionSetting.

Bug Fixes

  • Fixed import statements in SchemaStructure and Workspace.
  • Ensured .mjs files are properly transpiled with babel-loader.
  • Fixed validation errors when converting FeedbackRecord suggestions to the server payload.
  • Fixed RecordRepository.ts to remove fetching "All data".

Continuous Integration and Deployment

  • Updated GitHub Actions workflows and the Docker Hub image names used in deployments.
  • Added GitHub Codespaces in .devcontainer.
  • Updated package names and build configurations for Extralit.
  • Set up mono repo to merge extralit-server.

Documentation

  • Updated README.md with new information.

Miscellaneous

  • Updated pip dependencies for Python tests.
  • Updated community links.

Full Changelog: v1.27.0a...v0.2.0