Agri-Query: A CASE STUDY ON RAG VS. LONG-CONTEXT LLMS FOR CROSS-LINGUAL TECHNICAL QUESTION ANSWERING
This main README serves as a central navigation point. Please refer to the specific README files and results folders within each project directory for detailed information, setup instructions, and findings.
- RAG (Retrieval-Augmented Generation):
- Project Details: RAG README
- RAG Evaluation: RAG Evaluation
- RAG Results: RAG Results Folder
(Note: Please ensure a
README.mdand aresultsfolder exist within theRAGdirectory for these links to work correctly. Adjust the results folder path if it's named differently).
This evaluation assesses the capability of Large Language Models (LLMs) to answer questions when provided with extensive context. The tests are conducted without any model fine-tuning, focusing on the LLMs' inherent ability to process and retrieve information from varying lengths of text.
Key Aspects of this Evaluation:
- Guaranteed Answer Presence: For each question, the context supplied to the LLM always contains the page with the correct answer. This setup tests the model's ability to locate information within the provided text, rather than its ability to recall information from prior training.
- Variable Context Lengths with "Noise": To simulate challenges of finding relevant information in large documents, tests are run with different context sizes. This includes scenarios where "noise" – additional, potentially irrelevant pages – is appended to the core context. For example, tests might involve adding 10k tokens of noise or using the entire document (approximately 59k tokens) as context.
- Performance Metrics: Model performance is measured using standard information retrieval metrics, including accuracy, precision, recall, and F1-score.
Codebase and Tools:
- "Zeroshot" Terminology: In this project's codebase, evaluations of this nature are referred to as "Zeroshot." The
Zeroshot/directory contains all relevant scripts and utilities for conducting these long context evaluations. - PDF to Text Conversion: To prepare PDF documents for this framework (specifically, converting them into a page-wise plain text format suitable for ingestion), use the
docling_page_wise_pdf_convertertool. This tool is located in thezeroshot/docling_page_wise_pdf_converter/directory.
Example Visualization:
The following image illustrates how the results from long context evaluations are typically visualized, showing accuracy against varying levels of noise:
Long context Accuracy comparison for English manual
- ZeroShot Evaluation:
- Project Details: ZeroShot README
- Results: ZeroShot Results Folder
- Visualizations: ZeroShot Visualization Plots
This README provides navigation. For specific project details and outcomes, please consult the files and folders linked above.