From 57255f29024535f10d2d2213e47989e17d5e393f Mon Sep 17 00:00:00 2001
From: Rajiv Shah
Date: Mon, 22 Sep 2025 12:01:07 -0500
Subject: [PATCH] Updated readme

---
 README.md | 52 +++++++++++++++++++++++++++++-----------------------
 1 file changed, 29 insertions(+), 23 deletions(-)

diff --git a/README.md b/README.md
index 856cd9e..58753f9 100644
--- a/README.md
+++ b/README.md
@@ -10,21 +10,35 @@ This repository contains practical examples and demonstrations of how to interac
 
 Open In Colab
 
-## 📚 Table of Contents
-
- - 🚀 [End to End Example](01-getting-started/) - End to End example of the Contextual Platform
 - 🔬 [Hands on Lab](02-hands-on-lab/) - Lab broken into three chapters, Creating Agent & Datastores, Evaluation, and Tuning
 - 🔧 [Standalone API](03-standalone-api/) - Examples of using individual API endpoints like `/generate` and `/rerank`, `/parse` and `/lmunit`.
 - 📊 [Sheets Script](04-sheets-script/) - A Google Sheets script that automates form filling using Contextual AI's API integration.
 - 📝 [Policy Changes](05-policy-changes/) - An example use case for tracking changes in long policy documents.
 - 📈 [Improving Agent](06-improve-agent-performance/) - Settings for improving or specializing your RAG agent.
 - ⚖️ [Using RAGAS for Evaluation](07-evaluation-ragas/) - A walkthrough for using RAGAS on a RAG agent.
 - 🎯 [LMUnit Evaluation for RewardBench](09-lmunit-rewardbench/) - Showing LMUnit for evaluating RewardBench.
 - 🎯 [FACTS Benchmark](10-FACTS-benchmark/) - Benchmark for evaluating grounding for LLMs
 - 🔍 [Retrieval Analysis](11-retrieval-analysis/) - Notebooks for an end-to-end evaluation of RAG retrieval
 - 🧾 [Structured Data Extraction](12-legal-contract-extraction/) - Showing how to perform extraction across legal documents.
 - 👀 [Using Metrics API and Monitoring RAG](14-monitoring) - Showing how to monitor your RAG agent
 - 🏷️ [Metadata Intro](15-metadata-intro/) - Example notebook showing how to work with metadata
+## Table of Contents
+
+### Getting Started
+- [End to End Example](01-getting-started/) - Complete example of the Contextual Platform
+- [Hands on Lab](02-hands-on-lab/) - Lab broken into three chapters: Creating Agent & Datastores, Evaluation, and Tuning
+- [Standalone API](03-standalone-api/) - Examples of using individual API endpoints like `/generate`, `/rerank`, `/parse`, and `/lmunit`
+- [Contextual AI MCP Server](https://github.com/ContextualAI/contextual-mcp-server)
+
+### Advanced Use Cases
+- [Policy Changes](05-policy-changes/) - Tracking changes in long policy documents
+- [Improving Agent Performance](06-improve-agent-performance/) - Settings for improving or specializing your RAG agent
+- [Retrieval Analysis](11-retrieval-analysis/) - End-to-end evaluation of RAG retrieval
+- [Structured Data Extraction](12-legal-contract-extraction/) - Extraction from unstructured legal documents
+- [Monitoring RAG](14-monitoring) - Using Metrics API to monitor your RAG agent
+- [Metadata Introduction](15-metadata-intro/) - Working with metadata in your RAG Agent
+
+### Integrations
+- [CrewAI Multi-Agent Workflow](13-crewai-multiagent/) - Using CrewAI in a MultiAgent workflow
+- [RAGAS Evaluation](07-evaluation-ragas/) - Using RAGAS for RAG agent evaluation
+- [Google Sheets Script](04-sheets-script/) - Automating form filling using Contextual AI's API
+- [Full Stack Deep Research with Gemini, Contextual AI, and LangGraph](https://github.com/rajshah4/contextualai-gemini-research-agent)
+- [Deep Research Agent using Agno, Contextual AI, Tavily, and Langfuse](https://github.com/rajshah4/LLM-Evaluation/blob/main/ResearchAgent_Agno_LangFuse.ipynb)
+- [Using Dify.AI with Contextual AI](https://www.youtube.com/watch?v=3WNUoKiwd2U)
+
+### Benchmarks & Evaluation
+- [Reranker v2 Benchmarks](03-standalone-api/03-rerank/reranker_benchmarking.ipynb) - Performance evaluation of the reranker
+- [LMUnit Evaluation for RewardBench](09-lmunit-rewardbench/) - Using LMUnit for evaluating RewardBench
+- [FACTS Benchmark](10-FACTS-benchmark/) - Benchmark for evaluating grounding in LLMs
+- [RAG QA Arena](https://github.com/rajshah4/LLM-Evaluation/tree/main/RAG_QA_Arena) - End-to-end RAG benchmark
 
 ## 🚀 Getting Started
 
@@ -50,14 +64,6 @@ This repository contains practical examples and demonstrations of how to interac
 - Contextual API credentials
 - Required Python packages (listed in `requirements.txt`)
 
-## 💡 Related Examples
-
-- 🧠 [Contextual AI MCP Server](https://github.com/ContextualAI/contextual-mcp-server)
-- 📚 [Benchmarking with RAG QA Arena](https://github.com/rajshah4/LLM-Evaluation/tree/main/RAG_QA_Arena)
-- 🧪 [Full Stack Deep Research with Gemini, Contextual AI, and LangGraph](https://github.com/rajshah4/contextualai-gemini-research-agent)
-- 🧭 [Deep Research Agent using Agno, Contextual AI, Tavily, and Langfuse](https://github.com/rajshah4/LLM-Evaluation/blob/main/ResearchAgent_Agno_LangFuse.ipynb)
-- 👁️ [Using Dify.AI with Contextual AI](https://www.youtube.com/watch?v=3WNUoKiwd2U)
-
 ## 🤝 Contributing
 
 We welcome contributions! Feel free to:
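
For orientation on the "Standalone API" entry added above, here is a minimal sketch of what an endpoint-level call (for example, `/rerank`) might look like from Python. The `contextual-client` package name, the `ContextualAI` import, the `client.rerank.create(...)` method, and the model name are assumptions for illustration; this patch itself only links to the examples and does not specify the SDK usage.

```python
# pip install contextual-client   # assumed package name for the Contextual AI SDK
import os

from contextual import ContextualAI  # assumed import path

# An API key is one of the prerequisites listed under "Getting Started";
# here it is read from an environment variable (name chosen for this sketch).
client = ContextualAI(api_key=os.environ["CONTEXTUAL_API_KEY"])

# Rerank a few candidate passages against a query via the /rerank endpoint.
# The method signature and model identifier below are assumptions.
result = client.rerank.create(
    query="How do I monitor a RAG agent?",
    documents=[
        "The Metrics API exposes usage and feedback data for an agent.",
        "Datastores hold the documents an agent retrieves from.",
    ],
    model="ctxl-rerank-en-v1-instruct",
)
print(result)
```

The `03-standalone-api/` directory linked in the new Table of Contents contains the maintained, endpoint-specific notebooks; prefer those over this sketch for exact request and response shapes.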