Agentic Recruitment Orchestrator

An AI-powered recruitment pipeline that parses Job Descriptions and Resumes, performs gap analysis using LLM reasoning, and drafts personalised outreach emails β€” all orchestrated by a multi-agent CrewAI crew.

Note: The live demo link may not work because the backend is not deployed (hosting it is costly due to the heavy models used); the project runs fully locally. Comments are included throughout the code for easier understanding.

Demo

Recruitment.Orchestration.Demo.mp4

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    Next.js 14 Frontend                           β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚  β”‚ Upload     β”‚  β”‚ Candidate      β”‚  β”‚ Email Preview/Edit   β”‚   β”‚
β”‚  β”‚ Panel +    β”‚  β”‚ Cards + AI     β”‚  β”‚ Modal                β”‚   β”‚
β”‚  β”‚ Top-N Ctrl β”‚  β”‚ Insights       β”‚  β”‚                      β”‚   β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                          β”‚ REST API (proxied via Next.js rewrites)
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                  FastAPI Backend (async)                          β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚  β”‚ Ingestionβ”‚  β”‚          CrewAI Agent Pipeline               β”‚  β”‚
β”‚  β”‚ Engine   β”‚  β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚  β”‚
β”‚  β”‚ (PyMuPDF)β”‚  β”‚  β”‚ Researcher β”‚β†’β”‚ Evaluator β”‚β†’β”‚  Writer  β”‚  β”‚  β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚  β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”‚  β”‚ ChromaDB β”‚    ↑ Human-in-the-Loop approval gate               β”‚
β”‚  β”‚ (Vectors)β”‚    ↑ Session isolation (auto-reset per JD)         β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                                                    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Tech Stack

| Layer | Technology |
|---|---|
| LLM Provider | Groq (Llama 3.3 70B Versatile) via CrewAI + LiteLLM |
| Embeddings | all-MiniLM-L6-v2 (local sentence-transformers; no API key needed) |
| Vector DB | ChromaDB (persistent, cosine similarity) |
| Agents | CrewAI (sequential 3-agent crew) |
| PDF Parsing | PyMuPDF (fitz) |
| Backend | FastAPI (async, in-memory state) |
| Frontend | Next.js 14 + React 18 + Tailwind CSS + Radix UI |
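
As an illustration of how the embedding and vector layers fit together, here is a minimal sketch (not the project's vector_store.py; the collection name and sample texts are made up) pairing the local all-MiniLM-L6-v2 model with a persistent, cosine-distance ChromaDB collection:

```python
# Minimal sketch: local sentence-transformers embeddings + persistent ChromaDB
# collection using cosine distance. Collection name and texts are illustrative.
import chromadb
from chromadb.utils import embedding_functions

embedder = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"  # runs locally, no API key required
)

client = chromadb.PersistentClient(path="chroma_db")
collection = client.get_or_create_collection(
    name="resumes",
    embedding_function=embedder,
    metadata={"hnsw:space": "cosine"},  # cosine similarity for semantic search
)

collection.add(
    ids=["resume-1"],
    documents=["Senior Python engineer with FastAPI and LLM pipeline experience."],
)
hits = collection.query(query_texts=["backend engineer, FastAPI"], n_results=1)
print(hits["ids"][0], hits["distances"][0])
```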

The Three Agents

| Agent | Role |
|---|---|
| Researcher | Analyses the JD → extracts technical reqs, soft skills, culture fit |
| Evaluator | Scores candidates with reasoning & gap analysis (trainable skills) |
| Writer | Drafts personalised outreach emails referencing specific projects |
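
The crew wiring can be sketched roughly as below (assuming a recent CrewAI release where LLM is importable from the top-level package; roles, goals, and task prompts are illustrative rather than copied from agents.py). In the actual pipeline the Writer stage only runs after human approval.

```python
# Rough sketch of a sequential Researcher -> Evaluator -> Writer crew on Groq.
from crewai import Agent, Crew, LLM, Process, Task

llm = LLM(model="groq/llama-3.3-70b-versatile")  # routed through LiteLLM

researcher = Agent(role="JD Researcher",
                   goal="Extract technical requirements, soft skills and culture fit from the JD",
                   backstory="Senior technical recruiter", llm=llm)
evaluator = Agent(role="Candidate Evaluator",
                  goal="Score each candidate with reasoning and gap analysis",
                  backstory="Experienced hiring panel lead", llm=llm)
writer = Agent(role="Outreach Writer",
               goal="Draft personalised outreach emails referencing specific projects",
               backstory="Talent outreach specialist", llm=llm)

tasks = [
    Task(description="Analyse this job description:\n{jd_text}",
         expected_output="Structured list of requirements", agent=researcher),
    Task(description="Evaluate these resumes against the requirements:\n{resumes}",
         expected_output="Ranked shortlist with gaps and reasoning", agent=evaluator),
    Task(description="Write an outreach email for each approved candidate",
         expected_output="One personalised email per candidate", agent=writer),
]

crew = Crew(agents=[researcher, evaluator, writer], tasks=tasks, process=Process.sequential)
result = crew.kickoff(inputs={"jd_text": "...", "resumes": "..."})
print(result)
```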

Human-in-the-Loop Flow

  1. Upload JD + Resumes
  2. Set Top-N candidates to analyse (defaults to 5, clamped to resume count)
  3. Launch pipeline β†’ Researcher + Evaluator run automatically
  4. Pipeline pauses at "Awaiting Approval"
  5. User reviews AI shortlist, selects approved candidates
  6. Writer drafts emails only for approved candidates
  7. User can edit emails before sending
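
A toy sketch of that approval gate is below. It is not the project's main.py: run state lives in a plain dict, and evaluate_candidates / draft_email are hypothetical stand-ins for the CrewAI stages.

```python
# Toy approval gate: the run pauses after evaluation; the Writer stage only
# fires for the candidates approved via the HITL step.
RUNS: dict[str, dict] = {}  # in-memory pipeline state keyed by run_id

def evaluate_candidates(jd_text: str, resumes: list[str]) -> dict[str, str]:
    # stand-in for the Researcher + Evaluator stages
    return {f"cand-{i}": text for i, text in enumerate(resumes, start=1)}

def draft_email(candidate_summary: str) -> str:
    # stand-in for the Writer stage
    return f"Hi, we were impressed by your work on: {candidate_summary[:60]}..."

def run_analysis(run_id: str, jd_text: str, resumes: list[str], top_n: int) -> None:
    top_n = min(top_n, len(resumes))  # clamp to the number of uploaded resumes
    shortlist = evaluate_candidates(jd_text, resumes[:top_n])
    RUNS[run_id] = {"status": "awaiting_approval", "shortlist": shortlist, "emails": {}}

def approve(run_id: str, approved_ids: list[str]) -> None:
    run = RUNS[run_id]
    for cid in approved_ids:  # emails drafted only for approved candidates
        run["emails"][cid] = draft_email(run["shortlist"][cid])
    run["status"] = "completed"
```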

Session Management

  • Automatic reset: Uploading a new JD clears all previous state β€” documents, pipeline runs, and ChromaDB embeddings
  • Explicit reset: POST /api/session/reset wipes everything for a clean slate
  • No cross-session leakage: Each JD upload starts a fresh session with no stale data

Quick Start

Prerequisites

  • Python 3 with pip (backend)
  • Node.js and npm (frontend)
  • A Groq API key for LLM calls

Backend

cd backend
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # macOS/Linux

pip install -r requirements.txt

# Create .env with your Groq API key
echo GROQ_API_KEY=gsk_your_key_here > .env

# Run
uvicorn app.main:app --reload --port 8000

Frontend

cd frontend
npm install
npm run dev

Open http://localhost:3000

Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| GROQ_API_KEY | Yes | — | Groq API key for LLM calls |
| GROQ_MODEL | No | llama-3.3-70b-versatile | Groq model identifier |
| EMBEDDING_MODEL | No | all-MiniLM-L6-v2 | Local sentence-transformers model |
| DEFAULT_TOP_N | No | 5 | Default number of top candidates |
| FRONTEND_URL | No | http://localhost:3000 | Allowed CORS origin |
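
How config.py reads these values might look roughly like the sketch below (assuming python-dotenv is used to load backend/.env; defaults mirror the table above):

```python
# Sketch of env-based configuration; defaults match the table above.
import os
from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # pick up backend/.env

GROQ_API_KEY = os.environ["GROQ_API_KEY"]  # required, fails fast if missing
GROQ_MODEL = os.getenv("GROQ_MODEL", "llama-3.3-70b-versatile")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2")
DEFAULT_TOP_N = int(os.getenv("DEFAULT_TOP_N", "5"))
FRONTEND_URL = os.getenv("FRONTEND_URL", "http://localhost:3000")  # CORS origin
```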

API Endpoints

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/upload/jd | Upload a Job Description (PDF/TXT) — resets the session |
| POST | /api/upload/resumes | Upload multiple Resume PDFs |
| GET | /api/documents | List all uploaded documents |
| POST | /api/pipeline/start | Start the agent pipeline (top_n clamped to resume count) |
| GET | /api/pipeline/{run_id} | Poll pipeline status & results |
| POST | /api/pipeline/{run_id}/approve | Approve shortlisted candidates (HITL) |
| PUT | /api/pipeline/{run_id}/emails/{rid} | Edit a drafted outreach email |
| POST | /api/session/reset | Explicitly reset all session state |
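
For reference, a scripted walk through the main flow might look like the sketch below. The endpoint paths come from the table above, but the multipart field names and JSON keys (run_id, status, approved_ids, top_n) are assumptions; check the Pydantic models in backend/app/models.py for the real shapes.

```python
# Hypothetical client walkthrough of the pipeline using the requests library.
import time
import requests

BASE = "http://localhost:8000/api"

# 1. Upload the JD (this resets the session), then the resumes
requests.post(f"{BASE}/upload/jd", files={"file": open("jd.pdf", "rb")})
requests.post(f"{BASE}/upload/resumes",
              files=[("files", open("resume_a.pdf", "rb")),
                     ("files", open("resume_b.pdf", "rb"))])

# 2. Start the pipeline and poll until it pauses for human approval
run_id = requests.post(f"{BASE}/pipeline/start", json={"top_n": 2}).json()["run_id"]
while requests.get(f"{BASE}/pipeline/{run_id}").json()["status"] not in ("awaiting_approval", "failed"):
    time.sleep(3)  # frontend polls on the same 3-second cadence

# 3. Approve the shortlisted candidates; the Writer then drafts their emails
requests.post(f"{BASE}/pipeline/{run_id}/approve", json={"approved_ids": ["cand-1"]})
print(requests.get(f"{BASE}/pipeline/{run_id}").json())
```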

Project Structure

β”œβ”€β”€ backend/
β”‚   β”œβ”€β”€ app/
β”‚   β”‚   β”œβ”€β”€ __init__.py
β”‚   β”‚   β”œβ”€β”€ config.py           # Env-based configuration (Groq keys, paths)
β”‚   β”‚   β”œβ”€β”€ models.py           # Pydantic schemas
β”‚   β”‚   β”œβ”€β”€ ingestion.py        # PDF/TXT extraction (PyMuPDF)
β”‚   β”‚   β”œβ”€β”€ vector_store.py     # ChromaDB embeddings, search & reset
β”‚   β”‚   β”œβ”€β”€ agents.py           # CrewAI agent definitions (Groq-powered)
β”‚   β”‚   └── main.py             # FastAPI application & endpoints
β”‚   β”œβ”€β”€ uploads/                # Uploaded files (gitignored)
β”‚   β”œβ”€β”€ chroma_db/              # Persistent vector store (gitignored)
β”‚   └── requirements.txt
β”œβ”€β”€ frontend/
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   β”œβ”€β”€ app/
β”‚   β”‚   β”‚   β”œβ”€β”€ globals.css
β”‚   β”‚   β”‚   β”œβ”€β”€ layout.tsx
β”‚   β”‚   β”‚   └── page.tsx        # Command Center dashboard
β”‚   β”‚   β”œβ”€β”€ components/
β”‚   β”‚   β”‚   β”œβ”€β”€ ui/             # Shadcn/UI primitives
β”‚   β”‚   β”‚   β”œβ”€β”€ CandidateCard.tsx   # Match scores, AI insights, gap analysis
β”‚   β”‚   β”‚   β”œβ”€β”€ EmailModal.tsx      # Edit outreach emails
β”‚   β”‚   β”‚   β”œβ”€β”€ UploadPanel.tsx     # JD/resume upload + Top-N control
β”‚   β”‚   β”‚   └── StatusBanner.tsx    # Pipeline status indicator
β”‚   β”‚   └── lib/
β”‚   β”‚       β”œβ”€β”€ api.ts          # API client (incl. session reset)
β”‚   β”‚       β”œβ”€β”€ types.ts        # TypeScript types
β”‚   β”‚       └── utils.ts        # cn() utility
β”‚   β”œβ”€β”€ package.json
β”‚   β”œβ”€β”€ next.config.js          # API proxy rewrites to FastAPI
β”‚   β”œβ”€β”€ tailwind.config.js
β”‚   └── tsconfig.json
└── README.md

Key Design Decisions

  • Groq-only β€” all LLM calls use Groq (Llama 3.3 70B); no OpenAI dependency
  • Local embeddings β€” sentence-transformers all-MiniLM-L6-v2 runs locally; no embedding API key needed
  • No keyword matching β€” all evaluation uses LLM chain-of-thought reasoning
  • Session isolation β€” uploading a new JD auto-resets ChromaDB and in-memory state to prevent cross-session data leakage
  • Dynamic Top-N β€” user chooses how many candidates to analyse; backend clamps to actual resume count
  • Async pipeline β€” FastAPI background tasks with 3-second polling from the frontend
  • Human-in-the-loop β€” pipeline pauses for approval before email drafting
  • Chunked embeddings β€” resumes are chunked (2000 chars, 200 overlap) for better retrieval
  • Cosine similarity β€” ChromaDB uses cosine distance for semantic search
