An AI-powered recruitment pipeline that parses Job Descriptions and Resumes, performs gap analysis using LLM reasoning, and drafts personalised outreach emails – all orchestrated by a multi-agent CrewAI crew.
Note: The live link may not work because the backend is not deployed (the heavy models it uses make hosting costly). It does work locally. Comments are added throughout the project for easier understanding.
Recruitment.Orchestration.Demo.mp4
```
┌────────────────────────────────────────────────────────────────────┐
│                        Next.js 14 Frontend                         │
│  ┌─────────────┐  ┌────────────────┐  ┌──────────────────────┐     │
│  │ Upload      │  │ Candidate      │  │ Email Preview/Edit   │     │
│  │ Panel +     │  │ Cards + AI     │  │ Modal                │     │
│  │ Top-N Ctrl  │  │ Insights       │  │                      │     │
│  └─────────────┘  └────────────────┘  └──────────────────────┘     │
└──────────────────────────┬─────────────────────────────────────────┘
                           │ REST API (proxied via Next.js rewrites)
┌──────────────────────────┼─────────────────────────────────────────┐
│                 FastAPI Backend (async)                            │
│  ┌──────────┐  ┌──────────────────────────────────────────────┐    │
│  │ Ingestion│  │            CrewAI Agent Pipeline             │    │
│  │ Engine   │  │  ┌────────────┐   ┌───────────┐   ┌────────┐ │    │
│  │ (PyMuPDF)│  │  │ Researcher │──▶│ Evaluator │──▶│ Writer │ │    │
│  └──────────┘  │  └────────────┘   └───────────┘   └────────┘ │    │
│  ┌──────────┐  └──────────────────────────────────────────────┘    │
│  │ ChromaDB │     Human-in-the-Loop approval gate                  │
│  │ (Vectors)│     Session isolation (auto-reset per JD)            │
│  └──────────┘                                                      │
└────────────────────────────────────────────────────────────────────┘
```
| Layer | Technology |
|---|---|
| LLM Provider | Groq (Llama 3.3 70B Versatile) via CrewAI + LiteLLM |
| Embeddings | all-MiniLM-L6-v2 (local, sentence-transformers – no API key needed) |
| Vector DB | ChromaDB (persistent, cosine similarity) |
| Agents | CrewAI (sequential 3-agent crew) |
| PDF Parsing | PyMuPDF (fitz) |
| Backend | FastAPI (async, in-memory state) |
| Frontend | Next.js 14 + React 18 + Tailwind CSS + Radix UI |
| Agent | Role |
|---|---|
| Researcher | Analyses the JD: extracts technical reqs, soft skills, culture fit |
| Evaluator | Scores candidates with reasoning & gap analysis (trainable skills) |
| Writer | Drafts personalised outreach emails referencing specific projects |
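The handoff between the three agents can be sketched as plain functions, making the data contract between stages visible. This is a simplified stand-in, not the project's actual CrewAI code: the real Evaluator scores candidates with LLM reasoning rather than the placeholder skill-overlap used here, and all names are illustrative.

```python
def researcher(jd_text: str) -> dict:
    """Analyse the JD into structured requirements (illustrative output)."""
    return {"technical": ["python", "fastapi"], "soft": ["communication"]}

def evaluator(requirements: dict, resumes: list[dict]) -> list[dict]:
    """Score candidates against the requirements. Placeholder scoring only:
    the real agent uses LLM chain-of-thought reasoning, not skill overlap."""
    scored = []
    for resume in resumes:
        matched = [s for s in requirements["technical"] if s in resume["skills"]]
        scored.append({
            "name": resume["name"],
            "score": len(matched) / len(requirements["technical"]),
            "gaps": [s for s in requirements["technical"] if s not in resume["skills"]],
        })
    return sorted(scored, key=lambda c: c["score"], reverse=True)

def writer(approved: list[dict]) -> dict:
    """Draft an outreach email for each approved candidate only."""
    return {c["name"]: f"Hi {c['name']}, your profile is a strong match." for c in approved}

reqs = researcher("We need a Python/FastAPI engineer.")
ranked = evaluator(reqs, [
    {"name": "Ada", "skills": ["python", "fastapi"]},
    {"name": "Bob", "skills": ["java"]},
])
# The human-in-the-loop approval gate sits between Evaluator and Writer;
# a score filter stands in for the user's selection here.
emails = writer([c for c in ranked if c["score"] >= 0.5])
```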
- Upload JD + Resumes
- Set Top-N candidates to analyse (defaults to 5, clamped to resume count)
- Launch pipeline – Researcher and Evaluator run automatically
- Pipeline pauses at "Awaiting Approval"
- User reviews AI shortlist, selects approved candidates
- Writer drafts emails only for approved candidates
- User can edit emails before sending
- Automatic reset: uploading a new JD clears all previous state (documents, pipeline runs, and ChromaDB embeddings)
- Explicit reset: `POST /api/session/reset` wipes everything for a clean slate
- No cross-session leakage: each JD upload starts a fresh session with no stale data
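The session-isolation rule above can be sketched in a few lines. `SessionState`, `reset`, and `upload_jd` are illustrative names, not the project's actual identifiers, and a plain list stands in for the ChromaDB collection:

```python
class SessionState:
    def __init__(self):
        self.documents: list[str] = []
        self.pipeline_runs: dict[str, dict] = {}
        self.embeddings: list[list[float]] = []  # stand-in for the ChromaDB collection

    def reset(self) -> None:
        """Equivalent of POST /api/session/reset: clear everything."""
        self.documents.clear()
        self.pipeline_runs.clear()
        self.embeddings.clear()

    def upload_jd(self, jd_name: str) -> None:
        """A new JD implicitly starts a fresh session (automatic reset)."""
        self.reset()
        self.documents.append(jd_name)

state = SessionState()
state.documents += ["old_jd.pdf", "resume1.pdf"]
state.pipeline_runs["run-1"] = {"status": "complete"}
state.upload_jd("new_jd.pdf")  # old documents and runs are gone
```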
- Python 3.11+
- Node.js 18+
- A Groq API key (console.groq.com)
```
cd backend
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # macOS/Linux
pip install -r requirements.txt

# Create .env with your Groq API key
echo GROQ_API_KEY=gsk_your_key_here > .env

# Run
uvicorn app.main:app --reload --port 8000
```

```
cd frontend
npm install
npm run dev
```

| Variable | Required | Default | Description |
|---|---|---|---|
| `GROQ_API_KEY` | Yes | – | Groq API key for LLM calls |
| `GROQ_MODEL` | No | `llama-3.3-70b-versatile` | Groq model identifier |
| `EMBEDDING_MODEL` | No | `all-MiniLM-L6-v2` | Local sentence-transformers model |
| `DEFAULT_TOP_N` | No | `5` | Default number of top candidates |
| `FRONTEND_URL` | No | `http://localhost:3000` | Allowed CORS origin |
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/upload/jd` | Upload a Job Description (PDF/TXT) – resets session |
| POST | `/api/upload/resumes` | Upload multiple resume PDFs |
| GET | `/api/documents` | List all uploaded documents |
| POST | `/api/pipeline/start` | Start the agent pipeline (`top_n` clamped to resume count) |
| GET | `/api/pipeline/{run_id}` | Poll pipeline status & results |
| POST | `/api/pipeline/{run_id}/approve` | Approve shortlisted candidates (HITL) |
| PUT | `/api/pipeline/{run_id}/emails/{rid}` | Edit a drafted outreach email |
| POST | `/api/session/reset` | Explicitly reset all session state |
```
├── backend/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── config.py          # Env-based configuration (Groq keys, paths)
│   │   ├── models.py          # Pydantic schemas
│   │   ├── ingestion.py       # PDF/TXT extraction (PyMuPDF)
│   │   ├── vector_store.py    # ChromaDB embeddings, search & reset
│   │   ├── agents.py          # CrewAI agent definitions (Groq-powered)
│   │   └── main.py            # FastAPI application & endpoints
│   ├── uploads/               # Uploaded files (gitignored)
│   ├── chroma_db/             # Persistent vector store (gitignored)
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── app/
│   │   │   ├── globals.css
│   │   │   ├── layout.tsx
│   │   │   └── page.tsx               # Command Center dashboard
│   │   ├── components/
│   │   │   ├── ui/                    # Shadcn/UI primitives
│   │   │   ├── CandidateCard.tsx      # Match scores, AI insights, gap analysis
│   │   │   ├── EmailModal.tsx         # Edit outreach emails
│   │   │   ├── UploadPanel.tsx        # JD/resume upload + Top-N control
│   │   │   └── StatusBanner.tsx       # Pipeline status indicator
│   │   └── lib/
│   │       ├── api.ts                 # API client (incl. session reset)
│   │       ├── types.ts               # TypeScript types
│   │       └── utils.ts               # cn() utility
│   ├── package.json
│   ├── next.config.js                 # API proxy rewrites to FastAPI
│   ├── tailwind.config.js
│   └── tsconfig.json
└── README.md
```
- Groq-only – all LLM calls use Groq (Llama 3.3 70B); no OpenAI dependency
- Local embeddings – sentence-transformers `all-MiniLM-L6-v2` runs locally; no embedding API key needed
- No keyword matching – all evaluation uses LLM chain-of-thought reasoning
- Session isolation – uploading a new JD auto-resets ChromaDB and in-memory state to prevent cross-session data leakage
- Dynamic Top-N – user chooses how many candidates to analyse; backend clamps to actual resume count
- Async pipeline – FastAPI background tasks with 3-second polling from the frontend
- Human-in-the-loop – pipeline pauses for approval before email drafting
- Chunked embeddings – resumes are chunked (2000 chars, 200 overlap) for better retrieval
- Cosine similarity – ChromaDB uses cosine distance for semantic search
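The chunking scheme above (2000-character windows with 200 characters of overlap) implies a step size of 1800. A minimal sketch; the function name and edge-case handling are assumptions, not the project's actual `vector_store.py` code:

```python
def chunk_text(text: str, size: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows with a fixed overlap between
    consecutive chunks, so context straddling a boundary is retrievable."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap  # 1800 with the defaults
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

The overlap means each chunk repeats the last 200 characters of its predecessor, so a sentence cut by a chunk boundary still appears whole in at least one chunk.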