A Retrieval-Augmented Generation (RAG) system for querying the Canadian Income Tax Act in natural language. Users can ask questions about tax law and receive answers with citations to the specific sections of the legislation they are drawn from.
This RAG system provides an interface to the Canadian Income Tax Act, allowing users to:
- Ask natural language questions about tax law
- Receive accurate answers with source citations
- Navigate complex legal documents through semantic search
- Access information from Subdivisions A & B of the Income Tax Act
- Intelligent Document Processing: Structure-aware chunking that preserves legal document hierarchy
- Deployment Options:
  - Local: Powered by Ollama for complete privacy and offline operation
  - AWS Cloud: Serverless deployment with AWS Bedrock for scalable production use
- Semantic Search: Vector embeddings using Ollama (local) or AWS Bedrock Titan (cloud)
- LLM Integration: Local Ollama models or AWS Bedrock Claude for answer generation
- Source Attribution: Every answer includes citations to specific sections
- Section-Based Chunking: Preserves complete legal sections for context
- Metadata: Tracks PART, DIVISION, SUBDIVISION, and marginal notes
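To make the metadata and source-attribution features above concrete, here is a minimal sketch of what a per-chunk metadata record and a citation formatter could look like. The `ChunkMetadata` class, its field names, and the helper functions are hypothetical and are not taken from this repository; only the citation shape (`Section 18(12) (page-11.md)`) follows the example outputs in this README.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ChunkMetadata:
    """Hypothetical metadata record for one chunk of the Act."""
    section: str                         # e.g. "18(12)"
    source_file: str                     # e.g. "page-11.md"
    part: Optional[str] = None           # PART heading, if known
    division: Optional[str] = None       # DIVISION heading, if known
    subdivision: Optional[str] = None    # SUBDIVISION heading, if known
    marginal_note: Optional[str] = None  # marginal note, if present


def format_citation(meta: ChunkMetadata) -> str:
    """Render one citation in the 'Section X (file)' shape used in this README."""
    return f"Section {meta.section} ({meta.source_file})"


def format_sources(metas: List[ChunkMetadata]) -> str:
    """Render a de-duplicated 'Sources:' block, preserving retrieval order."""
    seen = set()
    lines = ["Sources:"]
    for meta in metas:
        citation = format_citation(meta)
        if citation not in seen:
            seen.add(citation)
            lines.append(f"- {citation}")
    return "\n".join(lines)


# Illustrative values only.
example = ChunkMetadata(
    section="18(12)",
    source_file="page-11.md",
    part="PART I",
    division="DIVISION B",
    subdivision="SUBDIVISION B",
)
print(format_sources([example]))
```

Keeping the hierarchy fields optional lets the same record describe sections that sit directly under a PART without a DIVISION or SUBDIVISION.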
The system consists of three main pipelines:
- Web Scraping: Downloads HTML from Justice Laws website
- Text Extraction: Cleans HTML and extracts structured markdown
- Intelligent Chunking: Section-based chunking preserving legal structure
- Embedding Generation: Creates embeddings using Ollama
- ChromaDB Storage: Persistent vector database for fast retrieval
- Metadata Indexing: Enables filtering by section, part, division, etc.
- Semantic Retrieval: Finds relevant sections using vector similarity
- Context Assembly: Builds prompts with retrieved context
- LLM Generation: Generates answers using Ollama LLM (local) or AWS Bedrock Claude (cloud)
- Serverless Infrastructure: AWS Lambda functions for API and RAG processing
- Vector Database: ChromaDB embedded in Lambda container with Bedrock Titan embeddings
- LLM: AWS Bedrock Claude 3 Haiku for answer generation
- Storage: DynamoDB for query tracking and results
- Async Processing: Separate API and worker Lambda functions for scalable query processing
- Authentication: IAM-based API security with signed requests
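The semantic retrieval and context assembly steps listed above can be illustrated with a small, dependency-free sketch. It ranks chunks by cosine similarity against a query vector and builds a prompt from the top results. This is not the project's code: the real system delegates embedding and search to Ollama or Bedrock Titan and to ChromaDB, and the function names, the dictionary layout, the prompt wording, and the toy three-dimensional vectors are all assumptions made for illustration.

```python
import math
from typing import Any, Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec: Sequence[float],
             chunks: List[Dict[str, Any]],
             k: int = 2) -> List[Dict[str, Any]]:
    """Return the k chunks whose embeddings are most similar to the query."""
    scored = [(cosine_similarity(query_vec, c["embedding"]), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def build_prompt(question: str, retrieved: List[Dict[str, Any]]) -> str:
    """Assemble an LLM prompt from the retrieved chunks (hypothetical wording)."""
    context = "\n\n".join(
        f"[Section {c['section']}]\n{c['text']}" for c in retrieved
    )
    return (
        "Answer the question using only the context below, "
        "and cite the sections you rely on.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Toy data: placeholder text and made-up 3-dimensional embeddings.
toy_chunks = [
    {"section": "9", "text": "(text of section 9)", "embedding": [0.9, 0.1, 0.0]},
    {"section": "18(12)", "text": "(text of section 18(12))", "embedding": [0.1, 0.9, 0.0]},
    {"section": "6(9)", "text": "(text of section 6(9))", "embedding": [0.0, 0.1, 0.9]},
]
top = retrieve([0.8, 0.2, 0.0], toy_chunks, k=2)
print(build_prompt("How is income from a business computed?", top))
```

In a vector database the ranking step is done by the store itself; the sketch only shows what "finds relevant sections using vector similarity" means in practice.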
Here are example queries and responses from the system:
Question: What types of expenses are generally deductible when earning business income?
Answer:
Based on the provided context from the Income Tax Act, the types of expenses that are generally deductible when earning business income include:
1. Expenses related to a work space that is either the individual's principal place of business (Section 18(12)(a)(i)) or used exclusively for the purpose of earning income from business and used on a regular and continuous basis for meeting clients, customers, or patients in respect of the business (Section 18(12)(a)(ii)). However, the deductible amount cannot exceed the individual's income from the business.
2. For personal services businesses, there is an additional tax added to the corporation's taxable income for each taxation year (Section 123.5).
3. In computing a taxpayer's income for a taxation year from an office or employment, certain deductions are permitted, but not all expenses related to a work space within a self-contained domestic establishment where the individual resides can be deducted unless the work space is either the place where the individual principally performs the duties of the office or employment (Section 8(13)(a)(i)) or used exclusively during the period in respect of which the amount relates for the purpose of earning income from the office or employment and used on a regular and continuous basis for meeting customers or other persons in the ordinary course of performing the duties of the office or employment (Section 8(13)(a)(ii)). Again, the deductible amount cannot exceed the individual's income from the office or employment.
Sources:
- Section 18(12) (page-11.md)
- Section 123.5 (page-100.md)
- Section 8(2) (page-3.md)
- Section 5 (page-1.md)
- Section 8(13) (page-3.md)
Question: Are rental payments from tenants considered business income or passive income?
Answer:
Based on the provided context from the Income Tax Act, rental payments from tenants are considered income from a property as per Section 9(1) of the Act. However, it's important to note that this income does not include any capital gain from the disposition of the property, as mentioned in Section 9(3). For more specific information regarding deductions or other aspects related to rental income, additional sections and context may be needed.
Sources:
- Section 6(7) (page-1.md)
- Section 18(12) (page-11.md)
- Section 6(19) (page-1.md)
- Section 1 (page-93.md)
- Section 9 (page-4.md)
Question: What qualifies as "employment benefits" that must be included in income?
Answer:
Employment benefits that must be included in income, as per the provided context from the Income Tax Act, include:
1. Amounts deemed to be a benefit received in a taxation year by an individual due to a loan or debt, as stated in Section 6(9).
2. Portions of distributions made in a year that are described in certain subparagraphs or specified by the Minister as part of an agreement to waive or cancel a liability for tax under this Part, as mentioned in Section 207.061.
3. Amounts paid at any time in respect of a housing loss (other than an eligible housing loss) to or on behalf of a taxpayer or a person who does not deal at arm's length with the taxpayer in respect of, in the course of, or because of, an office or employment, as stated in Section 6(19).
4. Amounts paid or the value of assistance provided by any person in respect of, in the course of, or because of, an individual's office or employment in respect of the cost of, the financing of, the use of, or the right to use, a residence, as stated in Section 6(23).
Sources:
- Section 6(9) (page-1.md)
- Section 5 (page-1.md)
- Section 207.061 (page-187.md)
- Section 6(19) (page-1.md)
- Section 6(23) (page-1.md)
Question: Are scholarships or research grants always taxable?
Answer:
Based on the provided context from the Income Tax Act, scholarships or research grants are not explicitly mentioned as always being taxable. However, if a scholarship or research grant is received by a trust and it carries on one or more businesses or holds one or more properties that are not qualified investments for the trust, then tax may be payable under Part I of the Income Tax Act (Section 146.4(5)(b)).
It's also worth noting that registered charities and certain other organizations (like registered Canadian amateur athletic associations, registered journalism organizations) are required to file information returns with the Minister of National Revenue (Sections 149.1(14), 149.1(14.1)). However, this does not necessarily mean that scholarships or research grants received by these organizations are taxable.
In summary, while the provided context does not definitively answer whether scholarships or research grants are always taxable, it suggests that they may be taxable under certain circumstances if a trust holds them and carries on businesses or properties that are not qualified investments for the trust.
Sources:
- Section 146.4(5) (page-144.md)
- Section 212(9) (page-194.md)
- Section 149.1(14) (page-152.md)
- Section 146(10) (page-139.md)
- Section 149.1(20) (page-152.md)
- Python 3.8+
- Ollama installed and running
- Required Ollama models:
  - nomic-embed-text (for embeddings)
  - mistral:latest or compatible LLM (for answer generation)
- Clone the repository
    git clone <repository-url>
    cd ITA-rag
- Create a virtual environment
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies
    pip install -r requirements.txt
- Install and start Ollama
    # Install Ollama from https://ollama.ai/
    # Then pull required models:
    ollama pull nomic-embed-text
    ollama pull mistral:latest
    # Start Ollama server (if not already running)
    ollama serve
- Set up configuration
  Create and edit config/config.py if you need to customize any settings.
- Download the Income Tax Act pages
    python main.py --download
- Extract and clean text
    python main.py --extract
- Build the vector store
    python main.py --ingest
- Query the system
    python main.py --query "What is taxable income?"
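For readers curious how a single entry point can expose these steps, the following is a minimal sketch of an argument parser that accepts the flags used in this README (`--download`, `--extract`, `--ingest`, `--query`, `--interactive`). It is not the repository's `main.py`: the function names are hypothetical, and treating the flags as mutually exclusive is an assumption made to keep the example small.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the pipeline flags shown in this README."""
    parser = argparse.ArgumentParser(
        description="Income Tax Act RAG pipeline (illustrative sketch)"
    )
    # Assumption: exactly one step is selected per invocation.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--download", action="store_true",
                       help="download the Income Tax Act pages")
    group.add_argument("--extract", action="store_true",
                       help="extract and clean text")
    group.add_argument("--ingest", action="store_true",
                       help="build the vector store")
    group.add_argument("--query", metavar="QUESTION",
                       help="ask a single question")
    group.add_argument("--interactive", action="store_true",
                       help="start an interactive session")
    return parser


def selected_step(argv: Optional[List[str]] = None) -> str:
    """Return the name of the pipeline step requested on the command line."""
    args = build_parser().parse_args(argv)
    for name in ("download", "extract", "ingest", "interactive"):
        if getattr(args, name):
            return name
    return "query"


print(selected_step(["--query", "What is taxable income?"]))
```

A real entry point would dispatch from the selected step to the corresponding download, extraction, ingestion, or query code.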
Start an interactive session:
    python main.py --interactive
Preview how documents are chunked:
    python preview_chunker.py --file data_clean/page-1.md
The system uses section-based chunking that:
- Preserves complete legal sections (one chunk per section)
- Maintains hierarchical context (PART, DIVISION, SUBDIVISION)
- Includes marginal notes as metadata
- Splits very large sections (>35,000 chars) at subsection boundaries
- Preserves formulas, definitions, and nested lists
This approach ensures semantic completeness and maintains legal accuracy.
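The chunking rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's chunker: it assumes each section in the cleaned markdown starts with a heading line of the form `## Section <number>` and that subsections start with a line beginning `(1)`, `(2)`, `(1.1)`, and so on. Neither format is confirmed by this README, and hierarchy metadata (PART, DIVISION, SUBDIVISION, marginal notes) is left out for brevity. Only the 35,000-character threshold comes from the description above.

```python
import re
from typing import Dict, List

MAX_SECTION_CHARS = 35_000  # threshold stated in the list above

# Assumed formats, for illustration only.
SECTION_RE = re.compile(r"^## Section (\S+)[ \t]*$", re.MULTILINE)
SUBSECTION_RE = re.compile(r"^\(\d+(?:\.\d+)*\)", re.MULTILINE)


def split_at_subsections(body: str, max_chars: int) -> List[str]:
    """Greedily pack whole subsections into pieces of at most max_chars.

    A single subsection longer than max_chars is kept whole rather than cut.
    """
    starts = [m.start() for m in SUBSECTION_RE.finditer(body)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any text before the first subsection
    units = [body[s:e] for s, e in zip(starts, starts[1:] + [len(body)])]
    pieces: List[str] = []
    current = ""
    for unit in units:
        if current and len(current) + len(unit) > max_chars:
            pieces.append(current)
            current = ""
        current += unit
    if current:
        pieces.append(current)
    return pieces


def chunk_sections(markdown: str,
                   max_chars: int = MAX_SECTION_CHARS) -> List[Dict[str, str]]:
    """Emit one chunk per section, splitting only oversized sections."""
    matches = list(SECTION_RE.finditer(markdown))
    chunks: List[Dict[str, str]] = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        body = markdown[match.end():end].strip()
        if len(body) <= max_chars:
            parts = [body]
        else:
            parts = split_at_subsections(body, max_chars)
        for n, part in enumerate(parts, start=1):
            chunks.append({
                "section": match.group(1),
                "part": f"{n}/{len(parts)}",
                "text": part.strip(),
            })
    return chunks


sample = "## Section 5\n(1) aaa\n(2) bbb\n## Section 9\n(1) ccc\n"
for chunk in chunk_sections(sample, max_chars=10):
    print(chunk["section"], chunk["part"], repr(chunk["text"]))
```

Splitting only at subsection boundaries means a chunk never ends in the middle of a provision, which matches the goal of keeping each retrieved passage legally self-contained.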
This system is provided for informational and educational purposes only. It should not be considered as legal or tax advice. Always consult with a qualified tax professional for specific tax situations.