TF-IDF Search Engine in Go

A simple and fast TF-IDF-based text search engine written in Go. It supports tokenization, log-scaled term frequency and inverse document frequency weighting, query vector construction, and cosine similarity ranking.

🛝 Based on the Presentation

https://docs.google.com/presentation/d/1ZmHTDNNzgtjNR6vbSmzhrVjvs5qJ-yWKpv9TrPyPbcE/edit?usp=sharing

🚀 Features

Tokenizes and indexes a set of short documents
Computes smoothed log TF-IDF vectors
Supports vectorized cosine similarity for ranking
Returns top-k most relevant documents for a query

🧱 Project Structure

tfidf/
├── go.mod              // Module definition
├── main.go             // Entry point
├── model/              // Shared types (Document, Vector, Score)
├── pipeline/           // Tokenizer, TF-IDF logic, search engine
└── data/               // Loads documents from an example corpus

🛠️ Getting Started

1. Clone the repo

git clone https://github.com/JacobMcKenzieSmarty/tfidf.git
cd tfidf

2. Run the project

go run main.go

🔍 Example

docs := []model.Document{
    {0, "apple orange banana"},
    {1, "banana apple"},
    {2, "computer science and data"},
}
query := "banana apple"

Output:

Rank 1: Doc 1 (score: 0.9765)
Rank 2: Doc 0 (score: 0.6123)
Rank 3: Doc 2 (score: 0.0000)

📦 Dependencies

No external libraries — pure Go!

📜 License

MIT License — feel free to use, modify, and contribute!

🤝 Contributing

PRs welcome! Open issues or feature requests freely.
