2 changes: 2 additions & 0 deletions .cargo/config.toml
@@ -0,0 +1,2 @@
[env]
RUST_MIN_STACK = "8388608"
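For context: 8388608 bytes is 8 MiB, and `RUST_MIN_STACK` raises the default stack size for threads spawned through `std::thread` (it has no effect on threads whose stack size is set explicitly); Cargo's `[env]` table applies it to everything Cargo runs, including `cargo test` binaries. A sketch of the explicit per-thread equivalent — the closure body here is a placeholder, not code from this repository:

```rust
use std::thread;

fn main() {
    // Request the same 8 MiB stack for a single worker thread,
    // without relying on the RUST_MIN_STACK environment variable.
    let handle = thread::Builder::new()
        .name("deep-recursion".to_string())
        .stack_size(8 * 1024 * 1024)
        .spawn(|| {
            // Placeholder for stack-hungry work.
            (0..1000u64).sum::<u64>()
        })
        .expect("failed to spawn thread");
    let result = handle.join().expect("worker panicked");
    println!("{}", result); // prints 499500
}
```

Setting the variable once in `.cargo/config.toml` keeps the whole workspace consistent instead of patching individual spawn sites.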
16 changes: 8 additions & 8 deletions README.md
@@ -213,7 +213,7 @@ RuVector isn't a database you add to your stack — it's the entire stack. Self-

| | Layer | Replaces | What It Does |
|---|-------|----------|--------------|
| 🔍 | [**Search**](./crates/ruvector-core/README.md) | Pinecone, Weaviate, Qdrant | Self-learning HNSW — GNN improves results from every query |
| 🔍 | [**Search**](./crates/ruvector-core/README.md) | Pinecone, Weaviate, LegacyDB | Self-learning HNSW — GNN improves results from every query |
| 🗄️ | [**Storage**](./crates/ruvector-core/README.md) | Separate database + cache | Vector store, graph DB, key-value cache — unified engine |
| 🐘 | [**PostgreSQL**](./crates/ruvector-postgres/README.md) | pgvector, pg_embedding | Drop-in replacement — 230+ SQL functions, same interface but search gets smarter over time |
| 🔗 | [**Graph**](./crates/ruvector-graph/README.md) | Neo4j, Amazon Neptune | Cypher, W3C SPARQL 1.1, hyperedges — all built in |
@@ -557,7 +557,7 @@ See how RuVector stacks up against popular vector databases across 40+ features
Grouped comparison across 10 categories. RuVector is the only vector database that learns from usage, runs AI locally, and ships as a single self-booting file.

**Performance & Storage**
| Feature | RuVector | Pinecone | Qdrant | Milvus | ChromaDB | Weaviate |
| Feature | RuVector | Pinecone | LegacyDB | Milvus | ChromaDB | Weaviate |
|---------|----------|----------|--------|--------|----------|----------|
| Latency (p50) | **61 us** | ~2 ms | ~1 ms | ~5 ms | ~50 ms | ~5 ms |
| Memory (1M vectors) | **200 MB*** | 2 GB | 1.5 GB | 1 GB | 3 GB | 1.5 GB |
@@ -567,7 +567,7 @@ Grouped comparison across 10 categories. RuVector is the only vector database th
| Sparse vectors (BM25/TF-IDF) | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |

**Search & Query**
| Feature | RuVector | Pinecone | Qdrant | Milvus | ChromaDB | Weaviate |
| Feature | RuVector | Pinecone | LegacyDB | Milvus | ChromaDB | Weaviate |
|---------|----------|----------|--------|--------|----------|----------|
| Vector similarity search | ✅ HNSW | ✅ | ✅ HNSW | ✅ HNSW | ✅ | ✅ HNSW |
| Metadata filtering | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
@@ -590,7 +590,7 @@ Grouped comparison across 10 categories. RuVector is the only vector database th
| ReasoningBank | Trajectory learning with verdict judgment | ❌ |

**Local AI — no cloud APIs needed**
| Feature | RuVector | Pinecone | Qdrant | Milvus | ChromaDB | Weaviate |
| Feature | RuVector | Pinecone | LegacyDB | Milvus | ChromaDB | Weaviate |
|---------|----------|----------|--------|--------|----------|----------|
| Built-in LLM runtime | ✅ ruvllm (GGUF) | ❌ | ❌ | ❌ | ❌ | ❌ |
| Hardware acceleration | Metal, CUDA, ANE, WebGPU | N/A | N/A | GPU indexing | N/A | N/A |
@@ -611,7 +611,7 @@ Grouped comparison across 10 categories. RuVector is the only vector database th
| Verified training | Certificates, delta-apply rollback, fail-closed | ❌ |

**Math & Solvers**
| Feature | RuVector | Pinecone | Qdrant | Milvus | ChromaDB | Weaviate |
| Feature | RuVector | Pinecone | LegacyDB | Milvus | ChromaDB | Weaviate |
|---------|----------|----------|--------|--------|----------|----------|
| Sublinear solvers (8 algorithms) | O(log n) to O(sqrt(n)) | ❌ | ❌ | ❌ | ❌ | ❌ |
| Dynamic min-cut | n^0.12 complexity | ❌ | ❌ | ❌ | ❌ | ❌ |
@@ -621,7 +621,7 @@ Grouped comparison across 10 categories. RuVector is the only vector database th
| Quantum error correction | ruQu dynamic min-cut | ❌ | ❌ | ❌ | ❌ | ❌ |

**Distributed Systems**
| Feature | RuVector | Pinecone | Qdrant | Milvus | ChromaDB | Weaviate |
| Feature | RuVector | Pinecone | LegacyDB | Milvus | ChromaDB | Weaviate |
|---------|----------|----------|--------|--------|----------|----------|
| Raft consensus | ✅ | ❌ managed | ✅ | ❌ | ❌ | ✅ |
| Multi-master replication | ✅ vector clocks | ❌ | ❌ | ✅ | ❌ | ✅ |
@@ -642,7 +642,7 @@ Grouped comparison across 10 categories. RuVector is the only vector database th
| 25 segment types | VEC, INDEX, KERNEL, EBPF, WASM, COW_MAP, and 19 more | ❌ |

**Platform & Deployment**
| Feature | RuVector | Pinecone | Qdrant | Milvus | ChromaDB | Weaviate |
| Feature | RuVector | Pinecone | LegacyDB | Milvus | ChromaDB | Weaviate |
|---------|----------|----------|--------|--------|----------|----------|
| Browser / WASM | ✅ WebGPU, 58 KB | ❌ | ❌ | ❌ | ❌ | ❌ |
| Edge standalone | ✅ rvLite | ❌ | ❌ | ❌ | ❌ | ❌ |
@@ -665,7 +665,7 @@ Grouped comparison across 10 categories. RuVector is the only vector database th
| Cognitum Gate | Cognitive AI gateway with TileZero acceleration | ❌ |

**Licensing & Cost**
| | RuVector | Pinecone | Qdrant | Milvus | ChromaDB | Weaviate |
| | RuVector | Pinecone | LegacyDB | Milvus | ChromaDB | Weaviate |
|---|----------|----------|--------|--------|----------|----------|
| License | MIT (free forever) | Proprietary | Apache 2.0 | Apache 2.0 | Apache 2.0 | BSD-3 |
| Self-hosted | ✅ | ❌ managed only | ✅ | ✅ | ✅ | ✅ |
2 changes: 1 addition & 1 deletion crates/ruvector-bench/README.md
@@ -662,7 +662,7 @@ We welcome contributions to improve the benchmarking suite!
### Areas for Contribution

- 📊 Additional benchmark scenarios (concurrent writes, updates, deletes)
- 🔌 Integration with other vector databases (Pinecone, Qdrant, Milvus)
- 🔌 Integration with other vector databases (Pinecone, LegacyDB, Milvus)
- 📈 Enhanced visualization and reporting
- 🎯 Real-world dataset support (SIFT, GIST, Deep1M loaders)
- 🚀 Performance optimization insights
17 changes: 4 additions & 13 deletions crates/ruvector-core/src/advanced_features/hybrid_search.rs
@@ -175,24 +175,15 @@ impl HybridSearch {
}

/// Perform hybrid search
///
/// # Arguments
/// * `query_vector` - Query vector for semantic search
/// * `query_text` - Query text for keyword matching
/// * `k` - Number of results to return
/// * `vector_search_fn` - Function to perform vector similarity search
///
/// # Returns
/// Combined and reranked search results
pub fn search<F>(
&self,
query_vector: &[f32],
query_vector: &crate::types::QuantumVector,
query_text: &str,
k: usize,
vector_search_fn: F,
) -> Result<Vec<SearchResult>>
where
F: Fn(&[f32], usize) -> Result<Vec<SearchResult>>,
F: Fn(&crate::types::QuantumVector, usize) -> Result<Vec<SearchResult>>,
{
// Get vector similarity results
let vector_results = vector_search_fn(query_vector, k * 2)?;
@@ -302,10 +293,10 @@ impl HybridSearch {
/// Combined score holder
#[derive(Debug, Clone)]
struct CombinedScore {
id: VectorId,
id: crate::types::VectorId,
vector_score: Option<f32>,
keyword_score: Option<f32>,
vector: Option<Vec<f32>>,
vector: Option<crate::types::QuantumVector>,
metadata: Option<HashMap<String, serde_json::Value>>,
}

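The hunks above migrate `HybridSearch::search` from `&[f32]` to `crate::types::QuantumVector`, but the fusion step itself is unchanged: run a vector search and a keyword search, then merge the two score lists before reranking. A minimal sketch of such a merge, assuming plain `(id, score)` pairs and a weighted sum with weight `alpha` — the crate's actual `SearchResult` type and scoring are richer, so the names here are illustrative:

```rust
use std::collections::HashMap;

/// Fuse vector-similarity and keyword scores with a weighted sum.
/// An id present in only one list contributes just that one score.
fn fuse_scores(
    vector_hits: &[(String, f32)],  // (id, similarity score)
    keyword_hits: &[(String, f32)], // (id, BM25-style score)
    alpha: f32,                     // weight on the vector side, 0.0..=1.0
    k: usize,
) -> Vec<(String, f32)> {
    let mut combined: HashMap<String, f32> = HashMap::new();
    for (id, s) in vector_hits {
        *combined.entry(id.clone()).or_insert(0.0) += alpha * s;
    }
    for (id, s) in keyword_hits {
        *combined.entry(id.clone()).or_insert(0.0) += (1.0 - alpha) * s;
    }
    // Sort by fused score, descending, and keep the top k.
    let mut out: Vec<(String, f32)> = combined.into_iter().collect();
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    out.truncate(k);
    out
}

fn main() {
    let vector_hits = vec![("a".to_string(), 0.9), ("b".to_string(), 0.5)];
    let keyword_hits = vec![("b".to_string(), 0.8), ("c".to_string(), 0.7)];
    let top = fuse_scores(&vector_hits, &keyword_hits, 0.5, 2);
    // "b" appears in both lists, so it outranks "a":
    // 0.5 * 0.5 + 0.5 * 0.8 = 0.65 versus 0.45.
    println!("{:?}", top);
}
```

This is why `search` fetches `k * 2` vector results before combining: over-fetching leaves headroom for the keyword side to promote documents into the final top k.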
79 changes: 33 additions & 46 deletions crates/ruvector-core/src/advanced_features/mmr.rs
@@ -4,20 +4,20 @@
//! MMR = λ × Similarity(query, doc) - (1-λ) × max Similarity(doc, selected_docs)

use crate::error::{Result, RuvectorError};
use crate::types::{DistanceMetric, SearchResult};
use serde::{Deserialize, Serialize};
use crate::types::{DistanceMetric, QuantumVector, SearchResult};

// ... (MMRConfig stays same for now, lambda is f32)

/// Configuration for MMR search
#[derive(Debug, Clone, Serialize, Deserialize)]
#[derive(Debug, Clone)]
pub struct MMRConfig {
/// Lambda parameter: balance between relevance (1.0) and diversity (0.0)
/// - λ = 1.0: Pure relevance (standard similarity search)
/// - λ = 0.5: Equal balance
/// - λ = 0.0: Pure diversity
/// Diversity weight (0.0 to 1.0)
/// Higher lambda = more weight on relevance
/// Lower lambda = more weight on diversity
pub lambda: f32,
/// Distance metric for similarity computation
/// Distance metric to use for diversity calculation
pub metric: DistanceMetric,
/// Fetch multiplier for initial candidates (fetch k * multiplier results)
/// Fetch multiplier: fetch (k * fetch_multiplier) candidates before reranking
pub fetch_multiplier: f32,
}

@@ -31,38 +31,26 @@ impl Default for MMRConfig {
}
}

/// MMR search implementation
#[derive(Debug, Clone)]
/// MMR Reranker
pub struct MMRSearch {
/// Configuration
pub config: MMRConfig,
config: MMRConfig,
}

impl MMRSearch {
/// Create a new MMR search instance
pub fn new(config: MMRConfig) -> Result<Self> {
if !(0.0..=1.0).contains(&config.lambda) {
return Err(RuvectorError::InvalidParameter(format!(
"Lambda must be in [0, 1], got {}",
config.lambda
)));
if config.lambda < 0.0 || config.lambda > 1.0 {
return Err(RuvectorError::InvalidParameter(
"MMR lambda must be between 0.0 and 1.0".to_string(),
));
}

Ok(Self { config })
}
// ... (new stays same)

/// Perform MMR-based reranking of search results
///
/// # Arguments
/// * `query` - Query vector
/// * `candidates` - Initial search results (sorted by relevance)
/// * `k` - Number of diverse results to return
///
/// # Returns
/// Reranked results optimizing for both relevance and diversity
pub fn rerank(
&self,
query: &[f32],
query: &QuantumVector,
candidates: Vec<SearchResult>,
k: usize,
) -> Result<Vec<SearchResult>> {
@@ -111,7 +99,7 @@ impl MMRSearch {
/// Compute MMR score for a candidate
fn compute_mmr_score(
&self,
_query: &[f32],
_query: &QuantumVector,
candidate: &SearchResult,
selected: &[SearchResult],
) -> Result<f32> {
@@ -130,7 +118,9 @@ impl MMRSearch {
.iter()
.filter_map(|s| s.vector.as_ref())
.map(|selected_vec| {
let dist = compute_distance(candidate_vec, selected_vec, self.config.metric);
let a_f32 = candidate_vec.reconstruct();
let b_f32 = selected_vec.reconstruct();
let dist = compute_distance(&a_f32, &b_f32, self.config.metric);
self.distance_to_similarity(dist)
})
.max_by(|a, b| a.partial_cmp(b).unwrap())
@@ -154,17 +144,14 @@
}

/// Perform end-to-end MMR search
///
/// # Arguments
/// * `query` - Query vector
/// * `k` - Number of diverse results to return
/// * `search_fn` - Function to perform initial similarity search
///
/// # Returns
/// Diverse search results
pub fn search<F>(&self, query: &[f32], k: usize, search_fn: F) -> Result<Vec<SearchResult>>
pub fn search<F>(
&self,
query: &QuantumVector,
k: usize,
search_fn: F,
) -> Result<Vec<SearchResult>>
where
F: Fn(&[f32], usize) -> Result<Vec<SearchResult>>,
F: Fn(&QuantumVector, usize) -> Result<Vec<SearchResult>>,
{
// Fetch more candidates than needed
let fetch_k = (k as f32 * self.config.fetch_multiplier).ceil() as usize;
Expand Down Expand Up @@ -225,7 +212,7 @@ mod tests {
SearchResult {
id: id.to_string(),
score,
vector: Some(vector),
vector: Some(QuantumVector::F32(vector)),
metadata: None,
}
}
Expand Down Expand Up @@ -254,7 +241,7 @@ mod tests {
};

let mmr = MMRSearch::new(config).unwrap();
let query = vec![1.0, 0.0, 0.0];
let query = QuantumVector::F32(vec![1.0, 0.0, 0.0]);

// Create candidates with varying similarity
let candidates = vec![
Expand Down Expand Up @@ -282,7 +269,7 @@ mod tests {
};

let mmr = MMRSearch::new(config).unwrap();
let query = vec![1.0, 0.0, 0.0];
let query = QuantumVector::F32(vec![1.0, 0.0, 0.0]);

let candidates = vec![
create_search_result("doc1", 0.1, vec![0.9, 0.1, 0.0]),
Expand All @@ -306,7 +293,7 @@ mod tests {
};

let mmr = MMRSearch::new(config).unwrap();
let query = vec![1.0, 0.0, 0.0];
let query = QuantumVector::F32(vec![1.0, 0.0, 0.0]);

let candidates = vec![
create_search_result("doc1", 0.1, vec![0.9, 0.1, 0.0]),
Expand All @@ -328,7 +315,7 @@ mod tests {
fn test_mmr_empty_candidates() {
let config = MMRConfig::default();
let mmr = MMRSearch::new(config).unwrap();
let query = vec![1.0, 0.0, 0.0];
let query = QuantumVector::F32(vec![1.0, 0.0, 0.0]);

let results = mmr.rerank(&query, Vec::new(), 5).unwrap();
assert!(results.is_empty());
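To make the mmr.rs changes easier to follow, here is a self-contained sketch of the greedy loop behind MMR = λ × sim(query, doc) − (1 − λ) × max sim(doc, selected). It uses plain `Vec<f32>` and cosine similarity; the crate itself operates on `QuantumVector` with a configurable `DistanceMetric`, so the names below are illustrative only:

```rust
/// Cosine similarity between two dense vectors.
fn cosine_sim(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// MMR score: lambda * relevance - (1 - lambda) * max similarity to
/// anything already selected (0.0 when nothing is selected yet).
fn mmr_score(query: &[f32], cands: &[Vec<f32>], selected: &[usize], i: usize, lambda: f32) -> f32 {
    let rel = cosine_sim(query, &cands[i]);
    let div = selected
        .iter()
        .map(|&s| cosine_sim(&cands[i], &cands[s]))
        .fold(0.0f32, f32::max);
    lambda * rel - (1.0 - lambda) * div
}

/// Greedily pick k candidate indices, reranked by MMR.
fn mmr_rerank(query: &[f32], candidates: &[Vec<f32>], k: usize, lambda: f32) -> Vec<usize> {
    let mut selected = Vec::new();
    let mut remaining: Vec<usize> = (0..candidates.len()).collect();
    while selected.len() < k && !remaining.is_empty() {
        let best_pos = (0..remaining.len())
            .max_by(|&a, &b| {
                let sa = mmr_score(query, candidates, &selected, remaining[a], lambda);
                let sb = mmr_score(query, candidates, &selected, remaining[b], lambda);
                sa.partial_cmp(&sb).unwrap()
            })
            .unwrap();
        selected.push(remaining.remove(best_pos));
    }
    selected
}

fn main() {
    let query = vec![1.0f32, 0.0, 0.0];
    let candidates = vec![
        vec![0.95, 0.05, 0.0], // most relevant
        vec![0.94, 0.06, 0.0], // near-duplicate of the first
        vec![0.0, 1.0, 0.0],   // less relevant but diverse
    ];
    let order = mmr_rerank(&query, &candidates, 2, 0.3);
    println!("{:?}", order); // → [0, 2]: the near-duplicate is skipped
}
```

With a low lambda (0.3) the diversity penalty dominates, so the second pick is the orthogonal vector rather than the near-duplicate — the same trade-off the `MMRConfig::lambda` validation in the diff guards by requiring a value in [0, 1].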