Closed
Labels
enhancement: New feature or request
Description
Summary
Published @ruvector/ruvllm-wasm@2.0.0 to npm with compiled WASM binaries, replacing the deprecated v0.1.0 placeholder.
What's Included
- KV Cache — Two-tier (FP32 tail + u8 quantized store) token management
- Memory Pooling — Arena allocator + buffer pool for minimal allocation overhead
- Chat Templates — Llama3, Mistral, Qwen, ChatML, Phi, Gemma format support
- HNSW Semantic Router — Bidirectional graph with cosine similarity
- MicroLoRA — Per-request adaptation (rank 1-4)
- SONA Instant — EMA quality tracking + adaptive rank
- Web Workers — Parallel inference with SharedArrayBuffer detection
- TypeScript — Complete .d.ts type definitions
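To make the two-tier KV cache item above more concrete, here is a minimal TypeScript sketch of the general idea: recent vectors stay in FP32, and older ones are demoted to a u8 quantized store. It is not the package's API. The names `TwoTierKvCache`, `tailCapacity`, `quantize`, and `dequantize`, and the per-vector min/scale quantization scheme, are assumptions chosen for illustration. The real layout inside the WASM module may differ.

```typescript
// Hypothetical illustration only; none of these names come from ruvllm-wasm.
interface QuantizedVector {
  q: Uint8Array;  // values mapped onto 0..255
  min: number;    // offset used to restore the original range
  scale: number;  // step size between adjacent u8 levels
}

// Linear min/max quantization of one vector to u8 (an assumed scheme).
function quantize(vec: Float32Array): QuantizedVector {
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < vec.length; i++) {
    if (vec[i] < min) min = vec[i];
    if (vec[i] > max) max = vec[i];
  }
  const scale = max > min ? (max - min) / 255 : 1;
  const q = new Uint8Array(vec.length);
  for (let i = 0; i < vec.length; i++) {
    q[i] = Math.min(255, Math.round((vec[i] - min) / scale));
  }
  return { q, min, scale };
}

// Approximate inverse of quantize; error is at most about scale / 2 per element.
function dequantize(entry: QuantizedVector): Float32Array {
  const out = new Float32Array(entry.q.length);
  for (let i = 0; i < out.length; i++) {
    out[i] = entry.q[i] * entry.scale + entry.min;
  }
  return out;
}

class TwoTierKvCache {
  private tail: Float32Array[] = [];       // most recent vectors, full precision
  private store: QuantizedVector[] = [];   // older vectors, u8 quantized
  private tailCapacity: number;

  constructor(tailCapacity: number) {
    this.tailCapacity = tailCapacity;
  }

  // Append a vector; when the FP32 tail overflows, demote its oldest entry.
  push(vec: Float32Array): void {
    this.tail.push(Float32Array.from(vec));
    while (this.tail.length > this.tailCapacity) {
      const oldest = this.tail.shift() as Float32Array;
      this.store.push(quantize(oldest));
    }
  }

  // Read the vector at logical position i, where 0 is the oldest.
  get(i: number): Float32Array {
    if (i < 0 || i >= this.length) throw new RangeError("index out of range");
    if (i < this.store.length) return dequantize(this.store[i]);
    return this.tail[i - this.store.length];
  }

  get length(): number {
    return this.store.length + this.tail.length;
  }

  get quantizedCount(): number {
    return this.store.length;
  }
}
```

In this sketch the trade-off is that positions still in the tail read back exactly, while demoted positions use a quarter of the memory and read back with a small rounding error.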
Build Notes
Rust 1.91 has a WASM codegen bug (type mismatch in release profile). Workaround: raise the release codegen units to 256 and turn LTO off when building:
CARGO_PROFILE_RELEASE_CODEGEN_UNITS=256 CARGO_PROFILE_RELEASE_LTO=off \
wasm-pack build crates/ruvllm-wasm --target web --scope ruvector --release
Not Yet Wired
- IntelligentLLMWasm (combined router+lora+sona) — commented out, needs API alignment
- WebGPU attention shader (matmul works, attention falls back to CPU)
- GGUF model streaming loader
- Worker pool proper task completion (uses setTimeout polling)
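For the worker pool item, one common alternative to setTimeout polling is to tag each task with an id and settle a stored promise when the matching reply arrives. The TypeScript sketch below illustrates that pattern under stated assumptions. `WorkerLike`, `TaskReply`, `TaskTracker`, and the `{ id, payload }` message shape are hypothetical and are not taken from the ruvllm-wasm source. `WorkerLike` is a simplified stand-in for the browser Worker message interface, so a real Worker would likely need a thin adapter.

```typescript
// Hypothetical illustration only; not the ruvllm-wasm worker pool.
// Simplified stand-in for the part of the Worker interface used here.
interface WorkerLike {
  postMessage(message: unknown): void;
  onmessage: ((event: { data: unknown }) => void) | null;
}

// Assumed reply shape: the worker echoes the task id with a result or an error.
interface TaskReply {
  id: number;
  result?: unknown;
  error?: string;
}

interface PendingTask {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
}

class TaskTracker {
  private nextId = 0;
  private pending = new Map<number, PendingTask>();
  private worker: WorkerLike;

  constructor(worker: WorkerLike) {
    this.worker = worker;
    // Every reply settles exactly one pending promise; no timers involved.
    worker.onmessage = (event) => this.settle(event.data as TaskReply);
  }

  // Send a task and return a promise that settles when its reply arrives.
  run(payload: unknown): Promise<unknown> {
    const id = this.nextId++;
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.worker.postMessage({ id, payload });
    });
  }

  // Number of tasks sent but not yet answered.
  get pendingCount(): number {
    return this.pending.size;
  }

  private settle(reply: TaskReply): void {
    const entry = this.pending.get(reply.id);
    if (!entry) return; // unknown or already settled id: ignore
    this.pending.delete(reply.id);
    if (reply.error !== undefined) entry.reject(new Error(reply.error));
    else entry.resolve(reply.result);
  }
}
```

Because replies are matched by id, tasks can complete out of order, and callers can simply await `run(...)` instead of checking a flag on a timer.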
Related
- PR fix: resolve 5 P0 critical issues + pre-existing compile errors #239 — Deprecated @ruvector/ruvllm-wasm@0.1.0 placeholder
- ADR-084 — ruvllm-wasm publish documentation