IntelliStream

IntelliStream Research Group

专注于流处理、AI系统与智能数据库的研究与开发

Focused on Stream Processing, AI Systems, and Intelligent Databases

🌟 SAGE 项目生态系统 | SAGE Project Ecosystem

SAGE (Streaming-Augmented Generative Execution) 是一个高性能、模块化的 AI 推理框架生态系统，通过数据流抽象实现透明、可扩展的 LLM 驱动系统。

SAGE is a high-performance, modular AI inference framework ecosystem that enables transparent, scalable LLM-powered systems through dataflow abstractions.

📦 核心仓库 | Core Repositories

🎯 SAGE

主框架 | Main Framework

声明式、可组合的流式增强生成执行框架，用于通过数据流抽象构建透明的 LLM 驱动系统。

A declarative, composable framework for building transparent LLM-powered systems through dataflow abstractions.

特性 | Features:

⚡ 生产就绪的企业级应用
🔧 直观的声明式 API
🚀 高吞吐量流式工作负载优化
👁️ 内置可观测性和调试工具

📚 SAGE-Pub

文档中心 | Documentation Hub

SAGE 系统的官方对外文档仓库，包含快速开始、架构图、API 文档等。

Official public documentation repository for the SAGE system, including quick start guides, architecture diagrams, and API documentation.

内容 | Contents:

📘 快速开始指南
🏗️ 架构与核心模块说明
📊 Dashboard 使用指南
🔗 API 文档

🔧 数据库与系统组件 | Database & System Components

💾 sageVDB

向量数据库核心 | Vector Database Core

高性能向量数据库 C++ 核心库，支持可插拔 ANNS 架构和多模态特性。

High-performance C++20 vector database library with pluggable ANNS architecture and multimodal support.

🌊 sageFlow

向量流处理引擎 | Vector Stream Processing Engine

向量原生流处理引擎，专为实时 LLM 生成任务维护和物化语义状态快照而设计。

Vector-native stream processing engine designed to maintain and materialize semantic state snapshots for real-time LLM generation tasks.

⏱️ sageTSDB

时序数据库 | Time Series Database

SAGE 生态系统的时序数据库组件，用于处理时间序列数据。

Time series database component of the SAGE ecosystem for handling temporal data streams.

📊 sageData

基准数据集 | Benchmark Datasets

SAGE 基准测试的共享数据集和资源库。

Shared test datasets and resources for SAGE benchmarks.

🤏 sageRefiner

RAG 上下文压缩 | RAG Context Compression

SAGE 生态系统的上下文压缩组件，用于优化 RAG 应用的输入长度。

Context compression component for the SAGE ecosystem, optimizing input length for RAG applications.

🔒 sageFlownet

流式网络堆栈 | Streaming Network Stack

高性能流式网络通信堆栈。

High-performance streaming network communication stack.

🤖 AI 与智能体组件 | AI & Agent Components

🧠 sageLLM

国产算力 LLM 推理引擎 | LLM Inference Engine

面向华为昇腾与 NVIDIA 的模块化 LLM 推理引擎，默认 CPU 优先，提供统一的 Python/HTTP 接口。 (See dedicated section below for sub-modules)

Modular LLM inference engine for domestic computing power, CPU-first with unified APIs.

🕵️ sage-agentic

SAGE 智能体框架 | SAGE Agentic Framework

智能体工具选择、规划、工作流与多智能体协作框架。

Tool selection, planning, workflows, and agent coordination framework.

🧩 neuromem

记忆管理引擎 | Memory Management Engine

SAGE 项目的记忆体组件，RAG 应用的独立内存管理引擎。

Standalone memory management engine for RAG applications.

🔒 sage-rag

SAGE RAG 框架 | SAGE RAG Framework

RAG 流水线的文档加载、分块与检索框架。

Document loaders, chunkers, and retrievers for RAG pipelines.

🎯 sage-intent

意图识别 | Intent Recognition

基于关键词和大模型的对话 AI 意图分类工具。

Keyword and LLM-based intent classification for conversational AI.

🔧 sage-finetune

轻量微调工具 | Lightweight Fine-tuning

SAGE 生态系统的 LLM 轻量级微调工具箱。

Lightweight LLM fine-tuning toolkit for the SAGE ecosystem.

🔒 sage-safety

安全框架 | Safety Framework

AI 系统的安全护栏与检测器。

Safety guardrails and detectors for AI systems.

🔒 sage-privacy

隐私保护 | Privacy Protection

机器学习遗忘与差分隐私工具。

Machine unlearning and differential privacy tools.

📖 sage-examples

示例代码库 | Examples Repository

SAGE 框架的示例代码和使用案例集合。

Collection of example code and use cases for the SAGE framework.

📊 评测与基准 | Evaluation & Benchmarks

📉 sage-benchmark

SAGE 基准测试套件 | SAGE Benchmark Suite

全面的 AI 数据处理管道评估框架。

Comprehensive evaluation framework for AI data processing pipelines.

🤖 sage-benchmark-agent

智能体评测代理 | SAGE Benchmark Agent

配置驱动的智能体能力评估框架（工具选择、规划、时序检测）。

Configuration-driven framework for evaluating agent capabilities.

🧪 CANDOR-Bench

数据库基准测试 | Database Benchmark [SIGMOD'26]

SAGE 数据库组件的性能基准测试套件。

Performance benchmark suite for SAGE database components.

🔒 sage-eval

评估框架 | Evaluation Framework

AI 系统的指标、Profiler 与评审工具（Judges）。

Metrics, profilers, and judges for AI systems.

🔢 LibAMM

近似矩阵乘法基准测试 | AMM Benchmark Library [NIPS'24]

聚合了主流 AMM 算法的高性能基准测试库，支持标准化评估和高效实验管理。

High-performance benchmark library aggregating prevalent AMM algorithms for standardized evaluations.

特性 | Features:

🚀 高性能 C++ 实现
🐍 Python 绑定 (PyAMM)
🔥 可选 CUDA 加速支持
📊 PAPI 性能分析工具

🧮 算法库 | Algorithm Libraries

⚡ Concurrent-HNSW

并发 HNSW 库 | Concurrent HNSW Library

支持并发操作的 HNSW 实现，提供快速并发的近似最近邻搜索。

Header-only C++/Python library for fast and concurrent approximate nearest neighbor search.

状态 | Status: 🚧 开发中 | In Development

🔍 sage-anns

近似最近邻搜索算法库 | ANNS Algorithm Library

提供统一 Python 接口的近似最近邻搜索算法集合，支持多种 ANNS 算法。

SAGE ANNS: Approximate Nearest Neighbor Search algorithms with unified Python interface.

✖️ sage-amms

近似矩阵乘法算法 | AMM Algorithms

近似矩阵乘法算法的 C++ 实现集合。

Approximate Matrix Multiplication algorithms with C++ implementations.

🎯 sage-sias

样本选择算法 | Sample Selection(SIAS)

用于持续学习和核心集算法的样本重要性感知选择。

SIAS - Sample-Importance-Aware Selection for continual learning and coreset algorithms.

🧠 sageLLM 模块架构 | sageLLM Modular Architecture

The modular ecosystem behind the sageLLM inference engine.

🔒 sagellm-protocol

基础协议 | Protocol & Foundations

定义推理引擎的 Schema、Error Codes 和基础类型 (Task0.1)。

Protocol definitions and types for sageLLM inference engine.

🔒 sagellm-core

引擎核心 | Engine Core

推理引擎的核心运行时与执行逻辑 (Task0)。

Core engine and runtime for sageLLM inference.

🔒 sagellm-backend

计算后端 | Compute Backend

面向国产硬件（华为昇腾 / CPU）的计算抽象层 (Task0)。

Backend provider abstraction for domestic hardware.

🔒 sagellm-comm

通信层 | Communication Layer

分布式推理的通信硬件抽象层与拓扑管理 (Task1)。

Communication layer for distributed inference.

🔒 sagellm-kv-cache

KV 缓存 | KV Cache Management

KV 缓存池、前缀缓存与驱逐策略管理 (Task2)。

KV cache management with prefix caching and eviction.

🔒 sagellm-control-plane

控制面 | Control Plane

请求路由、调度器 IR 与生命周期管理。

Request routing, scheduling, and lifecycle management.

🔒 sagellm-gateway

API 网关 | API Gateway

OpenAI 兼容的 REST API 网关。

OpenAI-compatible REST API gateway.

🔒 sagellm-compression

模型压缩 | Model Compression

量化、稀疏化与投机解码加速技术 (Task3)。

Model compression and acceleration techniques.

🔒 sagellm-benchmark

E2E 验证 | E2E Validation

端到端演示运行器与年度验证套件。

E2E demo runner and Year 1/2/3 validation.

🔒 sagellm-docs

文档 | Documentation

内部任务书、规范与研究文档。

Internal task books, specifications, and research docs.

�️ 工具与基础设施 | Tools & Infrastructure

📦 sage-pypi-publisher

PyPI 发布工具 | PyPI Publisher Toolkit

Python monorepos 的字节码编译与 PyPI 发布工具。

Bytecode compiler and PyPI publisher toolkit for Python monorepos.

🌐 sage-edge

SAGE 网关聚合器 | SAGE Gateway Aggregator

轻量级 FastAPI 网关聚合器，为 SAGE 提供统一的 API 入口。

Lightweight FastAPI aggregator for SAGE Gateway.

🐙 sage-github-manager

GitHub 问题管理工具 | GitHub Issues Manager

SAGE 项目的 GitHub Issues 管理工具，具有 AI 增强功能。

A comprehensive GitHub Issues management tool for SAGE project with AI-powered features.

🎨 sage-studio

可视化工作流 | Visual Workflow

SAGE AI 流水线的可视化构建器与 LLM Playground。

Visual workflow builder and LLM playground for SAGE AI pipelines.

🔒 sage-team-info

团队信息 | Team Info

SAGE 项目人员分配和敏感信息。

Internal team allocation and sensitive information.

�🗄️ 历史仓库 | Historical Repositories

sage-db_outdated - SAGE 数据库的早期版本（已过时）| Early version of SAGE database (outdated)

🚀 其他研究项目 | Other Research Projects

流处理系统 | Stream Processing Systems

MorphStream ⭐ 141 - [ICDE'20, SIGMOD'23, TKDE'24] 可扩展的事务性流处理引擎 | Scalable transactional stream processing engine
AllianceDB ⭐ 16 - [SIGMOD'21] 并行数据库系统 | Parallel database system

基准测试与工具 | Benchmarks & Tools

Sesame ⭐ 26 - [SIGMOD'23] 数据流聚类实证研究 | Data stream clustering empirical study
PDSC - 并行数据流聚类基准 | Parallel data stream clustering benchmark

机器学习与AI | Machine Learning & AI

SentiStream ⭐ 7 - [EMENLP'23] 情感分析流处理 | Sentiment analysis stream processing
StreamLearning - 流式学习框架 | Stream learning framework

资源与文档 | Resources & Documentation

StreamProcessing_ReadingList ⭐ 69 - 流处理文献阅读列表 | Stream processing reading list
Awesome-Online-Continual-Learning - 在线持续学习资源 | Online continual learning resources

📖 快速开始 | Quick Start

安装 SAGE | Install SAGE

# 标准安装 | Standard installation (recommended)
pip install isage[standard]

# 核心安装 | Core installation only
pip install isage[core]

简单示例 | Simple Example

from sage.kernel.api.local_environment import LocalEnvironment
from sage.libs.io.source import FileSource
from sage.middleware.operators.rag import DenseRetriever, QAPromptor, OpenAIGenerator
from sage.libs.io.sink import TerminalSink

# 创建执行环境 | Create execution environment
env = LocalEnvironment("rag_pipeline")

# 构建声明式管道 | Build declarative pipeline
(
    env.from_source(FileSource, {"file_path": "questions.txt"})
    .map(DenseRetriever, {"model": "sentence-transformers/all-MiniLM-L6-v2"})
    .map(QAPromptor, {"template": "Answer based on: {context}\nQ: {query}\nA:"})
    .map(OpenAIGenerator, {"model": "gpt-3.5-turbo"})
    .sink(TerminalSink)
)

# 执行管道 | Execute pipeline
env.submit()

详细文档请访问：SAGE Documentation

For detailed documentation, visit: SAGE Documentation

🤝 参与贡献 | Contributing

我们欢迎各种形式的贡献！请查看各个仓库的 CONTRIBUTING.md 文件了解详情。

We welcome contributions of all kinds! Please check the CONTRIBUTING.md file in each repository for details.

📞 联系我们 | Contact Us

💬 WeChat/微信群: 加入微信群
💬 QQ群: IntelliStream课题组讨论QQ群
💬 Slack: Join our Slack
🌐 Website: intellistream.github.io

📄 许可证 | License

各项目许可证详见各仓库的 LICENSE 文件。大多数项目采用 MIT 或 Apache 2.0 许可证。

License details can be found in each repository's LICENSE file. Most projects use MIT or Apache 2.0 licenses.

⭐ 如果我们的项目对您有帮助，请给我们一个 Star！

If our projects help you, please give us a Star!

IntelliStream Research Group

🌟 SAGE 项目生态系统 | SAGE Project Ecosystem

📦 核心仓库 | Core Repositories

🎯 SAGE

📚 SAGE-Pub

🔧 数据库与系统组件 | Database & System Components

💾 sageVDB

🌊 sageFlow

⏱️ sageTSDB

📊 sageData

🤏 sageRefiner

🔒 sageFlownet

🤖 AI 与智能体组件 | AI & Agent Components

🧠 sageLLM

🕵️ sage-agentic

🧩 neuromem

🔒 sage-rag

🎯 sage-intent

🔧 sage-finetune

🔒 sage-safety

🔒 sage-privacy

📖 sage-examples

📊 评测与基准 | Evaluation & Benchmarks

📉 sage-benchmark

🤖 sage-benchmark-agent

🧪 CANDOR-Bench

🔒 sage-eval

🔢 LibAMM

🧮 算法库 | Algorithm Libraries

⚡ Concurrent-HNSW

🔍 sage-anns

✖️ sage-amms

🎯 sage-sias

🧠 sageLLM 模块架构 | sageLLM Modular Architecture

🔒 sagellm-protocol

🔒 sagellm-core

🔒 sagellm-backend

🔒 sagellm-comm

🔒 sagellm-kv-cache

🔒 sagellm-control-plane

🔒 sagellm-gateway

🔒 sagellm-compression

🔒 sagellm-benchmark

🔒 sagellm-docs

�️ 工具与基础设施 | Tools & Infrastructure

📦 sage-pypi-publisher

🌐 sage-edge

🐙 sage-github-manager

🎨 sage-studio

🔒 sage-team-info

�🗄️ 历史仓库 | Historical Repositories

🚀 其他研究项目 | Other Research Projects

流处理系统 | Stream Processing Systems

基准测试与工具 | Benchmarks & Tools

机器学习与AI | Machine Learning & AI

资源与文档 | Resources & Documentation

📖 快速开始 | Quick Start

安装 SAGE | Install SAGE

简单示例 | Simple Example

🤝 参与贡献 | Contributing

📞 联系我们 | Contact Us

📄 许可证 | License

Popular repositories Loading

Repositories

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

People

Top languages

Uh oh!

Most used topics

Uh oh!