DBMS extension for multimodal query processing and optimization.
Explore the docs »

Landing Page | Report Bug | Request Feature

Table of Contents

  1. About The Project
  2. Features
  3. Getting Started
  4. Usage
  5. Roadmap
  6. Feedback and Issues
  7. License
  8. Team

📜 About The Project

Flock is a DuckDB extension that integrates analytics with semantic analysis through declarative SQL queries. It lets users work with both structured and unstructured data, combining OLAP workflows with LLMs (large language models) and RAG (retrieval-augmented generation) pipelines.

To cite the project:

@article{10.14778/3750601.3750685,
  author  = {Dorbani, Anas and Yasser, Sunny and Lin, Jimmy and Mhedhbi, Amine},
  title   = {Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB},
  journal = {Proc. VLDB Endow.},
  year    = {2025},
  volume  = {18},
  number  = {12},
  doi     = {10.14778/3750601.3750685},
  url     = {https://doi.org/10.14778/3750601.3750685}
}

🔥 Features

  • Declarative SQL Interface: Perform text generation, classification, summarization, filtering, and embedding generation using SQL queries.
  • Multi-Provider Support: Easily integrate with OpenAI, Azure, and Ollama for your AI needs.
  • End-to-End RAG Pipelines: Enable retrieval and augmentation workflows for enhanced analytics.
  • Map and Reduce Functions: Intuitive APIs for combining semantic tasks and data analytics directly in DuckDB.

🚀 Getting Started

📝 Prerequisites

  1. DuckDB: Version 1.1.1 or later. Install it from the official DuckDB installation guide.
  2. Supported Providers: Ensure you have credentials or API keys for at least one of the supported providers:
    • OpenAI
    • Azure
    • Ollama
  3. Supported OS:
    • Linux
    • macOS
    • Windows
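
The version requirement in item 1 can be checked from a script. The following is a minimal sketch, assuming the DuckDB Python client is the entry point; `parse_version` and `meets_minimum` are hypothetical helper names, not part of DuckDB or Flock.

```python
import re

# Minimum DuckDB version taken from the prerequisites above.
MIN_DUCKDB_VERSION = (1, 1, 1)


def parse_version(text):
    """Extract (major, minor, patch) from strings such as 'v1.1.3' or '1.2.0'."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(text, minimum=MIN_DUCKDB_VERSION):
    """Return True when the parsed version is at least `minimum`."""
    return parse_version(text) >= minimum


# Typical use with the DuckDB Python client installed:
#   import duckdb
#   print(meets_minimum(duckdb.__version__))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "1.10.0" as older than "1.9.0".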

⚙️ Installation

Flock is a Community Extension available directly from DuckDB's community catalog.

  1. Install the extension:
    INSTALL flock FROM community;
  2. Load the extension:
    LOAD flock;
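
The same two steps can be run from a host program. This is a minimal sketch, assuming a connection object that exposes `execute(sql)`, as the DuckDB Python client's connection does; `load_flock` and `FLOCK_SETUP_STATEMENTS` are hypothetical names introduced for illustration.

```python
# Statements copied from the installation steps above.
FLOCK_SETUP_STATEMENTS = (
    "INSTALL flock FROM community;",
    "LOAD flock;",
)


def load_flock(con):
    """Run the install and load statements on `con`, then return it.

    `con` is any object with an `execute(sql)` method, such as the
    connection returned by `duckdb.connect()`.
    """
    for statement in FLOCK_SETUP_STATEMENTS:
        con.execute(statement)
    return con


# Typical use (needs the `duckdb` package and network access for INSTALL):
#   import duckdb
#   con = load_flock(duckdb.connect())
```

Passing the connection in, rather than creating it inside the helper, keeps the choice of in-memory or file-backed database with the caller.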

💻 Usage

🔧 Example Query

Using Flock, you can run semantic analysis tasks directly in DuckDB. For example:

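-- 'summarizer' and 'description-generation' refer to a model and a prompt by name;
-- this query assumes both were registered with Flock beforehand.
SELECT llm_complete(
           {'model_name': 'summarizer'},
           {'prompt_name': 'description-generation', 'context_columns': [{'data': product_name}]}
       ) AS product_description
FROM UNNEST(['Wireless Headphones', 'Gaming Laptop', 'Smart Watch']) AS t(product_name);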

Explore more usage examples in the documentation.

🛣️ Roadmap

Our roadmap outlines upcoming features and improvements. Stay updated by checking out our detailed plan.

🛠️ Feedback and Issues

We value your feedback! If you'd like to report an issue or suggest a new feature, please use the Report Bug or Request Feature links.

To contribute code or anything else, please refer to our Contribution Guidelines.

📝 License

This project is licensed under the MIT License. See the LICENSE file for details.

✨ Team

This project is under active development by the Data & AI Systems Laboratory (DAIS Lab) at Polytechnique Montréal.
