IntelligentDDS

Welcome to IntelligentDDS! We are an open-source research group. Our work focuses on leveraging AI, Machine Learning, and Data Mining to solve reliability, performance, and scalability challenges in modern distributed systems.

🔍 Project Index & Classification

1. Anomaly Detection (AD)

Frameworks that utilize deep learning and statistical methods to detect irregularities in metrics, logs, and traces.

  • SwissLog: A deep learning-based framework for log-based anomaly detection (ISSRE'20 / TDSC'22).
  • ShareAD: A "Pre-train and Align" framework for modern online system anomaly detection (TOSEM 2025).
  • Uni-AD: Unified approaches for multi-dimensional metric and topology-aware detection.
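
As a rough illustration of the statistical end of this spectrum, the sketch below flags points in a metric series that deviate sharply from a trailing window. It is a generic rolling z-score check, not the method used by SwissLog, ShareAD, or Uni-AD; the function name, window size, threshold, and sample data are all assumptions made for the example.

```python
from statistics import mean, stdev


def zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value deviates from the trailing-window mean
    by more than `threshold` sample standard deviations.

    Hypothetical helper for illustration only.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as anomalous.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up latency samples (ms) with one obvious spike at index 7.
latencies_ms = [100, 102, 99, 101, 100, 103, 98, 500, 101, 100]
print(zscore_anomalies(latencies_ms))
```

A trailing window keeps the check cheap and online, but a single spike inflates the window's variance for the next few points, which is one reason production detectors tend to use more robust statistics or learned models.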

2. Root Cause Analysis (RCA) & Incident Management

Tools for localizing failure origins and managing the lifecycle of system incidents.

  • MicroRank: End-to-end latency fault localization in microservices using extended spectrum analysis (WWW'21).
  • GEM: An evolution-aware framework treating subgraphs as first-class citizens for incident management (TSE 2025).
  • ProfRCA: Fine-grained RCA combining continuous profiling data with Large Language Models (LLMs) (SANER 2026).
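
To give a feel for spectrum-based localization, the sketch below ranks services with the classic Ochiai formula over a set of traces labeled normal or anomalous. This is the textbook technique only; MicroRank's extended spectrum analysis is described in its paper and is not reproduced here. The trace representation, the function name, and the service names are assumptions for the example.

```python
import math
from collections import Counter


def ochiai_scores(traces):
    """Rank services by Ochiai suspiciousness.

    `traces` is a list of (services, is_anomalous) pairs, where `services`
    is the set of service names a trace passed through (hypothetical format).
    """
    failed_cover = Counter()  # anomalous traces covering each service
    passed_cover = Counter()  # normal traces covering each service
    total_failed = 0
    for services, is_anomalous in traces:
        if is_anomalous:
            total_failed += 1
            failed_cover.update(services)
        else:
            passed_cover.update(services)

    scores = {}
    for svc in set(failed_cover) | set(passed_cover):
        ef, ep = failed_cover[svc], passed_cover[svc]
        # Ochiai: ef / sqrt(total_failed * (ef + ep))
        denom = math.sqrt(total_failed * (ef + ep))
        scores[svc] = ef / denom if denom else 0.0
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up traces: both anomalous traces pass through "payment".
traces = [
    ({"gateway", "order", "payment"}, True),
    ({"gateway", "order", "payment"}, True),
    ({"gateway", "order"}, False),
    ({"gateway", "search"}, False),
]
for service, score in ochiai_scores(traces):
    print(f"{service}: {score:.3f}")
```

Services that appear mostly in anomalous traces and rarely in normal ones rise to the top, which is the intuition spectrum analysis borrows from software fault localization.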

3. Observability & Data Optimization

Optimizing the collection, storage, and processing of high-volume telemetry data.

  • LogReducer: High-efficiency log compression and redundancy elimination.
  • LogShrink: Specialized tools for reducing log volume while preserving diagnostic utility.

4. Advanced Log Intelligence (LogGen, LogFun, LogBoost)

This specialized suite focuses on the lifecycle of system logs—from generation and parsing to quality enhancement for downstream AIOps tasks.

  • LogGen: An automated log generation framework designed to synthesize realistic logs for testing and training without compromising sensitive data (JSS 2025).
  • LogFun: A logic-level log parsing approach that focuses on the functional structure of log messages to improve template extraction accuracy.
  • LogBoost: A framework designed to "boost" the quality of raw logs, making them more suitable for automated anomaly detection and root cause analysis (TSC 2025).
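
For readers new to log parsing, the sketch below shows the basic idea of template extraction: variable tokens are masked so that messages with the same structure collapse to one template. It is a simple regex heuristic, not LogFun's logic-level approach; the patterns, the "<*>" placeholder, the function names, and the sample messages are assumptions for the example.

```python
import re
from collections import defaultdict

# Patterns for tokens treated as variables (illustrative, far from exhaustive).
# Order matters: mask IP addresses before bare integers.
_VARIABLE_PATTERNS = [
    re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"),  # IPv4, optional port
    re.compile(r"\b0x[0-9a-fA-F]+\b"),                    # hex identifiers
    re.compile(r"\b\d+\b"),                               # plain integers
]


def to_template(message):
    """Replace variable-looking tokens with the "<*>" wildcard."""
    for pattern in _VARIABLE_PATTERNS:
        message = pattern.sub("<*>", message)
    return message


def group_by_template(messages):
    """Group raw messages under their extracted template."""
    groups = defaultdict(list)
    for msg in messages:
        groups[to_template(msg)].append(msg)
    return dict(groups)


# Made-up log messages.
logs = [
    "Connection from 10.0.0.5:8080 closed after 35 ms",
    "Connection from 192.168.1.7:443 closed after 120 ms",
    "Block 0x1f3a allocated",
]
for template, members in group_by_template(logs).items():
    print(len(members), template)
```

Regex masking breaks down when variables are not numeric or when constants look like variables, which is the kind of gap that structure-aware parsers aim to close.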

5. Benchmarks & Resources

  • Augmented-TrainTicket: An enhanced version of the TrainTicket microservice benchmark with advanced fault injection.
  • Awesome-Papers: A curated list of papers from top-tier venues (ICSE, FSE, ASE, ISSTA, etc.) on cloud computing and AIOps.

🛠 Tech Stack

  • Languages: Python, Go, Java, C#
  • Infrastructure: Kubernetes, Prometheus, Istio, gRPC
  • AI Techniques: Large Language Models (LLMs), Graph Neural Networks (GNNs), Transformers, Reinforcement Learning (RL)

Pinned

  1. MicroRank (Public)

    MicroRank: End-to-End Latency Issue Localization with Extended Spectrum Analysis in Microservice Environments

    Python · 39 stars · 8 forks

  2. SwissLog (Public)

    The implementation of SwissLog in ISSRE'20 and TDSC'22

    Python · 58 stars · 6 forks

  3. LogReducer (Public)

    Python · 15 stars · 2 forks

  4. ShareAD (Public)

    On the Practicability of Deep Learning based Anomaly Detection for Modern Online Software Systems: A Pre-Train-and-Align Framework (TOSEM 2025)

    Python · 4 stars

  5. GEM (Public)

    Official code for "Subgraphs as First-Class Citizens in Incident Management for Large-Scale Online Systems: An Evolution-Aware Framework" (TSE 2025)

    Python · 3 stars

  6. awesome-papers (Public)

    A collection of papers on cloud computing, covering resource management, serverless, microservices, observability, and related topics.

    127 stars · 15 forks
