Implement Diagnostic Fault Library with DFM, SOVD interface, and CI infrastructure #1

Merged
bburda42dot merged 7 commits into main from pr4-fault-lib-impl
Feb 25, 2026

@bburda42dot bburda42dot commented Feb 10, 2026

Summary

Complete implementation of the Diagnostic Fault Library - a Rust library for managing diagnostic fault reporting, processing, and querying in Software-Defined Vehicles. Replaces the initial scaffold (src/lib.rs, api.rs, catalog.rs, etc.) with a production-grade multi-crate workspace aligned with the S-CORE module template.

What changed

Architecture - multi-crate workspace

  • Reorganized from a single flat crate into three workspace crates:
    • common - shared types: FaultId, FaultRecord, FaultCatalog, DebounceMode, IPC service types, compliance tags
    • fault_lib - reporter-side API: Reporter with debounce filtering, enabling-condition guards, IpcWorker with retry queue (exponential backoff), LogHook observability, FaultManagerSink
    • dfm_lib - Diagnostic Fault Manager: FaultRecordProcessor, AgingManager, SovdFaultManager with KVS-backed storage, EnablingConditionRegistry, OperationCycle provider abstraction
  • Added xtask crate for developer automation
  • Deleted original scaffold files (src/lib.rs, src/api.rs, src/model.rs, src/catalog.rs, src/config.rs, src/ids.rs, src/sink.rs, src/utils.rs)
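
The retry queue with exponential backoff listed for IpcWorker can be illustrated with a small delay calculation. This is a minimal sketch under stated assumptions, not the fault_lib implementation: `BackoffPolicy`, its fields, and the doubling factor are hypothetical names and choices made for illustration.

```rust
use std::time::Duration;

/// Hypothetical backoff policy; names and fields are illustrative,
/// not taken from fault_lib.
#[derive(Debug, Clone, Copy)]
pub struct BackoffPolicy {
    /// Delay before the first retry.
    pub base: Duration,
    /// Upper bound applied to every computed delay.
    pub max: Duration,
}

impl BackoffPolicy {
    /// Delay before retry number `attempt` (0-based):
    /// `base * 2^attempt`, capped at `max`.
    pub fn delay(&self, attempt: u32) -> Duration {
        // checked_pow and checked_mul guard against overflow for
        // large attempt counts; on overflow the cap is used instead.
        let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
        self.base
            .checked_mul(factor)
            .unwrap_or(self.max)
            .min(self.max)
    }
}
```

Capping the delay keeps a long outage from pushing retries arbitrarily far apart, which matters when the queue in front of the transport is bounded.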

Features

  • Reporter-side debounce filtering - CountWithinWindow, HoldTime, EdgeWithCooldown, CountThreshold modes
  • Enabling conditions - E2E flow: reporters register conditions, DFM evaluates before processing
  • IPC worker - iceoryx2-based transport with bounded channel, backpressure, and retry queue with exponential backoff
  • Fault aging & reset - policy-driven aging evaluation with operation cycle integration
  • SOVD fault API - typed status, counters, ISO 8601 timestamps, full FaultId variant support (Numeric/Text/Uuid)
  • Graceful shutdown - deadlock prevention via cooperative shutdown mechanism
  • Memory safety - replaced Box::leak with Cow<str>, bounded channels
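
To make the debounce modes more concrete, here is a hedged sketch of one of them, CountWithinWindow. The PR does not spell out the exact semantics, so this assumes the filter lets a report through once at least `min_count` raw occurrences fall inside a sliding time window. The struct layout and the `on_event` method are hypothetical and may differ from the real API.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Illustrative sliding-window debouncer (assumed semantics, not the
/// actual fault_lib type).
pub struct CountWithinWindow {
    min_count: usize,
    window: Duration,
    events: VecDeque<Instant>,
}

impl CountWithinWindow {
    pub fn new(min_count: usize, window: Duration) -> Self {
        Self { min_count, window, events: VecDeque::new() }
    }

    /// Records one raw occurrence at `now` and returns true when the
    /// number of occurrences inside the window reaches `min_count`.
    pub fn on_event(&mut self, now: Instant) -> bool {
        // Drop occurrences that have fallen out of the sliding window.
        while let Some(&oldest) = self.events.front() {
            if now.duration_since(oldest) > self.window {
                self.events.pop_front();
            } else {
                break;
            }
        }
        self.events.push_back(now);
        self.events.len() >= self.min_count
    }
}
```

Passing the timestamp in as an argument, rather than calling `Instant::now()` inside the filter, keeps the logic deterministic in unit tests.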

Safety & quality

  • #[deny(clippy::unwrap_used)] enforced in runtime code - all todo!(), expect(), and unwrap() replaced with proper error handling
  • Raw TODO comments replaced with documented error paths
  • Comprehensive test suite: unit tests inlined in source, integration tests (tests/integration/) covering lifecycle transitions, multi-catalog scenarios, persistent storage, and report-query flows
  • Miri-compatible for memory safety validation
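
As an illustration of what replacing unwrap() with proper error handling looks like under the clippy::unwrap_used lint, the sketch below parses a numeric identifier and returns a typed error instead of panicking. `FaultIdError` and `parse_numeric_fault_id` are hypothetical names invented for this example and are not part of the library.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error type for this example only.
#[derive(Debug)]
pub enum FaultIdError {
    Empty,
    NotNumeric(ParseIntError),
}

impl fmt::Display for FaultIdError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FaultIdError::Empty => write!(f, "fault id is empty"),
            FaultIdError::NotNumeric(e) => write!(f, "fault id is not numeric: {e}"),
        }
    }
}

impl std::error::Error for FaultIdError {}

/// With the lint denied, Clippy rejects `raw.parse().unwrap()` here,
/// so the failure is surfaced to the caller as a `Result`.
#[deny(clippy::unwrap_used)]
pub fn parse_numeric_fault_id(raw: &str) -> Result<u32, FaultIdError> {
    let trimmed = raw.trim();
    if trimmed.is_empty() {
        return Err(FaultIdError::Empty);
    }
    trimmed.parse::<u32>().map_err(FaultIdError::NotNumeric)
}
```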

CI/CD (6 new workflows)

  • build_test.yml - Cargo build + test
  • lint.yml - Clippy with deny warnings
  • format.yml - rustfmt check
  • coverage.yml - Code coverage reporting
  • miri.yml - Memory safety checks
  • copyright.yml - License header validation

All workflows aligned with S-CORE patterns.

Project structure alignment

  • .bazelrc, MODULE.bazel, BUILD files for Bazel 8 support
  • .vscode/settings.json and extensions.json for development environment
  • .ruff.toml, .yamlfmt, rustfmt.toml for formatting consistency
  • Updated README.md with architecture overview, getting started, and examples
  • Issue and PR templates added

@bburda42dot bburda42dot self-assigned this Feb 10, 2026
@bburda42dot bburda42dot force-pushed the pr4-fault-lib-impl branch 2 times, most recently from c4eb6d7 to dd8e06c February 16, 2026 13:45
@bburda42dot bburda42dot changed the title from "Pr4 fault lib impl" to "Implement Diagnostic Fault Library with DFM, SOVD interface, and CI infrastructure" Feb 17, 2026
@bburda42dot bburda42dot marked this pull request as ready for review February 17, 2026 12:50
@bburda42dot bburda42dot force-pushed the pr4-fault-lib-impl branch 2 times, most recently from 07b71e0 to 8ad4b82 February 17, 2026 13:22

@mfaferek93 mfaferek93 left a comment

A few findings, but overall looks good

@mfaferek93 mfaferek93 left a comment

A few minors/nits

@mfaferek93 mfaferek93 left a comment

A few more comments

bburda42dot added a commit that referenced this pull request Feb 25, 2026
- Extract SovdFaultState::record_occurrence() helper to deduplicate
  occurrence counter code in Failed/PreFailed arms (#9)
- Add hour/minute/second validation to parse_iso_timestamp (eclipse-opensovd#5)
- Add IpcDuration::validate() for IPC trust boundary checks (#1, eclipse-opensovd#2)
- Add Hash derive to IpcTimestamp, PartialEq to FaultRecord (#17)
- Fix catalog_and_reporter example to use real catalog JSON (#24)
- Fix README run commands with correct -p and --example flags (#25)
- Remove duplicate FaultDescriptor from common module doc (#16)
- Add permissions: contents: read to all CI workflows (#7)
- Pin cargo-audit install to taiki-e/install-action SHA (#15)
- Align MODULE.bazel version to 0.0.1 matching Cargo.toml (#21)
- Change query_conversion, query_server, fault_lib_communicator
  to pub(crate) in dfm_lib (#27)
- Remove dead build:loom xtask command (#20)
- Fix delete/clear doc comments in DfmQueryRequest (#23)
- Add enabling conditions design doc note (#26)
- Replace fixed 100ms sleep with retry loop in IPC test (#19)
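
The last item above, replacing a fixed sleep with a retry loop in a test, usually comes down to a polling helper like the following. This is a generic sketch, not the code from the IPC test. `wait_until` and its parameters are hypothetical names chosen for illustration.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Polls `cond` every `poll` interval until it returns true or
/// `timeout` elapses. Returns whether the condition was met.
pub fn wait_until(
    timeout: Duration,
    poll: Duration,
    mut cond: impl FnMut() -> bool,
) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if cond() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(poll);
    }
}
```

Compared with a single fixed sleep, the loop returns as soon as the condition holds and only waits the full timeout in the failure case, which tends to make timing-sensitive tests both faster and less flaky.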

Migrate from single-crate layout to multi-crate workspace with
Bazel 8.3 + Cargo dual build system. Add xtask runner for common
development commands.

IPC-safe types (IpcDuration, IpcTimestamp), fault descriptors,
catalog configuration, debounce/enabling condition config,
query protocol definitions, and iceoryx2 service types.

Fault reporter API, IPC worker with exponential backoff retry,
fault catalog validation, enabling condition management, and
FaultManagerSink for iceoryx2 transport.

SOVD-compliant fault manager with KVS persistent storage, aging
manager, operation cycle tracking, fault record processor, and
query server with iceoryx2 IPC transport.

E2E tests covering lifecycle transitions, debounce/aging/cycles,
persistent storage, concurrent access, boundary values, error
paths, multi-catalog, JSON catalog loading, IPC query/clear,
and report-and-query flow.

@mfaferek93 mfaferek93 left a comment

LGTM!

Workflows: build/test, clippy lint, rustfmt, miri, coverage,
copyright header check, cargo audit (pinned to SHA), Bazel
format check. All workflows set permissions: contents: read.
…rence

Architecture overview, fault catalog/reporter/DFM sequence
diagrams, library architecture drawing, Sphinx docs scaffold,
and HVAC component design reference example.
@bburda42dot bburda42dot merged commit 5152938 into main Feb 25, 2026
9 checks passed

2 participants