From 3d9151340091b6bc5d72effa80584b1f73f7350c Mon Sep 17 00:00:00 2001
From: Ankush Malaker <43288948+AnkushMalaker@users.noreply.github.com>
Date: Thu, 19 Feb 2026 22:44:30 +0530
Subject: [PATCH 1/2] feat: add one line install helper (#305)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* audio upload extension with gdrive credentials

* FIX: API parameters

* UPDATE: tmp files cleanup and code refactored as per review

* REFACTOR: minor refactor as per review

* REFACTOR: minor update as per review

* UPDATE: gdrive sync logic

* REFACTOR: code update as per gdrive and update credential client

* REFACTOR: validation updated as per review from CR

* UPDATE: code refactored to use UUIDs for different audio upload sources

* REFACTOR: updated code as per review

* Update documentation and configuration to reflect the transition from 'friend-backend' to 'chronicle-backend' across various files, including setup instructions, Docker configurations, and service logs.

* Update test script to use docker-compose-test.yml for all test-related operations

* Added standard MIT license

* Fix/cleanup model (#219)

* refactor memory

* add config

* docstring

* more cleanup

* code quality

* code quality

* unused return

* DOTTED GET

* Refactor Docker and CI configurations

- Removed the creation of `memory_config.yaml` from the CI workflow to streamline the process.
- Updated Docker Compose files to mount `config.yml` for model registry and memory settings in both services.
- Added new dependencies for Google API clients in `uv.lock` to support upcoming features.

* Update configuration files for model providers and Docker setup

- Changed LLM, embedding, and STT providers in `config.yml` to OpenAI and Deepgram.
- Removed read-only flag from `config.yml` in Docker Compose files to allow UI configuration saving.
- Updated memory configuration endpoint to accept plain text for YAML input.
* Update transcription job handling to format speaker IDs

- Changed variable name from `speaker_name` to `speaker_id` for clarity.
- Added logic to convert integer speaker IDs from Deepgram to string format for consistent speaker labeling.

* Remove loading of backend .env file in test environment setup

- Eliminated the code that loads the .env file from the backends/advanced directory, simplifying the environment configuration for tests.

* Enhance configuration management and setup wizard

- Updated README to reflect the new setup wizard process.
- Added functionality to load and save `config.yml` in the setup wizard, including default configurations for LLM and memory providers.
- Improved user feedback during configuration updates, including success messages for configuration file updates.
- Enabled backup of existing `config.yml` before saving changes.

* Enhance HTTPS configuration in setup wizard

- Added functionality to check for existing SERVER_IP in the environment file and prompt the user to reuse or enter a new IP for SSL certificates.
- Improved user prompts for server IP/domain input during HTTPS setup.
- Updated default behavior to use existing IP or localhost based on user input.
- Changed RECORD_ONLY_ENROLLED_SPEAKERS setting in the .env template to false for broader access.

* Add source parameter to audio file writing in websocket controller

- Included a new `source` parameter with the value "websocket" in the `_process_batch_audio_complete` function to enhance audio file context tracking.

---------

Co-authored-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>

* fix/broken-tests (#230)

* refactor memory

* add config

* docstring

* more cleanup

* code quality

* code quality

* unused return

* DOTTED GET

* Refactor Docker and CI configurations

- Removed the creation of `memory_config.yaml` from the CI workflow to streamline the process.
- Updated Docker Compose files to mount `config.yml` for model registry and memory settings in both services.
- Added new dependencies for Google API clients in `uv.lock` to support upcoming features.

* Update configuration files for model providers and Docker setup

- Changed LLM, embedding, and STT providers in `config.yml` to OpenAI and Deepgram.
- Removed read-only flag from `config.yml` in Docker Compose files to allow UI configuration saving.
- Updated memory configuration endpoint to accept plain text for YAML input.

* Update transcription job handling to format speaker IDs

- Changed variable name from `speaker_name` to `speaker_id` for clarity.
- Added logic to convert integer speaker IDs from Deepgram to string format for consistent speaker labeling.

* Remove loading of backend .env file in test environment setup

- Eliminated the code that loads the .env file from the backends/advanced directory, simplifying the environment configuration for tests.

* Enhance configuration management and setup wizard

- Updated README to reflect the new setup wizard process.
- Added functionality to load and save `config.yml` in the setup wizard, including default configurations for LLM and memory providers.
- Improved user feedback during configuration updates, including success messages for configuration file updates.
- Enabled backup of existing `config.yml` before saving changes.

* Enhance HTTPS configuration in setup wizard

- Added functionality to check for existing SERVER_IP in the environment file and prompt the user to reuse or enter a new IP for SSL certificates.
- Improved user prompts for server IP/domain input during HTTPS setup.
- Updated default behavior to use existing IP or localhost based on user input.
- Changed RECORD_ONLY_ENROLLED_SPEAKERS setting in the .env template to false for broader access.

* Add source parameter to audio file writing in websocket controller

- Included a new `source` parameter with the value "websocket" in the `_process_batch_audio_complete` function to enhance audio file context tracking.
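The integer-to-string speaker ID conversion mentioned in the transcription job commit above amounts to normalizing Deepgram's integer diarization indices into string labels. A minimal sketch; `format_speaker_id` and the `speaker_N` label style are illustrative assumptions, not the project's actual code:

```python
def format_speaker_id(raw_speaker):
    """Normalize a Deepgram speaker identifier to a string label.

    Deepgram diarization returns integer speaker indices (0, 1, ...);
    converting them to strings keeps labels consistent with audio
    sources that already supply string speaker IDs.
    """
    if isinstance(raw_speaker, int):
        return f"speaker_{raw_speaker}"
    return str(raw_speaker)
```

Downstream code can then treat every speaker label uniformly as a string, regardless of the upload source.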
* Refactor error handling in system controller and update memory config routes

- Replaced ValueError with HTTPException for better error handling in `save_diarization_settings` and `validate_memory_config` functions.
- Introduced a new Pydantic model, `MemoryConfigRequest`, for validating memory configuration requests in the system routes.
- Updated the `validate_memory_config` endpoint to accept the new request model, improving input handling and validation.

---------

Co-authored-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>

* Feat/add obsidian 3 (#233)

* obsidian support

* neo4j comment

* cleanup code

* unused line

* unused line

* Fix MemoryEntry object usage in chat service

* comment

* feat(obsidian): add obsidian memory search integration to chat

* unit test

* use rq

* neo4j service

* typefix

* test fix

* cleanup

* cleanup

* version changes

* profile

* remove unused imports

* Refactor memory configuration validation endpoints

- Removed the deprecated `validate_memory_config_raw` endpoint and replaced it with a new endpoint that accepts plain text for validation.
- Updated the existing `validate_memory_config` endpoint to clarify that it now accepts JSON input.
- Adjusted the API call in the frontend to point to the new validation endpoint.

* Refactor health check model configuration loading

- Updated the health check function to load model configuration from the models registry instead of the root config.
- Improved error handling by logging warnings when model configuration loading fails.

---------

Co-authored-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>

* Update .gitignore to exclude all files in app/ios and app/android directories (#238)

* fix: Copy full source code in speaker-recognition Dockerfile (#243)

Adds COPY src/ src/ step after dependency installation to ensure all source files are available in the Docker image. This improves build caching while ensuring complete source code is present.
* Enhance configuration management and add new setup scripts (#235)

* Enhance configuration management and add new setup scripts

- Updated .gitignore to include config.yml and its template.
- Added config.yml.template for default configuration settings.
- Introduced restart.sh script for service management.
- Enhanced services.py to load config.yml and check for Obsidian/Neo4j integration.
- Updated wizard.py to prompt for Obsidian/Neo4j configuration during setup and create config.yml from template if it doesn't exist.

* Refactor transcription providers and enhance configuration management

- Updated Docker Compose files to include the new Neo4j service configuration.
- Added support for Obsidian/Neo4j integration in the setup process.
- Refactored transcription providers to utilize a registry-driven approach for Deepgram and Parakeet.
- Enhanced error handling and logging in transcription processes.
- Improved environment variable management in test scripts to prioritize command-line overrides.
- Removed deprecated Parakeet provider implementation and streamlined audio stream workers.

* Update configuration management and enhance file structure, add test-matrix (#237)

* Update configuration management and enhance file structure

- Refactored configuration file paths to use a dedicated `config/` directory, including updates to `config.yml` and its template.
- Modified service scripts to load the new configuration path for `config.yml`.
- Enhanced `.gitignore` to include the new configuration files and templates.
- Updated documentation to reflect changes in configuration file locations and usage.
- Improved setup scripts to ensure proper creation and management of configuration files.
- Added new test configurations for various provider combinations to streamline testing processes.

* Add test requirements and clean up imports in wizard.py

- Introduced a new `test-requirements.txt` file to manage testing dependencies.
- Removed redundant import of `shutil` in `wizard.py` to improve code clarity.

* Add ConfigManager for unified configuration management

- Introduced a new `config_manager.py` module to handle reading and writing configurations from `config.yml` and `.env` files, ensuring backward compatibility.
- Refactored `ChronicleSetup` in `backends/advanced/init.py` to utilize `ConfigManager` for loading and updating configurations, simplifying the setup process.
- Removed redundant methods for loading and saving `config.yml` directly in `ChronicleSetup`, as these are now managed by `ConfigManager`.
- Enhanced user feedback during configuration updates, including success messages for changes made to configuration files.

* Refactor transcription provider configuration and enhance setup process

- Updated `.env.template` to clarify speech-to-text configuration and removed deprecated options for Mistral.
- Modified `docker-compose.yml` to streamline environment variable management by removing unused Mistral keys.
- Enhanced `ChronicleSetup` in `init.py` to provide clearer user feedback and updated the transcription provider selection process to rely on `config.yml`.
- Improved error handling in the websocket controller to determine the transcription provider from the model registry instead of environment variables.
- Updated health check routes to reflect the new method of retrieving the transcription provider from `config.yml`.
- Adjusted `config.yml.template` to include comments on transcription provider options for better user guidance.

* Enhance ConfigManager with deep merge functionality

- Updated the `update_memory_config` method to perform a deep merge of updates into the memory configuration, ensuring nested dictionaries are merged correctly.
- Added a new `_deep_merge` method to handle recursive merging of dictionaries, improving configuration management capabilities.
* Refactor run-test.sh and enhance memory extraction tests

- Removed deprecated environment variable handling for TRANSCRIPTION_PROVIDER in `run-test.sh`, streamlining the configuration process.
- Introduced a new `run-custom.sh` script for executing Robot tests with custom configurations, improving test flexibility.
- Enhanced memory extraction tests in `audio_keywords.robot` and `memory_keywords.robot` to include detailed assertions and result handling.
- Updated `queue_keywords.robot` to fail fast if a job is in a 'failed' state when expecting 'completed', improving error handling.
- Refactored `test_env.py` to load environment variables with correct precedence, ensuring better configuration management.

* unify tests to robot test, add some more clean up

* Update health check configuration in docker-compose-test.yml (#241)

- Increased the number of retries from 5 to 10 for improved resilience during service readiness checks.
- Extended the start period from 30s to 60s to allow more time for services to initialize before health checks commence.

* Add step to create test configuration file in robot-tests.yml

- Introduced a new step in the GitHub Actions workflow to copy the test configuration file from tests/configs/deepgram-openai.yml to a new config/config.yml.
- Added logging to confirm the creation of the test config file, improving visibility during the test setup process.

* remove cache step since not required

* coderabbit comments

* Refactor ConfigManager error handling for configuration file loading

- Updated the ConfigManager to raise RuntimeError exceptions when the configuration file is not found or is invalid, improving error visibility and user guidance.
- Removed fallback behavior that previously returned the current directory, ensuring users are explicitly informed about missing or invalid configuration files.
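The fail-loud loading behavior in the ConfigManager commit above (raise RuntimeError instead of silently falling back) looks roughly like this. A hypothetical sketch: the function name, message wording, and the `./wizard.sh` hint are illustrative, not the shipped ConfigManager code:

```python
from pathlib import Path


def load_config_text(path: str) -> str:
    """Load the raw configuration file, failing loudly on problems.

    Instead of falling back to a default (the removed behavior), raise
    RuntimeError so the user is told exactly what is missing or invalid.
    """
    config_path = Path(path)
    if not config_path.is_file():
        raise RuntimeError(
            f"Configuration file not found: {config_path}. "
            "Run the setup wizard (./wizard.sh) to create it."
        )
    text = config_path.read_text()
    if not text.strip():
        raise RuntimeError(f"Configuration file is empty or invalid: {config_path}")
    return text
```

The design choice here is that a missing config is a setup error the user must fix, so surfacing it immediately beats continuing with a guessed default.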
* Refactor _find_repo_root method in ConfigManager

- Updated the _find_repo_root method to locate the repository root using the __file__ location instead of searching for config/config.yml, simplifying the logic and improving reliability.
- Removed the previous error handling that raised a RuntimeError if the configuration file was not found, as the new approach assumes config_manager.py is always at the repo root.

* Enhance speaker recognition service integration and error handling (#245)

* Enhance speaker recognition service integration and error handling

- Updated `docker-compose-test.yml` to enable speaker recognition in the test environment and added a new `speaker-service-test` service for testing purposes.
- Refactored `run-test.sh` to improve the execution of Robot Framework tests from the repository root.
- Enhanced error handling in `speaker_recognition_client.py` to return detailed error messages for connection issues.
- Improved error logging in `speaker_jobs.py` to handle and report errors from the speaker recognition service more effectively.
- Updated `Dockerfile` to copy the full source code after dependencies are cached, ensuring all necessary files are included in the image.

* Remove integration tests workflow and enhance robot tests with HF_TOKEN verification

- Deleted the `integration-tests.yml` workflow file to streamline CI processes.
- Updated `robot-tests.yml` to include verification for the new `HF_TOKEN` secret, ensuring all required secrets are checked before running tests.

* Fix key access in system admin tests to use string indexing for speakers data

* Refactor Robot Framework tests and enhance error handling in memory services

- Removed the creation of the test environment file from the GitHub Actions workflow to streamline setup.
- Updated the Robot Framework tests to utilize a unified test script for improved consistency.
- Enhanced error messages in the MemoryService class to provide more context on connection failures for LLM and vector store providers.
- Added critical checks for API key presence in the OpenAIProvider class to ensure valid credentials are provided before proceeding.
- Adjusted various test setup scripts to use a centralized BACKEND_DIR variable for better maintainability and clarity.

* Refactor test container cleanup in run-robot-tests.sh

- Updated the script to dynamically construct container names from docker-compose services, improving maintainability and reducing hardcoded values.
- Enhanced the cleanup process for stuck test containers by utilizing the COMPOSE_PROJECT_NAME variable.

* Enhance run-robot-tests.sh for improved logging and cleanup

- Set absolute paths for consistent directory references to simplify navigation.
- Capture container logs, status, and resource usage for better debugging.
- Refactor cleanup process to utilize dynamic backend directory references, improving maintainability.
- Ensure proper navigation back to the tests directory after operations.

* Add speaker recognition configuration and update test script defaults

- Introduced speaker recognition settings in config.yml.template, allowing for easy enable/disable and service URL configuration.
- Updated run-robot-tests.sh to use a test-specific configuration file that disables speaker recognition for improved CI performance.
- Modified deepgram-openai.yml to disable speaker recognition during CI tests to enhance execution speed.

* Refactor speaker recognition configuration management

- Updated docker-compose-test.yml to clarify speaker recognition settings, now controlled via config.yml for improved CI performance.
- Enhanced model_registry.py to include a dedicated speaker_recognition field for better configuration handling.
- Modified speaker_recognition_client.py to load configuration from config.yml, allowing for dynamic enabling/disabling of the speaker recognition service based on the configuration.

* Add minimum worker count verification to infrastructure tests

- Introduced a new keyword to verify that the minimum number of workers are registered, enhancing the robustness of health checks.
- Updated the worker count validation test to include a wait mechanism for worker registration, improving test reliability.
- Clarified comments regarding expected worker counts to reflect the distinction between RQ and audio stream workers.

* Update configuration management and enhance model handling

- Added OBSIDIAN_ENABLED configuration to ChronicleSetup for improved feature toggling.
- Introduced speaker_recognition configuration handling in model_registry.py to streamline model loading.
- Refactored imports in deepgram.py to improve clarity and reduce redundancy.

* Refactor configuration management in wizard and ChronicleSetup (#246)

* Refactor configuration management in wizard and ChronicleSetup

- Updated wizard.py to read Obsidian/Neo4j configuration from config.yml, enhancing flexibility and error handling.
- Refactored ChronicleSetup to utilize ConfigManager for loading and verifying config.yml, ensuring a single source of truth.
- Improved user feedback for missing configuration files and streamlined the setup process for memory and transcription providers.

* Fix string formatting for error message in ChronicleSetup

* added JWT issuers for audience auth for service interop and shared us… (#250)

* added JWT issuers for audience auth for service interop and shared user accounts

* amended default value in line with code

* Feat/edit chat system prompt (#247)

* Refactor configuration management in wizard and ChronicleSetup

- Updated wizard.py to read Obsidian/Neo4j configuration from config.yml, enhancing flexibility and error handling.
- Refactored ChronicleSetup to utilize ConfigManager for loading and verifying config.yml, ensuring a single source of truth.
- Improved user feedback for missing configuration files and streamlined the setup process for memory and transcription providers.

* Fix string formatting for error message in ChronicleSetup

* Enhance chat configuration management and UI integration

- Updated `services.py` to allow service restart with an option to recreate containers, addressing WSL2 bind mount issues.
- Added new chat configuration management functions in `system_controller.py` for loading, saving, and validating chat prompts.
- Introduced `ChatSettings` component in the web UI for admin users to manage chat configurations easily.
- Updated API service methods in `api.ts` to support chat configuration endpoints.
- Integrated chat settings into the system management page for better accessibility.

* Refactor backend shutdown process and enhance chat service configuration logging

- Updated `start.sh` to improve shutdown handling by explicitly killing the backend process if running.
- Modified `chat_service.py` to enhance logging for loading chat system prompts, providing clearer feedback on configuration usage.
- Added a new `chat` field in `model_registry.py` for better chat service configuration management.
- Updated vector store query parameters in `vector_stores.py` for improved clarity and functionality.
- Enhanced the chat component in the web UI to conditionally auto-scroll based on message sending status.

* Return JSONResponse instead of raw result

* Refactor headers creation in system admin tests

* Make config.yml writable for admin updates

* Docs consolidation (#257)

* Enhance setup documentation and convenience scripts

- Updated the interactive setup wizard instructions to recommend using the convenience script `./wizard.sh` for easier configuration.
- Added detailed instructions for uploading and processing existing audio files via the API, including example commands for single and multiple file uploads.
- Introduced a new section on HAVPE relay configuration for ESP32 audio streaming, providing environment variable setup and command examples.
- Clarified the distributed deployment setup, including GPU and backend separation instructions, and added benefits of using Tailscale for networking.
- Removed outdated `getting-started.md` and `SETUP_SCRIPTS.md` files to streamline documentation and avoid redundancy.

* Update setup instructions and enhance service management scripts

- Replaced direct command instructions with convenience scripts (`./wizard.sh` and `./start.sh`) for easier setup and service management.
- Added detailed usage of convenience scripts for checking service status, restarting, and stopping services.
- Clarified the distinction between convenience scripts and direct command usage for improved user guidance.

* Update speaker recognition models and documentation

- Changed the speaker diarization model from `pyannote/speaker-diarization-3.1` to `pyannote/speaker-diarization-community-1` across multiple files for consistency.
- Updated README files to reflect the new model and its usage instructions, ensuring users have the correct links and information for setup.
- Enhanced clarity in configuration settings related to speaker recognition.

* Docs consolidation (#258)

* Enhance setup documentation and convenience scripts

- Updated the interactive setup wizard instructions to recommend using the convenience script `./wizard.sh` for easier configuration.
- Added detailed instructions for uploading and processing existing audio files via the API, including example commands for single and multiple file uploads.
- Introduced a new section on HAVPE relay configuration for ESP32 audio streaming, providing environment variable setup and command examples.
- Clarified the distributed deployment setup, including GPU and backend separation instructions, and added benefits of using Tailscale for networking.
- Removed outdated `getting-started.md` and `SETUP_SCRIPTS.md` files to streamline documentation and avoid redundancy.

* Update setup instructions and enhance service management scripts

- Replaced direct command instructions with convenience scripts (`./wizard.sh` and `./start.sh`) for easier setup and service management.
- Added detailed usage of convenience scripts for checking service status, restarting, and stopping services.
- Clarified the distinction between convenience scripts and direct command usage for improved user guidance.

* Update speaker recognition models and documentation

- Changed the speaker diarization model from `pyannote/speaker-diarization-3.1` to `pyannote/speaker-diarization-community-1` across multiple files for consistency.
- Updated README files to reflect the new model and its usage instructions, ensuring users have the correct links and information for setup.
- Enhanced clarity in configuration settings related to speaker recognition.

* Enhance transcription provider selection and update HTTPS documentation

- Added a new function in `wizard.py` to prompt users for their preferred transcription provider, allowing options for Deepgram, Parakeet ASR, or none.
- Updated the service setup logic to automatically include ASR services if Parakeet is selected.
- Introduced a new documentation file on SSL certificates and HTTPS setup, detailing the importance of HTTPS for secure connections and microphone access.
- Removed outdated HTTPS setup documentation from `backends/advanced/Docs/HTTPS_SETUP.md` to streamline resources.

* Remove HTTPS setup scripts and related configurations

- Deleted `init-https.sh`, `setup-https.sh`, and `nginx.conf.template` as part of the transition to a new HTTPS setup process.
- Updated `README.md` to reflect the new automatic HTTPS configuration via the setup wizard.
- Adjusted `init.py` to remove references to the deleted HTTPS scripts and ensure proper handling of Caddyfile generation for SSL.
- Streamlined documentation to clarify the new approach for HTTPS setup and configuration management.

* Update quickstart.md (#268)

* v0.2 (#279)

* Refactor configuration management in wizard and ChronicleSetup

- Updated wizard.py to read Obsidian/Neo4j configuration from config.yml, enhancing flexibility and error handling.
- Refactored ChronicleSetup to utilize ConfigManager for loading and verifying config.yml, ensuring a single source of truth.
- Improved user feedback for missing configuration files and streamlined the setup process for memory and transcription providers.

* Fix string formatting for error message in ChronicleSetup

* Enhance chat configuration management and UI integration

- Updated `services.py` to allow service restart with an option to recreate containers, addressing WSL2 bind mount issues.
- Added new chat configuration management functions in `system_controller.py` for loading, saving, and validating chat prompts.
- Introduced `ChatSettings` component in the web UI for admin users to manage chat configurations easily.
- Updated API service methods in `api.ts` to support chat configuration endpoints.
- Integrated chat settings into the system management page for better accessibility.

* Refactor backend shutdown process and enhance chat service configuration logging

- Updated `start.sh` to improve shutdown handling by explicitly killing the backend process if running.
- Modified `chat_service.py` to enhance logging for loading chat system prompts, providing clearer feedback on configuration usage.
- Added a new `chat` field in `model_registry.py` for better chat service configuration management.
- Updated vector store query parameters in `vector_stores.py` for improved clarity and functionality.
- Enhanced the chat component in the web UI to conditionally auto-scroll based on message sending status.
* Implement plugin system for enhanced functionality and configuration management

- Introduced a new plugin architecture to allow for extensibility in the Chronicle application.
- Added Home Assistant plugin for controlling devices via natural language commands triggered by wake words.
- Implemented plugin configuration management endpoints in the API for loading, saving, and validating plugin settings.
- Enhanced the web UI with a dedicated Plugins page for managing plugin configurations.
- Updated Docker Compose files to include Tailscale integration for remote service access.
- Refactored existing services to support plugin interactions during conversation and memory processing.
- Improved error handling and logging for plugin initialization and execution processes.

* Enhance configuration management and plugin system integration

- Updated .gitignore to include plugins.yml for security reasons.
- Modified start.sh to allow passing additional arguments during service startup.
- Refactored wizard.py to support new HF_TOKEN configuration prompts and improved handling of wake words in plugin settings.
- Introduced a new setup_hf_token_if_needed function to streamline Hugging Face token management.
- Enhanced the GitHub Actions workflow to create plugins.yml from a template, ensuring proper configuration setup.
- Added detailed comments and documentation in the plugins.yml.template for better user guidance on Home Assistant integration.

* Implement Redis integration for client-user mapping and enhance wake word processing

- Added asynchronous Redis support in ClientManager for tracking client-user relationships.
- Introduced `initialize_redis_for_client_manager` to set up Redis for cross-container mapping.
- Updated `create_client_state` to use asynchronous tracking for client-user relationships.
- Enhanced wake word processing in PluginRouter with normalization and command extraction.
- Refactored DeepgramStreamingConsumer to utilize async Redis lookups for user ID retrieval.
- Set TTL on Redis streams during client state cleanup for better resource management.

* Refactor Deepgram worker management and enhance text normalization

- Disabled the batch Deepgram worker in favor of the streaming worker to prevent race conditions.
- Updated text normalization in wake word processing to replace punctuation with spaces, preserving word boundaries.
- Enhanced regex pattern for wake word matching to allow optional punctuation and whitespace after the last part.
- Improved logging in DeepgramStreamingConsumer for better visibility of message processing and error handling.

* Add original prompt retrieval and restoration in chat configuration test

- Implemented retrieval of the original chat prompt before saving a custom prompt to ensure test isolation.
- Added restoration of the original prompt after the test to prevent interference with subsequent tests.
- Enhanced the test documentation for clarity on the purpose of these changes.

* Refactor test execution and enhance documentation for integration tests

- Simplified test execution commands in CLAUDE.md and quickstart.md for better usability.
- Added instructions for running tests from the project root and clarified the process for executing the complete Robot Framework test suite.
- Introduced a new Docker service for the Deepgram streaming worker in docker-compose-test.yml to improve testing capabilities.
- Updated system_admin_tests.robot to use a defined default prompt for restoration, enhancing test reliability and clarity.

* Enhance test environment cleanup and improve Deepgram worker management

- Updated `run-test.sh` and `run-robot-tests.sh` to improve cleanup processes, including handling permission issues with Docker.
- Introduced a new function `mark_session_complete` in `session_controller.py` to ensure atomic updates for session completion status.
- Refactored WebSocket and conversation job handling to utilize the new session completion function, enhancing reliability.
- Updated `start-workers.sh` to enable the batch Deepgram worker alongside the streaming worker for improved transcription capabilities.
- Enhanced test scripts to verify the status of Deepgram workers and ensure proper cleanup of test containers.

* Refactor worker management and introduce orchestrator for improved process handling

- Replaced the bash-based `start-workers.sh` script with a Python-based worker orchestrator for better process management and health monitoring.
- Updated `docker-compose.yml` to configure the new orchestrator and adjust worker definitions, including the addition of audio persistence and stream workers.
- Enhanced the Dockerfile to remove the old startup script and ensure the orchestrator is executable.
- Introduced new modules for orchestrator configuration, health monitoring, process management, and worker registry to streamline worker lifecycle management.
- Improved environment variable handling for worker configuration and health checks.

* oops

* oops2

* Remove legacy test runner script and update worker orchestration

- Deleted the `run-test.sh` script, which was used for local test execution.
- Updated Docker configurations to replace the `start-workers.sh` script with `worker_orchestrator.py` for improved worker management.
- Enhanced health monitoring and process management in the orchestrator to ensure better reliability and logging.
- Adjusted deployment configurations to reflect the new orchestrator setup.

* Add bulk restart mechanism for RQ worker registration loss

- Introduced a new method `_handle_registration_loss` to manage RQ worker registration loss, replicating the behavior of the previous bash script.
- Implemented a cooldown period to prevent frequent restarts during network issues.
- Added logging for bulk restart actions and their outcomes to enhance monitoring and debugging capabilities.
- Created a `_restart_all_rq_workers` method to facilitate the bulk restart of RQ workers, ensuring they re-register with Redis upon startup. * Enhance plugin architecture with event-driven system and test integration - Introduced a new Test Event Plugin to log all plugin events to an SQLite database for integration testing. - Updated the plugin system to utilize event subscriptions instead of access levels, allowing for more flexible event handling. - Refactored the PluginRouter to dispatch events based on subscriptions, improving the event-driven architecture. - Enhanced Docker configurations to support development and testing environments with appropriate dependencies. - Added comprehensive integration tests to verify the functionality of the event dispatch system and plugin interactions. - Updated documentation and test configurations to reflect the new event-based plugin structure. * Enhance Docker configurations and startup script for test mode - Updated `docker-compose-test.yml` to include a test command for services, enabling a dedicated test mode. - Modified `start.sh` to support a `--test` flag, allowing the FastAPI backend to run with test-specific configurations. - Adjusted worker commands to utilize the `--group test` option in test mode for improved orchestration and management. * Refactor test scripts for improved reliability and clarity - Updated `run-robot-tests.sh` to enhance the verification of the Deepgram batch worker process, ensuring non-numeric characters are removed from the check. - Modified `plugin_tests.robot` to use a more explicit method for checking the length of subscriptions and added a skip condition for unavailable audio files. - Adjusted `plugin_event_tests.robot` to load the test audio file from a variable, improving test data management. - Refactored `plugin_keywords.robot` to utilize clearer length checks for subscriptions and event parts, enhancing readability and maintainability. 
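The wake word normalization described above (punctuation replaced with spaces to preserve word boundaries, plus a pattern that tolerates trailing punctuation and whitespace) could look roughly like this; the function names and the wake phrase are illustrative, not the backend's actual code:

```python
import re

def normalize_text(text: str) -> str:
    """Replace punctuation with spaces (rather than deleting it) so word
    boundaries survive normalization, then collapse runs of whitespace."""
    spaced = re.sub(r"[^\w\s]", " ", text.lower())
    return re.sub(r"\s+", " ", spaced).strip()

def build_wake_pattern(wake_phrase: str) -> re.Pattern:
    """Build a regex whose parts are joined by whitespace and which allows
    optional trailing whitespace after the last part of the phrase."""
    parts = [re.escape(p) for p in wake_phrase.lower().split()]
    body = r"\s+".join(parts)
    return re.compile(rf"\b{body}\s*")

normalized = normalize_text("Hey, Chronicle! what's the weather")
match = build_wake_pattern("hey chronicle").search(normalized)
```

Replacing punctuation with spaces (instead of stripping it) is what keeps "Hey,Chronicle" from fusing into a single token during matching.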
* remove mistral deadcode; notebooks untouched

* Refactor audio streaming endpoints and improve documentation

- Updated WebSocket endpoints to use a unified format with codec parameters (`/ws?codec=pcm` and `/ws?codec=opus`) for audio streaming, replacing the previous `/ws_pcm` and `/ws_omi` endpoints.
- Enhanced documentation to reflect the new endpoint structure and clarify audio processing capabilities.
- Removed deprecated audio cropping functionality and related configurations to streamline the audio processing workflow.
- Updated various components and scripts to align with the new endpoint structure, ensuring consistent usage across the application.

* Enhance testing infrastructure and API routes for plugin events

- Updated `docker-compose-test.yml` to introduce low speech detection thresholds for testing, improving the accuracy of speech detection during tests.
- Added new test-only API routes in `test_routes.py` for clearing and retrieving plugin events, ensuring a clean state between tests.
- Refactored existing test scripts to utilize the new API endpoints for event management, enhancing test reliability and clarity.
- Improved logging and error handling in various components to facilitate debugging during test execution.
- Adjusted environment variable handling in test setup scripts to streamline configuration and improve flexibility.

* Add audio pipeline architecture documentation and improve audio persistence worker configuration

- Introduced a comprehensive documentation file detailing the audio pipeline architecture, covering data flow, processing stages, and key components.
- Enhanced the audio persistence worker setup by implementing multiple concurrent workers to improve audio processing efficiency.
- Adjusted sleep intervals in the audio streaming persistence job for better responsiveness and event loop yielding.
- Updated test script to run the full suite of integration tests from the specified directory, ensuring thorough testing coverage.
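The unified `/ws?codec=...` endpoint above replaces two separate routes with one route that dispatches on a codec query parameter. A minimal sketch of that dispatch, with names and structure as assumptions rather than the backend's actual routing code:

```python
def decode_opus_chunk(chunk: bytes) -> bytes:
    """Placeholder for a real Opus decoder (which would need an Opus
    library); raising keeps the sketch honest about what it omits."""
    raise NotImplementedError("Opus decoding requires an Opus binding")

def select_decoder(codec: str):
    """Map the ?codec= query parameter of the unified /ws endpoint to a
    per-chunk decoder callable."""
    decoders = {
        "pcm": lambda chunk: chunk,   # raw PCM passes straight through
        "opus": decode_opus_chunk,
    }
    if codec not in decoders:
        raise ValueError(f"unsupported codec: {codec!r} (expected 'pcm' or 'opus')")
    return decoders[codec]

pcm_decoder = select_decoder("pcm")
```

Dispatching on a query parameter keeps a single connection handler for both formats, so later additions (another codec) become a dictionary entry rather than a new route.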
* Add test container setup and teardown scripts

- Introduced `setup-test-containers.sh` for streamlined startup of test containers, including health checks and environment variable loading.
- Added `teardown-test-containers.sh` for simplified container shutdown, with options to remove volumes.
- Enhanced user feedback with color-coded messages for better visibility during test setup and teardown processes.

* Update worker count validation and websocket disconnect tests

- Adjusted worker count expectations in the Worker Count Validation Test to reflect an increase from 7 to 9 workers, accounting for additional audio persistence workers.
- Enhanced the WebSocket Disconnect Conversation End Reason Test by adding steps to maintain audio streaming during disconnection, ensuring accurate simulation of network dropout scenarios.
- Improved comments for clarity and added critical notes regarding inactivity timeout handling.

* Refactor audio storage to MongoDB chunks and enhance cleanup settings management

- Replaced the legacy AudioFile model with AudioChunkDocument for storing audio data in MongoDB, optimizing storage and retrieval.
- Introduced CleanupSettings dataclass for managing soft-deletion configurations, including auto-cleanup and retention days.
- Added admin API routes for retrieving and saving cleanup settings, ensuring better control over data retention policies.
- Updated audio processing workflows to utilize MongoDB chunks, removing dependencies on disk-based audio files.
- Enhanced tests to validate the new audio chunk storage and cleanup functionalities, ensuring robust integration with existing systems.

* Refactor audio processing to utilize MongoDB chunks and enhance job handling

- Removed audio file path parameters from various functions, transitioning to audio data retrieval from MongoDB chunks.
- Updated the `start_post_conversation_jobs` function to reflect changes in audio handling, ensuring jobs reconstruct audio from database chunks.
- Enhanced the `transcribe_full_audio_job` and `recognise_speakers_job` to process audio directly from memory, eliminating the need for temporary files.
- Improved error handling and logging for audio data retrieval, ensuring better feedback during processing.
- Added a new utility function for converting PCM data to WAV format in memory, streamlining audio format handling.

* Refactor speaker recognition client to use in-memory audio data

- Updated methods to accept audio data as bytes instead of file paths, enhancing performance by eliminating disk I/O.
- Improved logging to reflect in-memory audio processing, providing better insights during speaker identification and diarization.
- Streamlined audio data handling in the `diarize_identify_match` and `diarize_and_identify` methods, ensuring consistency across the client.
- Removed temporary file handling, simplifying the audio processing workflow and reducing potential file system errors.

* Add mock providers and update testing workflows for API-independent execution

- Introduced `MockLLMProvider` and `MockTranscriptionProvider` to facilitate testing without external API dependencies, allowing for consistent and controlled test environments.
- Created `run-no-api-tests.sh` script to execute tests that do not require API keys, ensuring separation of API-dependent and independent tests.
- Updated Robot Framework test configurations to utilize mock services, enhancing test reliability and reducing external dependencies.
- Modified existing test workflows to include new configurations and ensure proper handling of results for tests excluding API keys.
- Added `mock-services.yml` configuration to disable external API services while maintaining core functionality for testing purposes.
- Enhanced documentation to reflect the new tagging system for tests requiring API keys, improving clarity on test execution requirements.
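The in-memory PCM-to-WAV conversion mentioned above can be done entirely with the standard library, avoiding temporary files. A minimal sketch; the default sample rate, channel count, and sample width are assumptions, not the backend's actual constants:

```python
import io
import wave

def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 16000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw 16-bit PCM in a WAV container entirely in memory by
    writing the RIFF header and frames into a BytesIO buffer."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)   # bytes per sample (2 = 16-bit)
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)
    return buf.getvalue()

wav = pcm_to_wav_bytes(b"\x00\x00" * 16000)  # one second of silence
```

Because `wave.open` accepts any file-like object, the same code path serves both disk output and in-memory use; jobs can hand the resulting bytes straight to a transcription or speaker service without touching the filesystem.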
* Enhance testing documentation and workflows for API key separation

- Updated CLAUDE.md to clarify test execution modes, emphasizing the separation of tests requiring API keys from those that do not.
- Expanded the testing guidelines in TESTING_GUIDELINES.md to detail the organization of tests based on API dependencies, including tagging conventions and execution paths.
- Improved mock-services.yml to include dummy configurations for LLM and embedding services, ensuring tests can run without actual API calls.
- Added comprehensive documentation on GitHub workflows for different test scenarios, enhancing clarity for contributors and maintainers.

* Update test configurations and documentation for API key management

- Modified `plugins.yml.template` to implement event subscriptions for the Home Assistant plugin, enhancing its event-driven capabilities.
- Revised `README.md` to clarify test execution processes, emphasizing the distinction between tests requiring API keys and those that do not.
- Updated `mock-services.yml` to streamline mock configurations, ensuring compatibility with the new testing workflows.
- Added `requires-api-keys` tags to relevant test cases across various test files, improving organization and clarity regarding API dependencies.
- Enhanced documentation for test scripts and configurations, providing clearer guidance for contributors on executing tests based on API key requirements.

* Add optional service profile to Docker Compose test configuration

* Refactor audio processing and job handling for transcription workflows

- Updated `upload_and_process_audio_files` and `start_post_conversation_jobs` to enqueue transcription jobs separately for file uploads, ensuring accurate processing order.
- Enhanced logging to provide clearer insights into job enqueuing and processing stages.
- Removed batch transcription from the post-conversation job chain for streaming audio, utilizing the streaming transcript directly.
- Introduced word-level timestamps in the `Conversation` model to improve transcript detail and accuracy.
- Updated tests to reflect changes in job handling and ensure proper verification of post-conversation processing.

* Remove unnecessary network aliases from speaker service in Docker Compose configuration

* Add network aliases for speaker service in Docker Compose configuration

* Refactor Conversation model to use string for provider field

- Updated the `Conversation` model to replace the `TranscriptProvider` enum with a string type for the `provider` field, allowing for greater flexibility in provider names.
- Adjusted related job functions to accommodate this change, simplifying provider handling in the transcription workflow.

* Enhance configuration and model handling for waveform data

- Updated Docker Compose files to mount the entire config directory, allowing for better management of configuration files.
- Introduced a new `WaveformData` model to store pre-computed waveform visualization data, improving UI performance by enabling waveform display without real-time decoding.
- Enhanced the `app_factory` and `job` models to include the new `WaveformData` model, ensuring proper initialization and data handling.
- Implemented waveform generation logic in a new worker module, allowing for on-demand waveform creation from audio chunks.
- Added API endpoints for retrieving and generating waveform data, improving the overall audio processing capabilities.
- Updated tests to cover new functionality and ensure robustness in waveform data handling.

* Add SDK testing scripts for authentication, conversation retrieval, and audio upload

- Introduced three new test scripts: `sdk_test_auth.py`, `sdk_test_conversations.py`, and `sdk_test_upload.py`.
- Each script tests different functionalities of the SDK, including authentication, conversation retrieval, and audio file uploads.
- The scripts utilize the `ChronicleClient` to perform operations and print results for verification.
- Enhanced testing capabilities for the SDK, ensuring robust validation of core features.

* Enhance audio processing and conversation handling for large files

- Added configuration options for speaker recognition chunking in `.env.template`, allowing for better management of large audio files.
- Updated `get_conversations` function to include an `include_deleted` parameter for filtering conversations based on their deletion status.
- Enhanced `finalize_session` method in `AudioStreamProducer` to send an end marker to Redis, ensuring proper session closure.
- Introduced `reconstruct_audio_segments` function to yield audio segments with overlap for efficient processing of lengthy conversations.
- Implemented merging of overlapping speaker segments to improve accuracy in speaker recognition.
- Added integration tests for WebSocket streaming transcription to validate the end_marker functionality and overall transcription flow.

* archive

* Implement annotation system and enhance audio processing capabilities

- Introduced a new annotation model to support user edits and AI-powered suggestions for memories and transcripts.
- Added annotation routes for CRUD operations, enabling the creation and management of annotations via the API.
- Enhanced the audio processing workflow to support fetching audio segments from the backend, improving speaker recognition accuracy.
- Updated the speaker recognition client to handle conversation-based audio fetching, allowing for better management of large audio files.
- Implemented a cron job for generating AI suggestions on potential errors in transcripts and memories, improving user experience and content accuracy.
- Enhanced the web UI to support inline editing of transcript segments and memory content, providing a more interactive user experience.
- Updated configuration files to support new features and improve overall system flexibility.

* Implement OmegaConf-based configuration management for backend settings

- Introduced a new configuration loader using OmegaConf for unified management of backend settings.
- Updated existing configuration functions to leverage the new loader, enhancing flexibility and maintainability.
- Added support for environment variable interpolation in configuration files.
- Refactored various components to retrieve settings from the new configuration system, improving consistency across the application.
- Updated requirements to include OmegaConf as a dependency.
- Enhanced documentation and comments for clarity on configuration management.

* Refactor .env.template and remove unused diarization configuration

- Updated the .env.template to clarify its purpose for secret values and streamline setup instructions.
- Removed the deprecated diarization_config.json.template file, as it is no longer needed.
- Added new environment variables for Langfuse and Tailscale integration to enhance observability and remote service access.

* Implement legacy environment variable syntax support in configuration loader

- Added custom OmegaConf resolvers to handle legacy ${VAR:-default} syntax for backward compatibility.
- Introduced a preprocessing function to convert legacy syntax in YAML files to OmegaConf-compatible format.
- Updated the load_config function to utilize the new preprocessing for loading defaults and user configurations.
- Enhanced documentation for clarity on the new legacy syntax handling.

* Add plugins configuration path retrieval and refactor usage

- Introduced a new function `get_plugins_yml_path` to centralize the retrieval of the plugins.yml file path.
- Updated `system_controller.py` and `plugin_service.py` to use the new function for improved maintainability and consistency in accessing the plugins configuration.
- Enhanced code clarity by removing hardcoded paths and utilizing the centralized configuration method.

* Unify plugin terminology and fix memory job dependencies

Plugin terminology: subscriptions→events, trigger→condition
Memory jobs: no longer blocked by disabled speaker recognition

* Update Docker Compose configuration and enhance system routes

- Updated Docker Compose files to mount the entire config directory, consolidating configuration management.
- Refactored the `save_diarization_settings` function to improve clarity and maintainability by renaming it to `save_diarization_settings_controller`.
- Enhanced the System component in the web UI to include configuration diagnostics, providing better visibility into system health and issues.

* circular import

* Refactor testing infrastructure and enhance container management

- Updated the testing documentation to reflect a new Makefile-based approach for running tests and managing containers.
- Introduced new scripts for container management, including starting, stopping, restarting, and cleaning containers while preserving logs.
- Added a cleanup script to handle data ownership and permissions correctly.
- Implemented a logging system that saves container logs automatically before cleanup.
- Enhanced the README with detailed instructions for running tests and managing the test environment.

* Add Email Summarizer Plugin and SMTP Email Service

- Introduced the Email Summarizer Plugin that automatically sends email summaries upon conversation completion.
- Implemented SMTP Email Service for sending emails, supporting HTML and plain text formats with TLS/SSL encryption.
- Added configuration options for SMTP settings in the .env.template and plugins.yml.template.
- Created comprehensive documentation for plugin development and usage, including a new plugin generation script.
- Enhanced testing coverage for the Email Summarizer Plugin and SMTP Email Service to ensure reliability and functionality.
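The legacy-syntax preprocessing mentioned a few commits above rewrites shell-style `${VAR:-default}` references into OmegaConf's `${oc.env:VAR,default}` resolver form before the YAML is parsed. A minimal sketch of that conversion step, assuming a simple regex pass (the real loader may handle more edge cases such as nested braces):

```python
import re

# Matches ${VAR:-default}; group 1 is the variable name, group 2 the default.
LEGACY = re.compile(r"\$\{(\w+):-([^}]*)\}")

def convert_legacy_env_syntax(text: str) -> str:
    """Rewrite legacy ${VAR:-default} references to OmegaConf's
    ${oc.env:VAR,default} resolver syntax for backward compatibility."""
    return LEGACY.sub(lambda m: f"${{oc.env:{m.group(1)},{m.group(2)}}}", text)

yaml_in = "mongodb_uri: ${MONGODB_URI:-mongodb://localhost:27017}"
yaml_out = convert_legacy_env_syntax(yaml_in)
```

Doing the rewrite as a text preprocessing step means existing config files keep working unchanged while OmegaConf's own resolver handles the actual environment lookup at load time.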
* Refactor plugin management and introduce Email Summarizer setup

- Removed the static PLUGINS dictionary and replaced it with a dynamic discovery mechanism for plugins.
- Implemented a new setup process for plugins, allowing for configuration via individual setup scripts.
- Added the Email Summarizer plugin with a dedicated setup script for SMTP configuration.
- Enhanced the main setup flow to support community plugins and their configuration.
- Cleaned up unused functions related to plugin configuration and streamlined the overall plugin setup process.

* Enhance plugin configuration and documentation

- Updated the .env.template to include new configuration options for the Home Assistant and Email Summarizer plugins, including server URLs, tokens, and additional settings.
- Refactored Docker Compose files to correctly mount plugin configuration paths.
- Introduced comprehensive documentation for plugin configuration architecture, detailing the separation of concerns for orchestration, settings, and secrets.
- Added individual configuration files for the Home Assistant and Email Summarizer plugins, ensuring proper management of non-secret settings and environment variable references.
- Improved the plugin loading process to merge configurations from multiple sources, enhancing flexibility and maintainability.

* Refactor plugin setup process to allow interactive user input

- Updated the plugin setup script to run interactively, enabling plugins to prompt for user input during configuration.
- Removed output capturing to facilitate real-time interaction and improved error messaging to include exit codes for better debugging.

* Add shared setup utilities for interactive configuration

- Introduced `setup_utils.py` containing functions for reading environment variables, prompting user input, and masking sensitive values.
- Refactored existing code in `wizard.py` and `init.py` to utilize these shared utilities, improving code reuse and maintainability.
- Updated documentation to include usage examples for the new utilities in plugin setup scripts, enhancing developer experience and clarity.

* Enhance plugin security architecture and configuration management

- Introduced a three-file separation for plugin configuration to improve security:
  - `backends/advanced/.env` for secrets (gitignored)
  - `config/plugins.yml` for orchestration with environment variable references
  - `plugins/{plugin_id}/config.yml` for non-secret defaults
- Updated documentation to emphasize the importance of using `${ENV_VAR}` syntax for sensitive data and provided examples of correct usage.
- Enhanced the Email Summarizer plugin setup process to automatically update `config/plugins.yml` with environment variable references, ensuring secrets are not hardcoded.
- Added new fields to the User model for notification email management and improved error logging in user-related functions.
- Refactored audio chunk utilities to use a consistent method for fetching conversation metadata.

* Refactor backend components for improved functionality and stability

- Added a new parameter `transcript_version_id` to the `open_conversation_job` function to support streaming transcript versioning.
- Enhanced error handling in `check_enrolled_speakers_job` and `recognise_speakers_job` to allow conversations to proceed even when the speaker service is unavailable, improving resilience.
- Updated `send_to_adv.py` to support dynamic WebSocket and HTTP protocols based on environment settings, enhancing configuration flexibility.
- Introduced a background task in `send_to_adv.py` to handle incoming messages from the backend, ensuring connection stability and logging interim results.

* Refactor plugin setup timing to enhance configuration flow

* Refactor save_diarization_settings_controller to improve validation and error handling

- Updated the controller to filter out invalid settings instead of raising an error for each unknown key, allowing for more flexible input.
- Added a check to reject requests with no valid settings provided, enhancing robustness.
- Adjusted logging to reflect the filtered settings being saved.

* Refactor audio processing and conversation management for improved deduplication and tracking

* Refactor audio and email handling for improved functionality and security

- Updated `mask_value` function to handle whitespace more effectively.
- Enhanced `create_plugin` to remove existing directories when using the `--force` option.
- Changed logging level from error to debug for existing admin user checks.
- Improved client ID generation logging for clarity.
- Removed unused fields from conversation creation.
- Added HTML escaping in email templates to prevent XSS attacks.
- Updated audio file download function to include user ID for better tracking.
- Adjusted WebSocket connection settings to respect SSL verification based on environment variables.

* Refactor audio upload functionality to remove unused parameters

- Removed `auto_generate_client` and `folder` parameters from audio upload functions to streamline the API.
- Updated related function calls and documentation to reflect these changes, enhancing clarity and reducing complexity.

* Refactor Email Summarizer plugin configuration for improved clarity and security

- Removed outdated migration instructions from `plugin-configuration.md` to streamline documentation.
- Enhanced `README.md` to clearly outline the three-file separation for plugin configuration, emphasizing the roles of `.env`, `config.yml`, and `plugins.yml`.
- Updated `setup.py` to reflect changes in orchestration settings, ensuring only relevant configurations are included in `config/plugins.yml`.
- Improved security messaging to highlight the importance of not committing secrets to version control.

* Update API key configuration in config.yml.template to use environment variable syntax for improved flexibility and security.

This change standardizes the way API keys are referenced across different models and services. (#273)

Co-authored-by: roshan.john

* Refactor Redis job queue cleanup process for improved success tracking

- Replaced total job count with separate counters for successful and failed jobs during Redis queue cleanup.
- Enhanced logging to provide detailed feedback on the number of jobs cleared and any failures encountered.
- Improved error handling to ensure job counts are accurately reflected even when exceptions occur.

* fix tests

* Update CI workflows to use 'docker compose' for log retrieval and added container status check

- Replaced 'docker logs' commands with 'docker compose -f docker-compose-test.yml logs' for consistency across workflows.
- Added a check for running containers before saving logs to enhance debugging capabilities.

* test fixes

* FIX StreamingTranscriptionConsumer to support cumulative audio timestamp adjustments

- Added `audio_offset_seconds` to track cumulative audio duration for accurate timestamp adjustments across transcription sessions.
- Updated `store_final_result` method to adjust word and segment timestamps based on cumulative audio offset.
- Improved logging to reflect changes in audio offset after storing results.
- Modified Makefile and documentation to clarify test execution options, including new tags for slow and SDK tests, enhancing test organization and execution clarity.

* Enhance test container setup and improve error messages in integration tests

- Set `COMPOSE_PROJECT_NAME` for test containers to ensure consistent naming.
- Consolidated error messages in the `websocket_transcription_e2e_test.robot` file for clarity, improving readability and debugging.

* Improve WebSocket closing logic and enhance integration test teardown

- Added timeout handling for WebSocket closure in `AudioStreamClient` to prevent hanging and ensure clean disconnection.
- Updated integration tests to log the total chunks sent when closing audio streams, improving clarity on resource management during test teardown.

* Refactor job status handling to align with RQ standards

- Updated job status checks across various modules to use "started" and "finished" instead of "processing" and "completed" for consistency with RQ's naming conventions.
- Adjusted related logging and response messages to reflect the new status terminology.
- Simplified Docker Compose project name handling in test scripts to avoid conflicts and improve clarity in test environment setup.

* Update test configurations and improve audio inactivity handling

- Increased `SPEECH_INACTIVITY_THRESHOLD_SECONDS` to 20 seconds in `docker-compose-test.yml` for better audio duration handling during tests.
- Refactored session handling in `session_controller.py` to clarify client ID usage.
- Updated `conversation_utils.py` to track speech activity using audio timestamps, enhancing accuracy in inactivity detection.
- Simplified test scripts by removing unnecessary `COMPOSE_PROJECT_NAME` references, aligning with the new project naming convention.
- Adjusted integration tests to reflect changes in inactivity timeout and ensure proper handling of audio timestamps.

* Refactor audio processing and enhance error handling

- Updated `worker_orchestrator.py` to use `logger.exception` for improved error logging.
- Changed default MongoDB database name from "friend-lite" to "chronicle" in multiple files for consistency.
- Added a new method `close_stream_without_stop` in `audio_stream_client.py` to handle abrupt WebSocket disconnections.
- Enhanced audio validation in `audio_utils.py` to support automatic resampling of audio data if sample rates do not match.
- Improved logging in various modules to provide clearer insights during audio processing and event dispatching.
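The cumulative `audio_offset_seconds` adjustment described above shifts every word and segment timestamp by the total duration of audio already transcribed, so timestamps stay monotonic across reconnects. A minimal sketch; the field names are illustrative, not the consumer's actual schema:

```python
def adjust_timestamps(words: list[dict], audio_offset_seconds: float) -> list[dict]:
    """Shift per-word timestamps by the cumulative duration of audio
    processed in earlier transcription sessions."""
    return [
        {**w,
         "start": w["start"] + audio_offset_seconds,
         "end": w["end"] + audio_offset_seconds}
        for w in words
    ]

# Second session: 30.5 s of audio was already transcribed before reconnect,
# so a word the provider reports at 0.2 s really occurred at 30.7 s.
shifted = adjust_timestamps([{"word": "hello", "start": 0.2, "end": 0.6}], 30.5)
```

Without this offset, each reconnect would restart provider timestamps at zero and later words would appear to precede earlier ones in the stored transcript.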
* Enhance Docker command handling and configuration management

- Updated `run_compose_command` to support separate build commands for services, including profile management for backend and speaker-recognition services.
- Improved error handling and output streaming during Docker command execution.
- Added `ensure_docker_network` function to verify and create the required Docker network before starting services.
- Refactored configuration files to utilize `oc.env` for environment variable management, ensuring better compatibility and flexibility across different environments.

* Enhance configuration loading to support custom config file paths

- Added support for the CONFIG_FILE environment variable to allow specifying custom configuration files for testing.
- Implemented logic to handle both absolute paths and relative filenames for the configuration file, improving flexibility in configuration management.

* Update test scripts to use TEST_CONFIG_FILE for configuration management

- Replaced CONFIG_FILE with TEST_CONFIG_FILE in both run-no-api-tests.sh and run-robot-tests.sh to standardize configuration file usage.
- Updated paths to point to mock and deepgram-openai configuration files inside the container, improving clarity and consistency in test setups.

* Refactor audio upload response handling and improve error reporting

- Updated `upload_and_process_audio_files` to return appropriate HTTP status codes based on upload results: 400 for all failures, 207 for partial successes, and 200 for complete success.
- Enhanced error messages in the audio upload tests to provide clearer feedback on upload failures, including specific error details for better debugging.
- Adjusted test scripts to ensure consistent handling of conversation IDs in job metadata, improving validation checks for job creation.
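The status-code rule for multi-file uploads described above (400 when everything failed, 207 Multi-Status for partial success, 200 when everything succeeded) reduces to a small pure function. A sketch mirroring that description; the function name is illustrative:

```python
def upload_status_code(succeeded: int, failed: int) -> int:
    """Pick the HTTP status for a multi-file upload result:
    400 when every file failed, 207 (Multi-Status) when results are
    mixed, and 200 when all files were accepted."""
    if failed and not succeeded:
        return 400
    if failed and succeeded:
        return 207
    return 200
```

Using 207 for the mixed case lets clients distinguish "retry everything" (400) from "retry only the files listed as failed" without parsing the body first.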
* Refactor audio processing and job handling to improve transcription management

- Updated `upload_and_process_audio_files` to check for transcription provider availability before enqueueing jobs, enhancing error handling and logging.
- Modified `start_post_conversation_jobs` to conditionally enqueue memory extraction jobs based on configuration, improving flexibility in job management.
- Enhanced event dispatch job dependencies to only include jobs that were actually enqueued, ensuring accurate job tracking.
- Added `is_transcription_available` function to check transcription provider status, improving modularity and clarity in the transcription workflow.

* Enhance integration tests for plugin events and improve error handling

- Updated integration tests to filter plugin events by conversation ID, ensuring accurate event tracking and reducing noise from fixture events.
- Improved error messages in event verification to include conversation ID context, enhancing clarity during test failures.
- Refactored audio upload handling to check for transcription job creation, allowing for more robust conversation polling and error reporting.
- Added new keyword to verify conversation end reasons, improving test coverage for conversation state validation.

* Enhance speaker recognition testing and audio processing

- Added mock speaker recognition client to facilitate testing without resource-intensive dependencies.
- Updated Docker Compose configurations to include mock speaker client for test environments.
- Refactored audio segment reconstruction to ensure precise clipping based on time boundaries.
- Improved error handling in transcription jobs and speaker recognition workflows to enhance robustness.
- Adjusted integration tests to utilize real-time pacing for audio chunk streaming, improving test accuracy.

* Refactor audio chunk retrieval and enhance logging in audio processing

- Introduced logging for audio chunk requests to improve traceability.
- Replaced manual audio chunk processing with a dedicated `reconstruct_audio_segment` function for better clarity and efficiency.
- Improved error handling during audio reconstruction to provide more informative responses in case of failures.
- Cleaned up imports and removed redundant code related to audio chunk calculations.

* Refactor mock speaker recognition client and improve testing structure

- Replaced direct import of mock client with a structured import from the new testing module.
- Introduced a dedicated `mock_speaker_client.py` to provide a mock implementation for speaker recognition, facilitating testing without heavy dependencies.
- Added an `__init__.py` file in the testing directory to organize testing utilities and mocks.

* Enhance conversation model to include word-level timestamps and improve transcript handling

- Added a new `words` field to the `Conversation` model for storing word-level timestamps.
- Updated methods to handle word data during transcript version creation, ensuring compatibility with speaker recognition.
- Refactored conversation job processing to utilize the new word structure, improving data integrity and access.
- Enhanced speaker recognition job to read words from the new standardized location, ensuring backward compatibility with legacy data.

* Implement speaker reprocessing feature and enhance timeout calculation

- Added a new endpoint to reprocess speaker identification for existing transcripts, creating a new version with re-identified speakers.
- Introduced a method to calculate proportional timeouts based on audio duration, improving handling of varying audio lengths.
- Updated the speaker recognition client to utilize calculated timeouts during service calls, enhancing responsiveness.
- Refactored conversation and memory controllers to support the new speaker reprocessing functionality, ensuring user access control and job chaining for memory updates.
- Removed unfiltered memory retrieval endpoint to streamline memory management and focus on user-specific data access.

* Enhance fine-tuning functionality and improve speaker recognition integration

- Introduced new fine-tuning routes for processing annotations, allowing for training of the speaker recognition model based on user corrections.
- Added a `get_speaker_by_name` method to the `SpeakerRecognitionClient` for looking up enrolled speakers by name.
- Updated the `Annotation` model to support diarization annotations, including new fields for original and corrected speaker labels.
- Enhanced the API router to include the new fine-tuning routes, improving modularity and organization of the backend services.
- Implemented a speaker name dropdown component in the web UI for selecting and managing speakers during annotation processes.

* Add plugin UI configuration panel and refactor plugin management

This commit introduces a comprehensive plugin configuration UI with the following enhancements:

- Add PluginSettingsForm component for plugin configuration
- Create modular plugin configuration components:
  * EnvVarsSection - manage plugin environment variables
  * FormField - reusable form field component
  * OrchestrationSection - configure plugin orchestration settings
  * PluginConfigPanel - main plugin configuration panel
  * PluginListSidebar - plugin list and navigation
- Update plugin service to support new configuration endpoints
- Enhance system controller and routes for plugin management
- Update Plugins page with new UI components
- Enhance API service with plugin configuration methods

* fix

* Enhance audio processing and conversation management with always_persist feature

- Updated Docker Compose configuration to include mock streaming services for testing.
- Introduced `always_persist` flag in audio stream and conversation management, ensuring audio is saved even if transcription fails.
- Enhanced conversation model to track processing status and persist audio data, improving reliability in audio handling.
- Added integration tests to verify the functionality of the always_persist feature, ensuring audio is correctly stored in various scenarios.
- Improved logging for audio processing and conversation state transitions to facilitate debugging and monitoring.

* Add mock transcription failure configuration for testing

- Introduced a new YAML configuration file to simulate transcription failures using invalid API keys for Deepgram services.
- Configured both standard and streaming speech-to-text models with invalid credentials to facilitate testing of error handling in audio processing.
- Enhanced the testing framework by providing mock models for LLM and embeddings, ensuring comprehensive coverage of failure scenarios.

* Improve logging for transcription job failures and session handling

- Updated logging levels for transcription errors to use error severity, providing clearer insights into issues.
- Distinguished between transcription service failures and legitimate no-speech scenarios in session termination logs.
- Enhanced session failure messages to guide users in checking transcription service configurations.

* Implement miscellaneous settings management and enhance audio processing

- Introduced functions to retrieve and save miscellaneous settings, including `always_persist_enabled` and `use_provider_segments`, using OmegaConf.
- Updated the system controller and routes to handle new endpoints for managing miscellaneous settings, ensuring admin access control.
- Refactored audio processing jobs to read the `always_persist_enabled` setting from global configuration, improving audio persistence behavior.
- Enhanced the web UI to allow administrators to view and modify miscellaneous settings, providing better control over audio processing features.
- Added integration tests to verify the functionality of the new settings management, ensuring robust handling of audio persistence scenarios. * Enhance test framework and conversation handling for audio persistence - Updated the Makefile to introduce new test commands for running tests with and without API keys, improving CI integration. - Refactored integration tests to replace static sleep calls with polling mechanisms for conversation creation, enhancing reliability and reducing flakiness. - Added a new keyword to wait for conversations by client ID, streamlining test logic and improving readability. - Updated documentation in the Makefile to reflect changes in test commands and configurations. * Implement OpenMemory user registration and enhance MCP client functionality - Added an asynchronous function to initialize and register an OpenMemory user if the OpenMemory MCP provider is configured, improving user management. - Enhanced the MCPClient to accept custom metadata when adding memories, allowing for better tracking and filtering of memories by user. - Updated the OpenMemoryMCPService to utilize the configured OpenMemory user for memory operations, ensuring accurate user context in memory processing. - Modified integration tests to use shorter device names for consistency and to avoid truncation issues, improving test reliability. * Add Dockerfiles for mock LLM and streaming STT servers - Created Dockerfile for a mock LLM server, including dependencies and configuration for running the server on port 11435. - Created Dockerfile for a mock streaming STT server, including dependencies and configuration for running the server on port 9999. - Both Dockerfiles streamline the setup process for testing related functionalities. 
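The miscellaneous-settings handling above is described as using OmegaConf; the sketch below uses plain dicts as a stand-in so the merge semantics are visible. Only the two key names (`always_persist_enabled`, `use_provider_segments`) come from the commit text; everything else is illustrative:

```python
# Defaults for the two settings named above. The real code loads/saves these
# with OmegaConf; this plain-dict merge is only a sketch of the behavior.
DEFAULT_MISC = {"always_persist_enabled": False, "use_provider_segments": True}

def load_misc_settings(saved):
    """Overlay saved overrides onto defaults, ignoring unknown keys."""
    settings = dict(DEFAULT_MISC)
    for key, value in (saved or {}).items():
        if key in settings:
            settings[key] = bool(value)
    return settings
```

With this shape, audio-processing jobs can read `always_persist_enabled` from one place rather than threading the flag through call sites.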
--------- Co-authored-by: Roshan John <63011948+roshatron2@users.noreply.github.com> Co-authored-by: roshan.john * Refactor Home Assistant plugin and MCP client for improved configuration and error handling (#280) - Updated HomeAssistantPlugin to standardize string usage for configuration parameters. - Enhanced MCPClient to improve error handling and logging during memory operations. - Refactored OpenMemoryMCPService to streamline memory entry conversion and improve metadata handling. - Improved transcription job handling in transcription_jobs.py for better error reporting and session management. - Updated mock-services.yml to change model_url for testing compatibility with Docker environments. * Feat/global record (#281) * Refactor Home Assistant plugin and MCP client for improved configuration and error handling - Updated HomeAssistantPlugin to standardize string usage for configuration parameters. - Enhanced MCPClient to improve error handling and logging during memory operations. - Refactored OpenMemoryMCPService to streamline memory entry conversion and improve metadata handling. - Improved transcription job handling in transcription_jobs.py for better error reporting and session management. - Updated mock-services.yml to change model_url for testing compatibility with Docker environments. * Add Recording Context and UI Enhancements - Introduced a new RecordingContext to manage audio recording state and functionality, including start/stop actions and duration tracking. - Updated various components to utilize the new RecordingContext, replacing previous audio recording hooks for improved consistency. - Added a GlobalRecordingIndicator component to display recording status across the application. - Enhanced the Layout component to include the GlobalRecordingIndicator for better user feedback during audio recording sessions. - Refactored audio-related components to accept the new RecordingContext type, ensuring type safety and clarity in props. 
- Implemented configuration options for managing provider segments in transcription, allowing for more flexible audio processing based on user settings. - Added raw segments JSON display in the Conversations page for better debugging and data visibility. * Enhance StreamingTranscriptionConsumer and conversation job handling (#282) - Removed cumulative audio offset tracking from StreamingTranscriptionConsumer as Deepgram provides cumulative timestamps directly. - Updated store_final_result method to utilize Deepgram's cumulative timestamps without adjustments. - Implemented completion signaling for transcription sessions in Redis, ensuring conversation jobs wait for all results before processing. - Improved error handling to signal completion even in case of errors, preventing conversation jobs from hanging. - Enhanced logging for better visibility of transcription completion and error states. * fix: config template updated for streaming service as deepgram (#285) * UPDATE: config template updated for streaming service as deepgram * UPDATE: script updated for windows machine * Feat: vibevoice asr (#286) * Enhance StreamingTranscriptionConsumer and conversation job handling - Removed cumulative audio offset tracking from StreamingTranscriptionConsumer as Deepgram provides cumulative timestamps directly. - Updated store_final_result method to utilize Deepgram's cumulative timestamps without adjustments. - Implemented completion signaling for transcription sessions in Redis, ensuring conversation jobs wait for all results before processing. - Improved error handling to signal completion even in case of errors, preventing conversation jobs from hanging. - Enhanced logging for better visibility of transcription completion and error states. * Enhance ASR services configuration and provider management - Updated `config.yml.template` to include capabilities for ASR providers, detailing features like word timestamps and speaker segments. 
- Added a new `vibevoice` provider configuration for Microsoft VibeVoice ASR, supporting speaker diarization. - Enhanced `.env.template` with clearer provider selection and model configuration options, including CUDA settings and voice activity detection. - Improved `docker-compose.yml` to support multiple ASR providers with detailed service configurations. - Introduced common utilities for audio processing and ASR service management in the `common` module, enhancing code reusability and maintainability. - Updated `README.md` to reflect the new provider-based architecture and usage instructions for starting different ASR services. * Enhance transcription provider support and capabilities management - Added support for the new `vibevoice` transcription provider, including configuration options for built-in speaker diarization. - Updated `ChronicleSetup` to include `vibevoice` in the transcription provider selection and adjusted related descriptions. - Enhanced the `ModelDef` and `Conversation` models to reflect the addition of `vibevoice` in provider options. - Introduced a new capabilities management system to validate provider features, allowing conditional execution of tasks based on provider capabilities. - Improved logging and user feedback in transcription and speaker recognition jobs to reflect the capabilities of the selected provider. - Updated documentation to include details on the new `vibevoice` provider and its features. * Enhance conversation reprocessing and job management - Introduced a new job for regenerating title and summary after memory processing to ensure fresh context is available. - Updated the reprocess_transcript and reprocess_speakers functions to enqueue title/summary jobs based on memory job dependencies, improving job chaining and execution order. - Enhanced validation for transcripts to account for provider capabilities, ensuring proper handling of diarization and segment data. 
- Improved logging for job enqueuing and processing stages, providing clearer insights into the workflow and dependencies. * Enhance Knowledge Graph integration and service management - Introduced support for Knowledge Graph functionality, enabling entity and relationship extraction from conversations using Neo4j. - Updated `services.py` to manage Knowledge Graph profiles and integrate with existing service commands. - Enhanced Docker Compose configurations to include Neo4j service and environment variables for Knowledge Graph setup. - Added new API routes and models for Knowledge Graph operations, including entity and relationship management. - Improved documentation and configuration templates to reflect the new Knowledge Graph features and setup instructions. * Add Knowledge Graph API routes and integrate into backend - Introduced new `knowledge_graph_routes.py` to handle API endpoints for managing knowledge graph entities, relationships, and promises. - Updated `__init__.py` to include the new knowledge graph router in the main router module. - Enhanced documentation to reflect the addition of knowledge graph functionality, improving clarity on available API routes and their purposes. * Update .gitignore to include individual plugin configuration files and SDK directory - Added entries for individual plugin config files to ensure user-specific settings are ignored. - Included the SDK directory in .gitignore to prevent unnecessary files from being tracked. * Fix: onboarding improvements (#287) * Enhance setup utilities and wizard functionality - Introduced `detect_tailscale_info` function to automatically retrieve Tailscale DNS name and IP address, improving user experience for service configuration. - Added `detect_cuda_version` function to identify the system's CUDA version, streamlining compatibility checks for GPU-based services.
- Updated `wizard.py` to utilize the new detection functions, enhancing service selection and configuration processes based on user input. - Improved error handling and user feedback in service setup, ensuring clearer communication during configuration steps. - Refactored existing code to improve maintainability and code reuse across setup utilities. * Update ASR service capabilities and improve speaker identification handling - Modified the capabilities of the VibeVoice ASR provider to include 'speaker_identification' and 'long_form', enhancing its feature set. - Adjusted the speaker identification logic in the VibeVoiceTranscriber to prevent double-prefixing and ensure accurate speaker representation. - Updated protocol tests to reflect the expanded list of known ASR capabilities, ensuring comprehensive validation of reported features. * Refactor audio recording controls for improved UI and functionality - Replaced MicOff icon with Square icon in MainRecordingControls and SimplifiedControls for a more intuitive user experience. - Enhanced button interactions to streamline recording start/stop actions, including a pulsing effect during recording. - Updated status messages and button states to provide clearer feedback on recording status and actions. - Improved accessibility by ensuring buttons are disabled appropriately based on recording state and microphone access. * chore:test docs and test improvements (#288) * Enhance test environment setup and configuration - Added a new interactive setup script for configuring test API keys (Deepgram, OpenAI) to streamline the testing process. - Introduced a template for the .env.test file to guide users in setting up their API keys. - Updated the Makefile to include a new 'configure' target for setting up API keys. - Enhanced the start-containers script to warn users if API keys are still set to placeholder values, improving user awareness during testing. - Updated .gitignore to include the new .env.test.template file. 
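The `detect_cuda_version` helper mentioned above could look roughly like this. The `nvcc --version` invocation and its "release X.Y" output format are assumptions about the host toolchain, not taken from the project code:

```python
import re
import shutil
import subprocess

def detect_cuda_version(version_text=None):
    """Return the CUDA release (e.g. "12.4") or None when unavailable.

    `version_text` lets callers (and tests) supply captured output directly;
    otherwise we shell out to nvcc, which is assumed to be on PATH.
    """
    if version_text is None:
        if shutil.which("nvcc") is None:
            return None  # no CUDA toolkit installed
        version_text = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True
        ).stdout
    match = re.search(r"release (\d+\.\d+)", version_text)
    return match.group(1) if match else None
```

Keeping the parsing separate from the subprocess call makes the compatibility check testable without a GPU present.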
* Remove outdated documentation and restructure feature overview - Deleted the `features.md` file, consolidating its content into the new `overview.md` for a more streamlined documentation structure. - Updated `init-system.md` to link to the new `overview.md` instead of the removed `features.md`. - Removed `ports-and-access.md` as its content was integrated into other documentation files, enhancing clarity and reducing redundancy. - Revised the `README.md` in the advanced backend to reflect the new naming conventions and updated links to documentation. - Introduced a new `plugin-development-guide.md` to assist users in creating custom plugins, expanding the documentation for developers. * tech debt * Enhance ASR service descriptions and provider feedback in wizard.py (#290) - Updated the description for the 'asr-services' to remove the specific mention of 'Parakeet', making it more general. - Improved the console output for auto-selected services to include the transcription provider label, enhancing user feedback during service selection. * pre-release v0.2 (#293) * Enhance ASR service descriptions and provider feedback in wizard.py - Updated the description for the 'asr-services' to remove the specific mention of 'Parakeet', making it more general. - Improved the console output for auto-selected services to include the transcription provider label, enhancing user feedback during service selection. * Implement LangFuse integration for observability and prompt management - Added LangFuse configuration options in the .env.template for observability and prompt management. - Introduced setup_langfuse method in ChronicleSetup to handle LangFuse initialization and configuration prompts. - Enhanced prompt management by integrating a centralized PromptRegistry for dynamic prompt retrieval and registration. - Updated various services to utilize prompts from the PromptRegistry, improving flexibility and maintainability. 
- Refactored OpenAI client initialization to support optional LangFuse tracing, enhancing observability during API interactions. - Added new prompt defaults for memory management and conversation handling, ensuring consistent behavior across the application. * Enhance LangFuse integration and service management - Added LangFuse service configuration in services.py and wizard.py, including paths, commands, and descriptions. - Implemented auto-selection for LangFuse during service setup, improving user experience. - Enhanced service startup process to display prompt management tips for LangFuse, guiding users on editing AI prompts. - Updated run_service_setup to handle LangFuse-specific parameters, including admin credentials and API keys, ensuring seamless integration with backend services. * Feat/better reprocess memory (#300) * Enhance ASR service descriptions and provider feedback in wizard.py (#290) - Updated the description for the 'asr-services' to remove the specific mention of 'Parakeet', making it more general. - Improved the console output for auto-selected services to include the transcription provider label, enhancing user feedback during service selection. * Refactor Obsidian and Knowledge Graph integration in services and setup - Removed redundant Obsidian and Knowledge Graph configuration checks from services.py, streamlining the command execution process. - Updated wizard.py to enhance user experience by setting default options for speaker recognition during service selection. - Improved Neo4j password handling in setup processes, ensuring consistent configuration prompts and feedback. - Introduced a new cron scheduler for managing scheduled tasks, enhancing the backend's automation capabilities. - Added new entity annotation features, allowing for corrections and updates to knowledge graph entities directly through the API. 
* Enhance ASR services configuration and VibeVoice integration - Added new configuration options for VibeVoice ASR in defaults.yml, including batching parameters for audio processing. - Updated Docker Compose files to mount the config directory, ensuring access to ASR service configurations. - Enhanced the VibeVoice transcriber to load configuration settings from defaults.yml, allowing for dynamic adjustments via environment variables. - Introduced quantization options for model loading in the VibeVoice transcriber, improving performance and flexibility. - Refactored the speaker identification process to streamline audio handling and improve logging for better debugging. - Updated documentation to reflect new configuration capabilities and usage instructions for the VibeVoice ASR provider. * Enhance LangFuse integration and memory reprocessing capabilities - Introduced functions for checking LangFuse configuration in services.py, ensuring proper setup for observability. - Updated wizard.py to facilitate user input for LangFuse configuration, including options for local and external setups. - Implemented memory reprocessing logic in memory services to update existing memories based on speaker re-identification. - Enhanced speaker recognition client to support per-segment identification, improving accuracy during reprocessing. - Refactored various components to streamline handling of LangFuse parameters and improve overall service management. * Enhance service management and user input handling - Updated services.py to include LangFuse configuration checks during service startup, improving observability setup. - Refactored wizard.py to utilize a masked input for Neo4j password prompts, enhancing user experience and security. - Improved cron scheduler in advanced_omi_backend to manage active tasks and validate cron expressions, ensuring robust job execution. 
- Enhanced speaker recognition client documentation to clarify user_id limitations, preparing for future multi-user support. - Updated knowledge graph routes to enforce validation on entity updates, ensuring at least one field is provided for updates. * fix: Plugin System Refactor (#301) * Refactor connect-omi.py for improved device selection and user interaction - Replaced references to the chronicle Bluetooth library with friend_lite for device management. - Removed the list_devices function and implemented a new prompt_user_to_pick_device function to enhance user interaction when selecting OMI/Neo devices. - Updated the find_and_set_omi_mac function to utilize the new device selection method, improving the overall flow of device connection. - Added a new scan_devices.py script for quick scanning of neo/neosapien devices, enhancing usability. - Updated README.md to reflect new usage instructions and prerequisites for connecting to OMI devices over Bluetooth. - Enhanced start.sh to ensure proper environment variable setup for macOS users. * Add friend-lite-sdk: Initial implementation of Python SDK for OMI/Friend Lite BLE devices - Introduced the friend-lite-sdk, a Python SDK for OMI/Friend Lite BLE devices, enabling audio streaming, button events, and transcription functionalities. - Added LICENSE and NOTICE files to clarify licensing and attribution. - Created pyproject.toml for package management, specifying dependencies and project metadata. - Developed core modules including bluetooth connection handling, button event parsing, audio decoding, and transcription capabilities. - Implemented example usage in README.md to guide users on installation and basic functionality. - Enhanced connect-omi.py to utilize the new SDK for improved device management and event handling. - Updated requirements.txt to reference the new SDK for local development. This commit lays the foundation for further enhancements and integrations with OMI devices. 
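The button-event plumbing described above might be sketched as follows. `BasePlugin.on_button_event` and `PluginRouter` are named in the commit text, while the context shape and the `EchoPlugin` example are illustrative:

```python
class BasePlugin:
    """Plugin hook surface; concrete plugins override the events they handle."""

    def on_button_event(self, context):
        pass  # default: ignore button events

class EchoPlugin(BasePlugin):
    """Toy plugin that records every button event it receives."""

    def __init__(self):
        self.events = []

    def on_button_event(self, context):
        self.events.append(context)

class PluginRouter:
    """Fans device events out to every registered plugin."""

    def __init__(self, plugins):
        self.plugins = plugins

    def route_button_event(self, state, timestamp):
        context = {"state": state, "timestamp": timestamp}
        for plugin in self.plugins:
            plugin.on_button_event(context)
```

Routing through a base-class hook keeps plugins decoupled from the websocket layer: new event types only need a new method on `BasePlugin` and a matching router entry point.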
* Enhance client state and plugin architecture for button event handling - Introduced a new `markers` list in `ClientState` to collect button event data during sessions. - Added `add_marker` method to facilitate the addition of markers to the current session. - Implemented `on_button_event` method in the `BasePlugin` class to handle device button events, providing context data for button state and timestamps. - Updated `PluginRouter` to route button events to the appropriate plugin handler. - Enhanced conversation job handling to attach markers from Redis sessions, improving the tracking of button events during conversations. * Move plugins location - Introduced the Email Summarizer plugin that automatically sends email summaries upon conversation completion. - Implemented SMTP email service for sending formatted HTML and plain text emails. - Added configuration options for SMTP settings and email content in `config.yml`. - Created setup script for easy configuration of SMTP credentials and plugin orchestration. - Enhanced documentation with usage instructions and troubleshooting tips for the plugin. - Updated existing plugin architecture to support new event handling for email summaries. * Enhance Docker Compose and Plugin Management - Added external plugins directory to Docker Compose files for better plugin management. - Updated environment variables for MongoDB and Redis services to ensure consistent behavior. - Introduced new dependencies in `uv.lock` for improved functionality. - Refactored audio processing to support various audio formats and enhance error handling. - Implemented new plugin event types and services for better integration and communication between plugins. - Enhanced conversation and session management to support new closing mechanisms and event logging. * Update audio processing and event logging - Increased the maximum event log size in PluginRouter from 200 to 1000 for improved event tracking.
- Refactored audio stream producer to dynamically read audio format from Redis session metadata, enhancing flexibility in audio handling. - Updated transcription job processing to utilize session-specific audio format settings, ensuring accurate audio processing. - Enhanced audio file writing utility to accept PCM parameters, allowing for better control over audio data handling. * Add markers list to ClientState and update timeout trigger comment - Introduced a new `markers` list in `ClientState` to track button event data during conversations. - Updated comment in `open_conversation_job` to clarify the behavior of the `timeout_triggered` variable, ensuring better understanding of session management. * Refactor audio file logging and error handling - Updated audio processing logs to consistently use the `filename` variable instead of `file.filename` for clarity. - Enhanced error logging to utilize the `filename` variable, improving traceability of issues during audio processing. - Adjusted title generation logic to handle cases where the filename is "unknown," ensuring a default title is used. - Minor refactor in conversation closing logs to use `user.user_id` for better consistency in user identification. * Enhance conversation retrieval with pagination and orphan handling - Updated `get_conversations` function to support pagination through `limit` and `offset` parameters, improving performance for large datasets. - Consolidated query logic to fetch both normal and orphan conversations in a single database call, reducing round-trips and enhancing efficiency. - Modified the response structure to include total count, limit, and offset in the returned data for better client-side handling. - Adjusted database indexing to optimize queries for paginated results, ensuring faster access to conversation data. 
* Refactor connection logging in transcribe function - Moved connection logging for the Wyoming server to a more structured format within the `transcribe_wyoming` function. - Ensured that connection attempts and successes are logged consistently for better traceability during audio transcription processes. * Feat/neo sdk (#302) * Update friend-lite-sdk for Neo1 device support and enhance documentation - Updated the friend-lite-sdk to version 0.3.0, reflecting the transition to support OMI/Neo1 BLE wearable devices. - Refactored the Bluetooth connection handling to introduce a new `WearableConnection` class, enhancing the connection lifecycle management for wearable devices. - Added a new `Neo1Connection` class for controlling Neo1 devices, including methods for sleep and wake functionalities. - Updated UUID constants to include Neo1-specific characteristics, improving device interaction capabilities. - Revised the plugin development guide to reflect changes in device naming and connection processes. - Removed outdated local OMI Bluetooth scripts and documentation to streamline the project structure and focus on wearable client development. * Refactor backend audio streaming to use Opus codec and enhance menu app functionality - Updated backend_sender.py to stream raw Opus audio instead of PCM, improving bandwidth efficiency. - Modified stream_to_backend function to handle Opus audio data and adjusted audio chunk parameters accordingly. - Enhanced main.py with new CLI commands for device scanning and connection management, improving user experience. - Introduced menu_app.py for a macOS menu bar application, providing a user-friendly interface for device management and status display. - Added README.md to document usage instructions and configuration details for the local wearable client. - Updated requirements.txt to include new dependencies for the menu app and service management. 
- Implemented service.py for managing launchd service installation and configuration on macOS, enabling auto-start on login. * Refactor audio processing and queue management in local wearable client - Removed the audio queue in favor of a dedicated BLE data queue and backend queue for improved data handling. - Enhanced the `connect_and_stream` function to streamline audio decoding and writing to the local file sink. - Updated the handling of BLE data to ensure robust queue management and error logging. - Improved task management during device disconnection to ensure proper cleanup and error handling. - Updated requirements.txt to specify a minimum version for easy_audio_interfaces, ensuring compatibility. * refactor: kitchen sink (#303) * Enhance ASR service descriptions and provider feedback in wizard.py - Updated the description for the 'asr-services' to remove the specific mention of 'Parakeet', making it more general. - Improved the console output for auto-selected services to include the transcription provider label, enhancing user feedback during service selection. * Implement LangFuse integration for observability and prompt management - Added LangFuse configuration options in the .env.template for observability and prompt management. - Introduced setup_langfuse method in ChronicleSetup to handle LangFuse initialization and configuration prompts. - Enhanced prompt management by integrating a centralized PromptRegistry for dynamic prompt retrieval and registration. - Updated various services to utilize prompts from the PromptRegistry, improving flexibility and maintainability. - Refactored OpenAI client initialization to support optional LangFuse tracing, enhancing observability during API interactions. - Added new prompt defaults for memory management and conversation handling, ensuring consistent behavior across the application. 
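A rough sketch of the dedicated BLE-data/backend queue split described above, using asyncio queues and a `None` sentinel for disconnect. The queue names follow the commit text; the packet handling is illustrative (the real client decodes Opus for the local sink while forwarding raw Opus to the backend):

```python
import asyncio

async def ble_reader(packets, ble_queue):
    """Push raw BLE packets onto the BLE queue; None signals disconnect."""
    for pkt in packets:
        await ble_queue.put(pkt)
    await ble_queue.put(None)

async def fan_out(ble_queue, backend_queue, sink):
    """Drain the BLE queue: write to the local sink and forward to backend."""
    while (pkt := await ble_queue.get()) is not None:
        sink.append(pkt)              # local file sink (decoded in the real client)
        await backend_queue.put(pkt)  # forwarded upstream
    await backend_queue.put(None)     # propagate disconnect downstream
```

Propagating the sentinel lets each consumer shut down cleanly on disconnect instead of being cancelled mid-write.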
* Enhance LangFuse integration and service management - Added LangFuse service configuration in services.py and wizard.py, including paths, commands, and descriptions. - Implemented auto-selection for LangFuse during service setup, improving user experience. - Enhanced service startup process to display prompt management tips for LangFuse, guiding users on editing AI prompts. - Updated run_service_setup to handle LangFuse-specific parameters, including admin credentials and API keys, ensuring seamless integration with backend services. * Feat/better reprocess memory (#300) * Enhance ASR service descriptions and provider feedback in wizard.py (#290) - Updated the description for the 'asr-services' to remove the specific mention of 'Parakeet', making it more general. - Improved the console output for auto-selected services to include the transcription provider label, enhancing user feedback during service selection. * Refactor Obsidian and Knowledge Graph integration in services and setup - Removed redundant Obsidian and Knowledge Graph configuration checks from services.py, streamlining the command execution process. - Updated wizard.py to enhance user experience by setting default options for speaker recognition during service selection. - Improved Neo4j password handling in setup processes, ensuring consistent configuration prompts and feedback. - Introduced a new cron scheduler for managing scheduled tasks, enhancing the backend's automation capabilities. - Added new entity annotation features, allowing for corrections and updates to knowledge graph entities directly through the API. * Enhance ASR services configuration and VibeVoice integration - Added new configuration options for VibeVoice ASR in defaults.yml, including batching parameters for audio processing. - Updated Docker Compose files to mount the config directory, ensuring access to ASR service configurations. 
- Enhanced the VibeVoice transcriber to load configuration settings from defaults.yml, allowing for dynamic adjustments via environment variables. - Introduced quantization options for model loading in the VibeVoice transcriber, improving performance and flexibility. - Refactored the speaker identification process to streamline audio handling and improve logging for better debugging. - Updated documentation to reflect new configuration capabilities and usage instructions for the VibeVoice ASR provider. * Enhance LangFuse integration and memory reprocessing capabilities - Introduced functions for checking LangFuse configuration in services.py, ensuring proper setup for observability. - Updated wizard.py to facilitate user input for LangFuse configuration, including options for local and external setups. - Implemented memory reprocessing logic in memory services to update existing memories based on speaker re-identification. - Enhanced speaker recognition client to support per-segment identification, improving accuracy during reprocessing. - Refactored various components to streamline handling of LangFuse parameters and improve overall service management. * Enhance service management and user input handling - Updated services.py to include LangFuse configuration checks during service startup, improving observability setup. - Refactored wizard.py to utilize a masked input for Neo4j password prompts, enhancing user experience and security. - Improved cron scheduler in advanced_omi_backend to manage active tasks and validate cron expressions, ensuring robust job execution. - Enhanced speaker recognition client documentation to clarify user_id limitations, preparing for future multi-user support. - Updated knowledge graph routes to enforce validation on entity updates, ensuring at least one field is provided for updates. 
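The centralized `PromptRegistry` mentioned above — code-registered defaults that an external store such as LangFuse can override at runtime — can be sketched roughly like this. The method names and the `memory.extract` key are illustrative assumptions, not the repository's actual API:

```python
class PromptRegistry:
    """Central prompt store: defaults registered in code, optionally
    overridden at runtime (e.g. by a LangFuse-managed prompt)."""

    def __init__(self) -> None:
        self._defaults: dict[str, str] = {}
        self._overrides: dict[str, str] = {}

    def register_default(self, name: str, template: str) -> None:
        self._defaults[name] = template

    def override(self, name: str, template: str) -> None:
        # An external prompt-management service would call this on sync.
        self._overrides[name] = template

    def get(self, name: str, **kwargs) -> str:
        # Overrides win; fall back to the in-code default.
        template = self._overrides.get(name) or self._defaults.get(name)
        if template is None:
            raise KeyError(f"unknown prompt: {name}")
        return template.format(**kwargs)

registry = PromptRegistry()
registry.register_default("memory.extract", "Extract memories from: {transcript}")
```

Services then call `registry.get("memory.extract", transcript=...)` instead of embedding prompt strings inline, which is what makes swapping prompts without a redeploy possible.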
* fix: Plugin System Refactor (#301) * Refactor connect-omi.py for improved device selection and user interaction - Replaced references to the chronicle Bluetooth library with friend_lite for device management. - Removed the list_devices function and implemented a new prompt_user_to_pick_device function to enhance user interaction when selecting OMI/Neo devices. - Updated the find_and_set_omi_mac function to utilize the new device selection method, improving the overall flow of device connection. - Added a new scan_devices.py script for quick scanning of neo/neosapien devices, enhancing usability. - Updated README.md to reflect new usage instructions and prerequisites for connecting to OMI devices over Bluetooth. - Enhanced start.sh to ensure proper environment variable setup for macOS users. * Add friend-lite-sdk: Initial implementation of Python SDK for OMI/Friend Lite BLE devices - Introduced the friend-lite-sdk, a Python SDK for OMI/Friend Lite BLE devices, enabling audio streaming, button events, and transcription functionalities. - Added LICENSE and NOTICE files to clarify licensing and attribution. - Created pyproject.toml for package management, specifying dependencies and project metadata. - Developed core modules including bluetooth connection handling, button event parsing, audio decoding, and transcription capabilities. - Implemented example usage in README.md to guide users on installation and basic functionality. - Enhanced connect-omi.py to utilize the new SDK for improved device management and event handling. - Updated requirements.txt to reference the new SDK for local development. This commit lays the foundation for further enhancements and integrations with OMI devices. * Enhance client state and plugin architecture for button event handling - Introduced a new `markers` list in `ClientState` to collect button event data during sessions. - Added `add_marker` method to facilitate the addition of markers to the current session. 
- Implemented `on_button_event` method in the `BasePlugin` class to handle device button events, providing context data for button state and timestamps. - Updated `PluginRouter` to route button events to the appropriate plugin handler. - Enhanced conversation job handling to attach markers from Redis sessions, improving the tracking of button events during conversations. * Move plugins location - Introduced the Email Summarizer plugin that automatically sends email summaries upon conversation completion. - Implemented SMTP email service for sending formatted HTML and plain text emails. - Added configuration options for SMTP settings and email content in `config.yml`. - Created setup script for easy configuration of SMTP credentials and plugin orchestration. - Enhanced documentation with usage instructions and troubleshooting tips for the plugin. - Updated existing plugin architecture to support new event handling for email summaries. * Enhance Docker Compose and Plugin Management - Added external plugins directory to Docker Compose files for better plugin management. - Updated environment variables for MongoDB and Redis services to ensure consistent behavior. - Introduced new dependencies in `uv.lock` for improved functionality. - Refactored audio processing to support various audio formats and enhance error handling. - Implemented new plugin event types and services for better integration and communication between plugins. - Enhanced conversation and session management to support new closing mechanisms and event logging. * Update audio processing and event logging - Increased the maximum event log size in PluginRouter from 200 to 1000 for improved event tracking. - Refactored audio stream producer to dynamically read audio format from Redis session metadata, enhancing flexibility in audio handling. - Updated transcription job processing to utilize session-specific audio format settings, ensuring accurate audio processing.
- Enhanced audio file writing utility to accept PCM parameters, allowing for better control over audio data handling. * Add markers list to ClientState and update timeout trigger comment - Introduced a new `markers` list in `ClientState` to track button event data during conversations. - Updated comment in `open_conversation_job` to clarify the behavior of the `timeout_triggered` variable, ensuring better understanding of session management. * Refactor audio file logging and error handling - Updated audio processing logs to consistently use the `filename` variable instead of `file.filename` for clarity. - Enhanced error logging to utilize the `filename` variable, improving traceability of issues during audio processing. - Adjusted title generation logic to handle cases where the filename is "unknown," ensuring a default title is used. - Minor refactor in conversation closing logs to use `user.user_id` for better consistency in user identification. * Enhance conversation retrieval with pagination and orphan handling - Updated `get_conversations` function to support pagination through `limit` and `offset` parameters, improving performance for large datasets. - Consolidated query logic to fetch both normal and orphan conversations in a single database call, reducing round-trips and enhancing efficiency. - Modified the response structure to include total count, limit, and offset in the returned data for better client-side handling. - Adjusted database indexing to optimize queries for paginated results, ensuring faster access to conversation data. * Refactor connection logging in transcribe function - Moved connection logging for the Wyoming server to a more structured format within the `transcribe_wyoming` function. - Ensured that connection attempts and successes are logged consistently for better traceability during audio transcription processes. 
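The paginated `get_conversations` change above returns the page together with total count, limit, and offset so the client can render paging controls from a single call. A minimal sketch of that response shape, using a plain list in place of the MongoDB query (the field names are assumptions for illustration):

```python
def paginate_conversations(conversations, limit=20, offset=0):
    """Return one page of conversations plus the metadata the client
    needs for paging controls, in a single call."""
    # Newest first, matching typical conversation-list ordering.
    ordered = sorted(conversations, key=lambda c: c["created_at"], reverse=True)
    page = ordered[offset:offset + limit]
    return {
        "conversations": page,
        "total": len(ordered),
        "limit": limit,
        "offset": offset,
    }

convs = [{"id": i, "created_at": i} for i in range(5)]
result = paginate_conversations(convs, limit=2, offset=2)
# result["conversations"] holds the items with id 2 and 1; result["total"] is 5
```

In the real backend the sort/skip/limit and the total count would come from one aggregation pipeline (a `$facet` stage is the usual trick), which is what reduces the round-trips the commit mentions.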
* Refactor configuration management and enhance plugin architecture - Replaced PyYAML with ruamel.yaml for improved YAML handling, preserving quotes and enhancing configuration loading. - Updated ConfigManager to utilize ruamel.yaml for loading and saving configuration files, ensuring better error handling and validation. - Enhanced service startup messages to display access URLs for backend services, improving user experience. - Introduced new plugin health tracking in PluginRouter, allowing for better monitoring of plugin initialization and error states. - Refactored audio stream client and conversation management to streamline audio processing and improve error handling. - Updated Docker and requirements configurations to include ruamel.yaml, ensuring compatibility across environments. * refactor clean up script * cleanup partial mycelia integration * Refactor configuration management and remove Mycelia integration - Updated ConfigManager to remove references to the Mycelia memory provider, simplifying the memory provider options to only include "chronicle" and "openmemory_mcp". - Cleaned up Makefile by removing Mycelia-related targets and help descriptions, streamlining the build process. - Enhanced cleanup script documentation for clarity on usage and options. - Introduced LLM operation configurations to improve model management and prompt optimization capabilities. * Refactor Docker and cleanup scripts to remove 'uv' command usage - Updated cleanup.sh to directly execute the Python script without 'uv' command. - Modified Docker Compose files to remove 'uv run' from service commands, simplifying execution. - Enhanced start.sh to reflect changes in command usage and improve clarity in usage instructions. - Introduced a new transcription job timeout configuration in the backend, allowing for dynamic timeout settings. - Added insert annotation functionality in the API, enabling users to insert new segments in conversations. 
- Implemented memory retrieval for conversations, enhancing the ability to fetch related memories. - Improved error handling and logging across various modules for better traceability and debugging. * Add backend worker health check and job clearing functionality - Introduced a new function `get_backend_worker_health` to retrieve health metrics from the backend's /health endpoint, including worker count and queue status. - Updated `show_quick_status` to display worker health information, alerting users to potential issues with registered workers. - Added a new API endpoint `/jobs` to allow admin users to clear finished and failed jobs from all queues, enhancing job management capabilities. - Updated the frontend Queue component to include a button for clearing jobs, improving user interaction and management of job statuses. * Update plugin event descriptions and refactor event handling - Reduced redundancy by embedding descriptions directly within the PluginEvent enum, enhancing clarity and maintainability. - Removed the EVENT_DESCRIPTIONS dictionary, streamlining the event handling process in the plugin assistant. - Updated references in the plugin assistant to utilize the new description attributes, ensuring consistent event metadata usage. * update lock file * webui fix * Enhance speaker recognition error handling and reporting - Introduced error counting and detailed error status reporting in the SpeakerRecognitionClient, allowing for better tracking of identification failures. - Updated the result structure to include error messages when all identification requests fail, improving user feedback on service health. - Adjusted the speaker_jobs worker to incorporate partial error reporting in metadata, enhancing the overall robustness of speaker recognition processes. - Updated Dockerfile to use a newer Python base image for improved compatibility. 
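The `get_backend_worker_health` check described above boils down to reading worker count and queue backlog out of the `/health` payload and warning when no workers are registered. A small sketch of that summarization step — the payload shape here is an assumption, not the backend's documented schema:

```python
def summarize_worker_health(health: dict) -> str:
    """Turn a /health payload into a one-line status, warning when no
    workers are registered (queued jobs would otherwise sit forever)."""
    workers = health.get("workers", {}).get("count", 0)
    queued = sum(q.get("queued", 0) for q in health.get("queues", {}).values())
    if workers == 0:
        return f"WARNING: no registered workers, {queued} job(s) queued"
    return f"OK: {workers} worker(s), {queued} job(s) queued"

sample = {"workers": {"count": 0}, "queues": {"transcription": {"queued": 3}}}
print(summarize_worker_health(sample))
# WARNING: no registered workers, 3 job(s) queued
```

A `show_quick_status`-style caller would fetch the endpoint with an HTTP client and pass the decoded JSON straight into a function like this.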
* Refactor EmailSummarizerPlugin to utilize Beanie model for conversation retrieval - Updated conversation fetching logic to use the Beanie model instead of direct database queries, enhancing code clarity and maintainability. - Simplified transcript and summary retrieval by leveraging computed properties from the Conversation model. - Improved error handling for missing conversations, transcripts, and summaries, ensuring better user feedback during email processing. * Refactor EmailSummarizerPlugin to remove database dependency and streamline email handling - Eliminated MongoDB database handle and related user email retrieval logic, simplifying the email sending process. - Updated user email acquisition to directly use the configured SMTP username, enhancing clarity and reducing potential errors. - Improved error messaging for missing SMTP configuration, ensuring better feedback during email delivery attempts. * Refactor transcription job handling and email summarization plugin - Removed the title and summary generation logic from the transcription job, delegating this responsibility to the `generate_title_summary_job` after speaker recognition. - Updated the `EmailSummarizerPlugin` to trigger email summaries after all conversation processing is complete, ensuring final titles and summaries are included. - Enhanced error handling and logging for the email summarization process, improving feedback during email delivery attempts. * Enhance BLE device scanning and connection management - Refactored the device scanning logic to return all matching known or auto-discovered devices, improving flexibility in device selection. - Introduced a new interactive prompt for users to select from multiple discovered devices, enhancing user experience. - Updated the connection handling to support specific MAC address targeting, allowing for more precise device management. 
- Improved the backend streaming URI construction by URL-encoding the device name, ensuring compatibility with special characters. * Add device configuration template and enhance BLE device management - Introduced a new `devices.yml.template` for configuring known wearable devices, facilitating easier setup for users. - Updated the main application to automatically create a `devices.yml` from the template if it doesn't exist, improving user experience. - Enhanced BLE scanning logic to utilize advertisement data for better device identification and management. - Implemented functionality to save the last connected device's MAC address in the configuration, allowing for seamless reconnections. - Minor adjustments to the menu application for improved logging and user feedback during device management. * Implement button event handling and plugin connectivity checks - Added support for handling button events in the websocket controller, allowing for real-time interaction with button states. - Introduced a health check method in the BasePlugin class to verify connectivity with external services, enhancing plugin reliability. - Implemented a connectivity check endpoint in the system routes to provide live status updates for all initialized plugins. - Updated the PluginRouter to handle button events and ensure they bypass transcript-based conditions for execution. - Enhanced the frontend to display connectivity status for plugins, improving user experience and monitoring capabilities. * Refactor conversation search logic to use pymongo collection - Updated the conversation search function to utilize the pymongo collection instead of the motor collection, improving database interaction consistency. - Adjusted the match filter to ensure proper querying of non-deleted conversations during full-text search operations. 
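URL-encoding the device name when building the streaming URI, as the commit above describes, is a one-liner with the standard library; the URI path and query parameter name here are hypothetical, only the encoding call is the point:

```python
from urllib.parse import quote

def streaming_uri(base: str, device_name: str) -> str:
    # safe="" also escapes "/", so a name like "Neo/2" stays a single
    # query value instead of being read as an extra path segment.
    return f"{base}/ws/audio?device={quote(device_name, safe='')}"

print(streaming_uri("ws://localhost:8000", "Neo Device #1"))
# ws://localhost:8000/ws/audio?device=Neo%20Device%20%231
```

Without the encoding, a `#` in a device name would be treated as a fragment delimiter and silently truncate the query.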
* local-client-improvements * Add battery level monitoring to BLE connections - Introduced battery level UUIDs and methods for reading and subscribing to battery level notifications in the Bluetooth connection class. - Updated the connection handling to log and manage battery level updates, enhancing user awareness of device status. - Modified the main application to display battery level information in the connection status, improving user experience. * Rework sink and Qwen streaming pipeline, plus assorted fixes * Enhance conversation management and star functionality - Added support for starring conversations, allowing users to mark important discussions. - Updated conversation model to include 'starred' status and timestamp. - Implemented API endpoints for toggling star status and retrieving starred conversations. - Enhanced conversation retrieval logic to support filtering by starred status and sorting options. - Improved frontend components to display and manage starred conversations effectively. * Add installation script and update README for setup instructions - Introduced a new `install.sh` script to automate the installation of the Chronicle application, including cloning the latest release and installing dependencies. - Updated the README to include a quick start command for running the installation script, enhancing user onboarding experience.
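The star functionality above pairs a boolean `starred` flag with a timestamp on the conversation model. A sketch of the toggle logic the API endpoint would wrap — the field names `starred` and `starred_at` are assumptions for illustration:

```python
from datetime import datetime, timezone

def toggle_star(conversation: dict) -> dict:
    """Flip the 'starred' flag; stamp the time on star, clear it on unstar."""
    now_starred = not conversation.get("starred", False)
    conversation["starred"] = now_starred
    conversation["starred_at"] = (
        datetime.now(timezone.utc).isoformat() if now_starred else None
    )
    return conversation
```

Keeping the timestamp alongside the flag is what lets the retrieval endpoint sort starred conversations by when they were starred rather than when they were created.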
--------- Co-authored-by: 01PrathamS Co-authored-by: Stu Alexandere Co-authored-by: Stuart Alexander Co-authored-by: 0xrushi <6279035+0xrushi@users.noreply.github.com> Co-authored-by: AJASU Co-authored-by: Roshan John <63011948+roshatron2@users.noreply.github.com> Co-authored-by: roshan.john Co-authored-by: Pratham Savaliya <103353318+01PrathamS@users.noreply.github.com> --- README.md | 10 ++++++++-- install.sh | 34 ++++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+), 2 deletions(-) create mode 100755 install.sh diff --git a/README.md b/README.md index 7e342210..76387aa7 100644 --- a/README.md +++ b/README.md @@ -2,9 +2,15 @@ Self-hostable AI system that captures audio/video data from OMI devices and other sources to generate memories, action items, and contextual insights about your conversations and daily interactions. -## Quick Start → [Get Started](quickstart.md) +## Quick Start -Run setup wizard, start services, access at http://localhost:5173 +```bash +curl -fsSL https://raw.githubusercontent.com/SimpleOpenSoftware/chronicle/main/install.sh | sh +``` + +This clones the latest release, installs dependencies, and launches the interactive setup wizard. + +For step-by-step instructions, see the [setup guide](quickstart.md). ## Screenshots diff --git a/install.sh b/install.sh new file mode 100755 index 00000000..8baf78e0 --- /dev/null +++ b/install.sh @@ -0,0 +1,34 @@ +#!/bin/sh +set -e + +REPO="https://github.com/SimpleOpenSoftware/chronicle.git" +DIR="chronicle" + +# Get latest release tag +TAG=$(curl -sL https://api.github.com/repos/SimpleOpenSoftware/chronicle/releases/latest | grep -o '"tag_name": *"[^"]*"' | head -1 | cut -d'"' -f4) + +if [ -z "$TAG" ]; then + echo "error: could not determine latest release" + exit 1 +fi + +echo "Installing Chronicle $TAG..." + +if [ -d "$DIR" ]; then + echo "error: directory '$DIR' already exists" + exit 1 +fi + +git clone --depth 1 --branch "$TAG" "$REPO" "$DIR" +cd "$DIR" + +# Install uv if missing +if ! 
command -v uv > /dev/null 2>&1; then + echo "Installing uv package manager..." + curl -LsSf https://astral.sh/uv/install.sh | sh + . "$HOME/.local/bin/env" 2>/dev/null || export PATH="$HOME/.local/bin:$PATH" +fi + +# Reconnect stdin for interactive wizard +exec < /dev/tty +./wizard.sh From 86ec51596cdf6d9d035b86515d936e2af15020e2 Mon Sep 17 00:00:00 2001 From: 0xrushi <0xrushi@gmail.com> Date: Wed, 25 Feb 2026 22:28:24 -0500 Subject: [PATCH 2/2] wizard update --- backends/advanced/init.py | 117 ++++++++--- tests/unit/test_wizard_defaults.py | 259 ++++++++++++++++++++++++ wizard.py | 304 +++++++++++++++++++++++++---- 3 files changed, 609 insertions(+), 71 deletions(-) create mode 100644 tests/unit/test_wizard_defaults.py diff --git a/backends/advanced/init.py b/backends/advanced/init.py index a1448876..da9b61d7 100644 --- a/backends/advanced/init.py +++ b/backends/advanced/init.py @@ -279,7 +279,8 @@ def setup_transcription(self): elif choice == "2": self.console.print("[blue][INFO][/blue] Offline Parakeet ASR selected") - parakeet_url = self.prompt_value("Parakeet ASR URL", "http://host.docker.internal:8767") + existing_parakeet_url = read_env_value('.env', 'PARAKEET_ASR_URL') or "http://host.docker.internal:8767" + parakeet_url = self.prompt_value("Parakeet ASR URL", existing_parakeet_url) # Write URL to .env for ${PARAKEET_ASR_URL} placeholder in config.yml self.config["PARAKEET_ASR_URL"] = parakeet_url @@ -293,7 +294,8 @@ def setup_transcription(self): elif choice == "3": self.console.print("[blue][INFO][/blue] Offline VibeVoice ASR selected (built-in speaker diarization)") - vibevoice_url = self.prompt_value("VibeVoice ASR URL", "http://host.docker.internal:8767") + existing_vibevoice_url = read_env_value('.env', 'VIBEVOICE_ASR_URL') or "http://host.docker.internal:8767" + vibevoice_url = self.prompt_value("VibeVoice ASR URL", existing_vibevoice_url) # Write URL to .env for ${VIBEVOICE_ASR_URL} placeholder in config.yml self.config["VIBEVOICE_ASR_URL"] = 
vibevoice_url @@ -308,7 +310,9 @@ def setup_transcription(self): elif choice == "4": self.console.print("[blue][INFO][/blue] Qwen3-ASR selected (52 languages, streaming + batch via vLLM)") - qwen3_url = self.prompt_value("Qwen3-ASR URL", "http://host.docker.internal:8767") + existing_qwen3_url_raw = read_env_value('.env', 'QWEN3_ASR_URL') + existing_qwen3_url = f"http://{existing_qwen3_url_raw}" if existing_qwen3_url_raw else "http://host.docker.internal:8767" + qwen3_url = self.prompt_value("Qwen3-ASR URL", existing_qwen3_url) # Write URL to .env for ${QWEN3_ASR_URL} placeholder in config.yml self.config["QWEN3_ASR_URL"] = qwen3_url.replace("http://", "").rstrip("/") @@ -429,18 +433,32 @@ def setup_streaming_provider(self): def setup_llm(self): """Configure LLM provider - updates config.yml and .env""" - self.print_section("LLM Provider Configuration") - - self.console.print("[blue][INFO][/blue] LLM configuration will be saved to config.yml") - self.console.print() + # Check if LLM provider was provided via command line (from wizard.py) + if hasattr(self.args, 'llm_provider') and self.args.llm_provider: + provider = self.args.llm_provider + self.console.print(f"[green]✅[/green] LLM provider: {provider} (configured via wizard)") + choice = {"openai": "1", "ollama": "2", "none": "3"}.get(provider, "1") + else: + # Standalone init.py run — read existing config as default + existing_choice = "1" + full_config = self.config_manager.get_full_config() + existing_llm = full_config.get("defaults", {}).get("llm", "") + if existing_llm == "local-llm": + existing_choice = "2" + elif existing_llm == "openai-llm": + existing_choice = "1" + + self.print_section("LLM Provider Configuration") + self.console.print("[blue][INFO][/blue] LLM configuration will be saved to config.yml") + self.console.print() - choices = { - "1": "OpenAI (GPT-4, GPT-3.5 - requires API key)", - "2": "Ollama (local models - runs locally)", - "3": "Skip (no memory extraction)" - } + choices = { + "1": 
"OpenAI (GPT-4, GPT-3.5 - requires API key)", + "2": "Ollama (local models - runs locally)", + "3": "Skip (no memory extraction)" + } - choice = self.prompt_choice("Which LLM provider will you use?", choices, "1") + choice = self.prompt_choice("Which LLM provider will you use?", choices, existing_choice) if choice == "1": self.console.print("[blue][INFO][/blue] OpenAI selected") @@ -481,14 +499,27 @@ def setup_llm(self): def setup_memory(self): """Configure memory provider - updates config.yml""" - self.print_section("Memory Storage Configuration") + # Check if memory provider was provided via command line (from wizard.py) + if hasattr(self.args, 'memory_provider') and self.args.memory_provider: + provider = self.args.memory_provider + self.console.print(f"[green]✅[/green] Memory provider: {provider} (configured via wizard)") + choice = {"chronicle": "1", "openmemory_mcp": "2"}.get(provider, "1") + else: + # Standalone init.py run — read existing config as default + existing_choice = "1" + full_config = self.config_manager.get_full_config() + existing_provider = full_config.get("memory", {}).get("provider", "chronicle") + if existing_provider == "openmemory_mcp": + existing_choice = "2" - choices = { - "1": "Chronicle Native (Qdrant + custom extraction)", - "2": "OpenMemory MCP (cross-client compatible, external server)", - } + self.print_section("Memory Storage Configuration") - choice = self.prompt_choice("Choose your memory storage backend:", choices, "1") + choices = { + "1": "Chronicle Native (Qdrant + custom extraction)", + "2": "OpenMemory MCP (cross-client compatible, external server)", + } + + choice = self.prompt_choice("Choose your memory storage backend:", choices, existing_choice) if choice == "1": self.console.print("[blue][INFO][/blue] Chronicle Native memory provider selected") @@ -575,21 +606,30 @@ def setup_neo4j(self): def setup_obsidian(self): """Configure Obsidian integration (optional feature flag only - Neo4j credentials handled by 
setup_neo4j)""" - if hasattr(self.args, 'enable_obsidian') and self.args.enable_obsidian: + has_enable = hasattr(self.args, 'enable_obsidian') and self.args.enable_obsidian + has_disable = hasattr(self.args, 'no_obsidian') and self.args.no_obsidian + + if has_enable: enable_obsidian = True self.console.print(f"[green]✅[/green] Obsidian: enabled (configured via wizard)") + elif has_disable: + enable_obsidian = False + self.console.print(f"[blue][INFO][/blue] Obsidian: disabled (configured via wizard)") else: - # Interactive prompt (fallback) + # Standalone init.py run — read existing config as default + full_config = self.config_manager.get_full_config() + existing_enabled = full_config.get("memory", {}).get("obsidian", {}).get("enabled", False) + self.console.print() self.console.print("[bold cyan]Obsidian Integration (Optional)[/bold cyan]") self.console.print("Enable graph-based knowledge management for Obsidian vault notes") self.console.print() try: - enable_obsidian = Confirm.ask("Enable Obsidian integration?", default=False) + enable_obsidian = Confirm.ask("Enable Obsidian integration?", default=existing_enabled) except EOFError: - self.console.print("Using default: No") - enable_obsidian = False + self.console.print(f"Using default: {'Yes' if existing_enabled else 'No'}") + enable_obsidian = existing_enabled if enable_obsidian: self.config_manager.update_memory_config({ @@ -612,19 +652,30 @@ def setup_obsidian(self): def setup_knowledge_graph(self): """Configure Knowledge Graph (Neo4j-based entity/relationship extraction - enabled by default)""" - if hasattr(self.args, 'enable_knowledge_graph') and self.args.enable_knowledge_graph: + has_enable = hasattr(self.args, 'enable_knowledge_graph') and self.args.enable_knowledge_graph + has_disable = hasattr(self.args, 'no_knowledge_graph') and self.args.no_knowledge_graph + + if has_enable: enable_kg = True + self.console.print(f"[green]✅[/green] Knowledge Graph: enabled (configured via wizard)") + elif 
has_disable: + enable_kg = False + self.console.print(f"[blue][INFO][/blue] Knowledge Graph: disabled (configured via wizard)") else: + # Standalone init.py run — read existing config as default + full_config = self.config_manager.get_full_config() + existing_enabled = full_config.get("memory", {}).get("knowledge_graph", {}).get("enabled", True) + self.console.print() self.console.print("[bold cyan]Knowledge Graph (Entity Extraction)[/bold cyan]") self.console.print("Extract people, places, organizations, events, and tasks from conversations") self.console.print() try: - enable_kg = Confirm.ask("Enable Knowledge Graph?", default=True) + enable_kg = Confirm.ask("Enable Knowledge Graph?", default=existing_enabled) except EOFError: - self.console.print("Using default: Yes") - enable_kg = True + self.console.print(f"Using default: {'Yes' if existing_enabled else 'No'}") + enable_kg = existing_enabled if enable_kg: self.config_manager.update_memory_config({ @@ -1041,6 +1092,16 @@ def main(): parser.add_argument("--streaming-provider", choices=["deepgram", "smallest", "qwen3-asr"], help="Streaming provider when different from batch (enables batch re-transcription)") + parser.add_argument("--llm-provider", + choices=["openai", "ollama", "none"], + help="LLM provider for memory extraction (default: prompt user)") + parser.add_argument("--memory-provider", + choices=["chronicle", "openmemory_mcp"], + help="Memory storage backend (default: prompt user)") + parser.add_argument("--no-obsidian", action="store_true", + help="Explicitly disable Obsidian integration (complementary to --enable-obsidian)") + parser.add_argument("--no-knowledge-graph", action="store_true", + help="Explicitly disable Knowledge Graph (complementary to --enable-knowledge-graph)") args = parser.parse_args() diff --git a/tests/unit/test_wizard_defaults.py b/tests/unit/test_wizard_defaults.py new file mode 100644 index 00000000..3d4a1f45 --- /dev/null +++ b/tests/unit/test_wizard_defaults.py @@ -0,0 +1,259 
@@ +"""Test wizard.py helper functions for loading previous config as defaults. + +Tests for the functions that read config/config.yml to pre-populate wizard +prompts with previously-configured values, so re-runs default to existing +settings. +""" + +import pytest +import yaml +from pathlib import Path +from unittest.mock import patch, MagicMock + + +# --------------------------------------------------------------------------- +# Import the pure helper functions directly from wizard.py. +# wizard.py lives at the project root, not inside a package, so we import +# via importlib with an explicit path to avoid adding the root to sys.path +# permanently. +# --------------------------------------------------------------------------- + +import importlib.util +import sys + +WIZARD_PATH = Path(__file__).parent.parent.parent / "wizard.py" +PROJECT_ROOT = str(WIZARD_PATH.parent) + + +def _load_wizard(): + # wizard.py and setup_utils.py both live in the project root. + # Add the root to sys.path so the relative import resolves. 
+    if PROJECT_ROOT not in sys.path:
+        sys.path.insert(0, PROJECT_ROOT)
+    spec = importlib.util.spec_from_file_location("wizard", WIZARD_PATH)
+    mod = importlib.util.module_from_spec(spec)
+    spec.loader.exec_module(mod)
+    return mod
+
+
+# Load once and reuse
+_wizard = _load_wizard()
+
+read_config_yml = _wizard.read_config_yml
+get_existing_stt_provider = _wizard.get_existing_stt_provider
+get_existing_stream_provider = _wizard.get_existing_stream_provider
+select_llm_provider = _wizard.select_llm_provider
+select_memory_provider = _wizard.select_memory_provider
+select_knowledge_graph = _wizard.select_knowledge_graph
+
+
+# ---------------------------------------------------------------------------
+# read_config_yml
+# ---------------------------------------------------------------------------
+
+def test_read_config_yml_missing_file(tmp_path, monkeypatch):
+    """Returns empty dict when config/config.yml does not exist."""
+    monkeypatch.chdir(tmp_path)
+    result = read_config_yml()
+    assert result == {}
+
+
+def test_read_config_yml_valid_file(tmp_path, monkeypatch):
+    """Parses and returns dict from a valid YAML file."""
+    monkeypatch.chdir(tmp_path)
+    config_dir = tmp_path / "config"
+    config_dir.mkdir()
+    (config_dir / "config.yml").write_text(
+        "defaults:\n  llm: openai-llm\n  stt: stt-deepgram\n"
+    )
+    result = read_config_yml()
+    assert result["defaults"]["llm"] == "openai-llm"
+    assert result["defaults"]["stt"] == "stt-deepgram"
+
+
+def test_read_config_yml_empty_file(tmp_path, monkeypatch):
+    """Returns empty dict for an empty YAML file (yaml.safe_load returns None)."""
+    monkeypatch.chdir(tmp_path)
+    config_dir = tmp_path / "config"
+    config_dir.mkdir()
+    (config_dir / "config.yml").write_text("")
+    result = read_config_yml()
+    assert result == {}
+
+
+def test_read_config_yml_comment_only_file(tmp_path, monkeypatch):
+    """Returns empty dict when the file contains only YAML comments."""
+    monkeypatch.chdir(tmp_path)
+    config_dir = tmp_path / "config"
+    config_dir.mkdir()
+    (config_dir / "config.yml").write_text("# just a comment\n")
+    result = read_config_yml()
+    assert result == {}
+
+
+# ---------------------------------------------------------------------------
+# get_existing_stt_provider
+# ---------------------------------------------------------------------------
+
+@pytest.mark.parametrize("stt_value, expected", [
+    ("stt-deepgram", "deepgram"),
+    ("stt-deepgram-stream", "deepgram"),
+    ("stt-parakeet-batch", "parakeet"),
+    ("stt-vibevoice", "vibevoice"),
+    ("stt-qwen3-asr", "qwen3-asr"),
+    ("stt-smallest", "smallest"),
+    ("stt-smallest-stream", "smallest"),
+])
+def test_get_existing_stt_provider_known_values(stt_value, expected):
+    """Maps known config.yml stt values to wizard provider names."""
+    config = {"defaults": {"stt": stt_value}}
+    assert get_existing_stt_provider(config) == expected
+
+
+def test_get_existing_stt_provider_unknown_returns_none():
+    """Returns None for unknown stt values (e.g. custom providers)."""
+    config = {"defaults": {"stt": "stt-unknown-provider"}}
+    assert get_existing_stt_provider(config) is None
+
+
+def test_get_existing_stt_provider_missing_key():
+    """Returns None when defaults.stt key is absent."""
+    assert get_existing_stt_provider({}) is None
+    assert get_existing_stt_provider({"defaults": {}}) is None
+
+
+# ---------------------------------------------------------------------------
+# get_existing_stream_provider
+# ---------------------------------------------------------------------------
+
+@pytest.mark.parametrize("stt_stream_value, expected", [
+    ("stt-deepgram-stream", "deepgram"),
+    ("stt-smallest-stream", "smallest"),
+    ("stt-qwen3-asr", "qwen3-asr"),
+    ("stt-qwen3-asr-stream", "qwen3-asr"),
+])
+def test_get_existing_stream_provider_known_values(stt_stream_value, expected):
+    """Maps known config.yml stt_stream values to wizard streaming provider names."""
+    config = {"defaults": {"stt_stream": stt_stream_value}}
+    assert get_existing_stream_provider(config) == expected
+
+
+def test_get_existing_stream_provider_unknown_returns_none():
+    """Returns None for unknown stt_stream values."""
+    config = {"defaults": {"stt_stream": "stt-unknown"}}
+    assert get_existing_stream_provider(config) is None
+
+
+def test_get_existing_stream_provider_missing_key():
+    """Returns None when defaults.stt_stream is absent."""
+    assert get_existing_stream_provider({}) is None
+    assert get_existing_stream_provider({"defaults": {}}) is None
+
+
+# ---------------------------------------------------------------------------
+# select_llm_provider — test default resolution logic via EOFError path
+# ---------------------------------------------------------------------------
+
+def _select_llm_with_eof(config_yml):
+    """Drive select_llm_provider in non-interactive mode by injecting EOFError."""
+    with patch.object(_wizard, "Prompt") as mock_prompt:
+        mock_prompt.ask.side_effect = EOFError
+        return select_llm_provider(config_yml)
+
+
+def test_select_llm_provider_defaults_to_openai_when_no_config():
+    """Defaults to openai when config is empty."""
+    result = _select_llm_with_eof({})
+    assert result == "openai"
+
+
+def test_select_llm_provider_defaults_to_openai_for_openai_llm():
+    """Picks openai when existing config has defaults.llm = openai-llm."""
+    config = {"defaults": {"llm": "openai-llm"}}
+    result = _select_llm_with_eof(config)
+    assert result == "openai"
+
+
+def test_select_llm_provider_defaults_to_ollama_for_local_llm():
+    """Picks ollama when existing config has defaults.llm = local-llm."""
+    config = {"defaults": {"llm": "local-llm"}}
+    result = _select_llm_with_eof(config)
+    assert result == "ollama"
+
+
+def test_select_llm_provider_none_config():
+    """Treats None config_yml as empty dict (defaults to openai)."""
+    result = _select_llm_with_eof(None)
+    assert result == "openai"
+
+
+# ---------------------------------------------------------------------------
+# select_memory_provider — test default resolution logic via EOFError path
+# ---------------------------------------------------------------------------
+
+def _select_memory_with_eof(config_yml):
+    with patch.object(_wizard, "Prompt") as mock_prompt:
+        mock_prompt.ask.side_effect = EOFError
+        return select_memory_provider(config_yml)
+
+
+def test_select_memory_provider_defaults_to_chronicle_when_no_config():
+    """Defaults to chronicle when config is empty."""
+    result = _select_memory_with_eof({})
+    assert result == "chronicle"
+
+
+def test_select_memory_provider_defaults_to_chronicle():
+    """Picks chronicle when existing config has memory.provider = chronicle."""
+    config = {"memory": {"provider": "chronicle"}}
+    result = _select_memory_with_eof(config)
+    assert result == "chronicle"
+
+
+def test_select_memory_provider_defaults_to_openmemory_mcp():
+    """Picks openmemory_mcp when existing config has memory.provider = openmemory_mcp."""
+    config = {"memory": {"provider": "openmemory_mcp"}}
+    result = _select_memory_with_eof(config)
+    assert result == "openmemory_mcp"
+
+
+def test_select_memory_provider_none_config():
+    """Treats None config_yml as empty dict (defaults to chronicle)."""
+    result = _select_memory_with_eof(None)
+    assert result == "chronicle"
+
+
+# ---------------------------------------------------------------------------
+# select_knowledge_graph — test default resolution logic via EOFError path
+# ---------------------------------------------------------------------------
+
+def _select_kg_with_eof(config_yml):
+    with patch.object(_wizard, "Confirm") as mock_confirm:
+        mock_confirm.ask.side_effect = EOFError
+        return select_knowledge_graph(config_yml)
+
+
+def test_select_knowledge_graph_defaults_to_true_when_no_config():
+    """Defaults to True (enabled) when config is empty."""
+    result = _select_kg_with_eof({})
+    assert result is True
+
+
+def test_select_knowledge_graph_respects_existing_true():
+    """Returns True when existing config has knowledge_graph.enabled = True."""
+    config = {"memory": {"knowledge_graph": {"enabled": True}}}
+    result = _select_kg_with_eof(config)
+    assert result is True
+
+
+def test_select_knowledge_graph_respects_existing_false():
+    """Returns False when existing config has knowledge_graph.enabled = False."""
+    config = {"memory": {"knowledge_graph": {"enabled": False}}}
+    result = _select_kg_with_eof(config)
+    assert result is False
+
+
+def test_select_knowledge_graph_none_config():
+    """Treats None config_yml as empty dict (defaults to True)."""
+    result = _select_kg_with_eof(None)
+    assert result is True
diff --git a/wizard.py b/wizard.py
index b04f028c..7d341690 100755
--- a/wizard.py
+++ b/wizard.py
@@ -25,6 +25,48 @@
 console = Console()
+
+
+def read_config_yml() -> dict:
+    """Read config/config.yml and return parsed dict, or empty dict if not found.
+
+    Used to load existing configuration as defaults for wizard prompts so that
+    re-runs default to previously configured values.
+    """
+    config_path = Path("config/config.yml")
+    if not config_path.exists():
+        return {}
+    with open(config_path, 'r') as f:
+        result = yaml.safe_load(f)
+    return result if result else {}
+
+
+def get_existing_stt_provider(config_yml: dict):
+    """Map config.yml defaults.stt value back to wizard provider name, or None."""
+    stt = config_yml.get("defaults", {}).get("stt", "")
+    mapping = {
+        "stt-deepgram": "deepgram",
+        "stt-deepgram-stream": "deepgram",
+        "stt-parakeet-batch": "parakeet",
+        "stt-vibevoice": "vibevoice",
+        "stt-qwen3-asr": "qwen3-asr",
+        "stt-smallest": "smallest",
+        "stt-smallest-stream": "smallest",
+    }
+    return mapping.get(stt)
+
+
+def get_existing_stream_provider(config_yml: dict):
+    """Map config.yml defaults.stt_stream value back to wizard streaming provider name, or None."""
+    stt_stream = config_yml.get("defaults", {}).get("stt_stream", "")
+    mapping = {
+        "stt-deepgram-stream": "deepgram",
+        "stt-smallest-stream": "smallest",
+        "stt-qwen3-asr": "qwen3-asr",
+        "stt-qwen3-asr-stream": "qwen3-asr",
+    }
+    return mapping.get(stt_stream)
+
+
 SERVICES = {
     'backend': {
         'advanced': {
@@ -115,8 +157,9 @@ def check_service_exists(service_name, service_config):
     return True, "OK"
 
 
-def select_services(transcription_provider=None):
+def select_services(transcription_provider=None, config_yml=None, memory_provider=None):
     """Let user select which services to setup"""
+    config_yml = config_yml or {}
     console.print("🚀 [bold cyan]Chronicle Service Setup[/bold cyan]")
     console.print("Select which services to configure:\n")
@@ -151,8 +194,19 @@ def select_services(transcription_provider=None):
             console.print(f" ⏸️ {service_config['description']} - [dim]{msg}[/dim]")
             continue
 
-        # Speaker recognition is recommended by default
-        default_enable = service_name == 'speaker-recognition'
+        # Determine smart default based on existing config
+        if service_name == 'speaker-recognition':
+            # Default to True if speaker-recognition .env exists and has a valid (non-placeholder) HF_TOKEN
+            speaker_env = 'extras/speaker-recognition/.env'
+            existing_hf = read_env_value(speaker_env, 'HF_TOKEN')
+            default_enable = bool(existing_hf and not is_placeholder(
+                existing_hf, 'your_huggingface_token_here', 'your-huggingface-token-here', 'hf_xxxxx'
+            ))
+        elif service_name == 'openmemory-mcp':
+            # Default to True if memory provider was selected as openmemory_mcp
+            default_enable = (memory_provider == "openmemory_mcp")
+        else:
+            default_enable = False
 
         try:
             enable_service = Confirm.ask(f" Setup {service_config['description']}?", default=default_enable)
@@ -189,7 +243,8 @@ def run_service_setup(service_name, selected_services, https_enabled=False, serv
                       obsidian_enabled=False, neo4j_password=None, hf_token=None,
                       transcription_provider='deepgram', admin_email=None, admin_password=None,
                       langfuse_public_key=None, langfuse_secret_key=None, langfuse_host=None,
-                      streaming_provider=None):
+                      streaming_provider=None, llm_provider=None, memory_provider=None,
+                      knowledge_graph_enabled=None):
     """Execute individual service setup script"""
     if service_name == 'advanced':
         service = SERVICES['backend'][service_name]
@@ -217,9 +272,25 @@ def run_service_setup(service_name, selected_services, https_enabled=False, serv
     if neo4j_password:
         cmd.extend(['--neo4j-password', neo4j_password])
 
-    # Add Obsidian configuration
+    # Always pass obsidian choice to avoid double-ask
     if obsidian_enabled:
         cmd.extend(['--enable-obsidian'])
+    else:
+        cmd.extend(['--no-obsidian'])
+
+    # Always pass knowledge graph choice to avoid double-ask
+    if knowledge_graph_enabled is True:
+        cmd.extend(['--enable-knowledge-graph'])
+    elif knowledge_graph_enabled is False:
+        cmd.extend(['--no-knowledge-graph'])
+
+    # Pass LLM provider choice
+    if llm_provider:
+        cmd.extend(['--llm-provider', llm_provider])
+
+    # Pass memory provider choice
+    if memory_provider:
+        cmd.extend(['--memory-provider', memory_provider])
 
     # Pass LangFuse keys from langfuse init or external config
     if langfuse_public_key and langfuse_secret_key:
@@ -527,11 +598,27 @@ def setup_config_file():
 STREAMING_CAPABLE = {"deepgram", "smallest", "qwen3-asr"}
 
 
-def select_transcription_provider():
+def select_transcription_provider(config_yml: dict = None):
     """Ask user which transcription provider they want (batch/primary)."""
+    config_yml = config_yml or {}
+    existing_provider = get_existing_stt_provider(config_yml)
+
+    provider_to_choice = {
+        "deepgram": "1", "parakeet": "2", "vibevoice": "3",
+        "qwen3-asr": "4", "smallest": "5", "none": "6",
+    }
+    choice_to_provider = {v: k for k, v in provider_to_choice.items()}
+    default_choice = provider_to_choice.get(existing_provider, "1")
+
     console.print("\n🎤 [bold cyan]Transcription Provider[/bold cyan]")
     console.print("Choose your speech-to-text provider (used for [bold]batch[/bold]/high-quality transcription):")
     console.print("[dim]If it also supports streaming, it will be used for real-time too by default.[/dim]")
+    if existing_provider:
+        provider_labels = {
+            "deepgram": "Deepgram", "parakeet": "Parakeet ASR", "vibevoice": "VibeVoice ASR",
+            "qwen3-asr": "Qwen3-ASR", "smallest": "Smallest.ai Pulse",
+        }
+        console.print(f"[blue][INFO][/blue] Current: {provider_labels.get(existing_provider, existing_provider)}")
     console.print()
 
     choices = {
@@ -544,32 +631,22 @@
     }
 
     for key, desc in choices.items():
-        console.print(f" {key}) {desc}")
+        marker = " [dim](current)[/dim]" if key == default_choice else ""
+        console.print(f" {key}) {desc}{marker}")
     console.print()
 
     while True:
         try:
-            choice = Prompt.ask("Enter choice", default="1")
+            choice = Prompt.ask("Enter choice", default=default_choice)
             if choice in choices:
-                if choice == "1":
-                    return "deepgram"
-                elif choice == "2":
-                    return "parakeet"
-                elif choice == "3":
-                    return "vibevoice"
-                elif choice == "4":
-                    return "qwen3-asr"
-                elif choice == "5":
-                    return "smallest"
-                elif choice == "6":
-                    return "none"
+                return choice_to_provider[choice]
             console.print(f"[red]Invalid choice. Please select from {list(choices.keys())}[/red]")
         except EOFError:
-            console.print("Using default: Deepgram")
-            return "deepgram"
+            console.print(f"Using default: {choices.get(default_choice, 'Deepgram')}")
+            return choice_to_provider.get(default_choice, "deepgram")
 
 
-def select_streaming_provider(batch_provider):
+def select_streaming_provider(batch_provider, config_yml: dict = None):
     """Ask if user wants a different provider for real-time streaming.
 
     If the batch provider supports streaming, offer to use the same (saves a step).
@@ -578,15 +655,20 @@
     Returns:
         Streaming provider name if different from batch, or None (same / skipped).
     """
+    config_yml = config_yml or {}
     if batch_provider in ("none", None):
         return None
 
+    existing_stream = get_existing_stream_provider(config_yml)
+
     if batch_provider in STREAMING_CAPABLE:
         # Batch provider can already stream — just confirm
+        # Default to "use different" if a different streaming provider was previously configured
+        has_different_stream = bool(existing_stream and existing_stream != batch_provider)
        console.print(f"\n🔊 [bold cyan]Streaming[/bold cyan]")
         console.print(f"{batch_provider} supports both batch and streaming.")
         try:
-            use_different = Confirm.ask("Use a different provider for real-time streaming?", default=False)
+            use_different = Confirm.ask("Use a different provider for real-time streaming?", default=has_different_stream)
         except EOFError:
             return None
         if not use_different:
@@ -615,13 +697,22 @@
     streaming_choices[skip_key] = "Skip (no real-time streaming)"
     provider_map[skip_key] = None
 
+    # Pre-select the default based on existing config
+    default_stream_choice = "1"
+    if existing_stream and existing_stream != batch_provider:
+        for k, v in provider_map.items():
+            if v == existing_stream:
+                default_stream_choice = k
+                break
+
     for key, desc in streaming_choices.items():
-        console.print(f" {key}) {desc}")
+        marker = " [dim](current)[/dim]" if key == default_stream_choice else ""
+        console.print(f" {key}) {desc}{marker}")
     console.print()
 
     while True:
         try:
-            choice = Prompt.ask("Enter choice", default="1")
+            choice = Prompt.ask("Enter choice", default=default_stream_choice)
             if choice in streaming_choices:
                 result = provider_map[choice]
                 if result:
@@ -713,6 +804,103 @@ def setup_langfuse_choice():
     }
 
 
+def select_llm_provider(config_yml: dict = None) -> str:
+    """Ask user which LLM provider to use for memory extraction.
+
+    Returns:
+        "openai", "ollama", or "none"
+    """
+    config_yml = config_yml or {}
+    existing_llm = config_yml.get("defaults", {}).get("llm", "")
+    llm_to_choice = {"openai-llm": "1", "local-llm": "2"}
+    default_choice = llm_to_choice.get(existing_llm, "1")
+
+    console.print("\n🤖 [bold cyan]LLM Provider[/bold cyan]")
+    console.print("Choose your language model provider for memory extraction and analysis:")
+    console.print()
+
+    choices = {
+        "1": "OpenAI (GPT-4o-mini, requires API key)",
+        "2": "Ollama (local models, runs on your machine)",
+        "3": "None (skip memory extraction)",
+    }
+
+    for key, desc in choices.items():
+        marker = " [dim](current)[/dim]" if key == default_choice else ""
+        console.print(f" {key}) {desc}{marker}")
+    console.print()
+
+    while True:
+        try:
+            choice = Prompt.ask("Enter choice", default=default_choice)
+            if choice in choices:
+                return {"1": "openai", "2": "ollama", "3": "none"}[choice]
+            console.print(f"[red]Invalid choice. Please select from {list(choices.keys())}[/red]")
+        except EOFError:
+            console.print(f"Using default: {choices.get(default_choice, 'OpenAI')}")
+            return {"1": "openai", "2": "ollama", "3": "none"}.get(default_choice, "openai")
+
+
+def select_memory_provider(config_yml: dict = None) -> str:
+    """Ask user which memory storage backend to use.
+
+    This is separate from the 'Setup OpenMemory MCP server?' service question.
+    That question is about running the extra service; this is about the backend provider.
+
+    Returns:
+        "chronicle" or "openmemory_mcp"
+    """
+    config_yml = config_yml or {}
+    existing_provider = config_yml.get("memory", {}).get("provider", "chronicle")
+    default_choice = "2" if existing_provider == "openmemory_mcp" else "1"
+
+    console.print("\n🧠 [bold cyan]Memory Storage Backend[/bold cyan]")
+    console.print("Choose where your memories and conversation facts are stored:")
+    console.print()
+
+    choices = {
+        "1": "Chronicle Native (Qdrant vector database, self-hosted)",
+        "2": "OpenMemory MCP (cross-client compatible, requires openmemory-mcp service)",
+    }
+
+    for key, desc in choices.items():
+        marker = " [dim](current)[/dim]" if key == default_choice else ""
+        console.print(f" {key}) {desc}{marker}")
+    console.print()
+
+    while True:
+        try:
+            choice = Prompt.ask("Enter choice", default=default_choice)
+            if choice in choices:
+                return {"1": "chronicle", "2": "openmemory_mcp"}[choice]
+            console.print(f"[red]Invalid choice. Please select from {list(choices.keys())}[/red]")
+        except EOFError:
+            return {"1": "chronicle", "2": "openmemory_mcp"}.get(default_choice, "chronicle")
+
+
+def select_knowledge_graph(config_yml: dict = None) -> bool:
+    """Ask user if Knowledge Graph should be enabled.
+
+    Returns:
+        True if Knowledge Graph should be enabled, False otherwise.
+    """
+    config_yml = config_yml or {}
+    existing_enabled = config_yml.get("memory", {}).get("knowledge_graph", {}).get("enabled", True)
+
+    console.print("\n🕸️ [bold cyan]Knowledge Graph[/bold cyan]")
+    console.print("Extracts people, places, organizations, events, and tasks from conversations")
+    console.print("Uses Neo4j (included in the stack)")
+    console.print()
+
+    try:
+        enabled = Confirm.ask("Enable Knowledge Graph?", default=existing_enabled)
+    except EOFError:
+        console.print(f"Using default: {'Yes' if existing_enabled else 'No'}")
+        enabled = existing_enabled
+
+    return enabled
+
+
 def main():
     """Main orchestration logic"""
     console.print("🎉 [bold green]Welcome to Chronicle![/bold green]\n")
@@ -729,14 +917,23 @@ def main():
     # Show what's available
     show_service_status()
 
+    # Read existing config.yml once — used as defaults for ALL wizard questions below
+    config_yml = read_config_yml()
+
     # Ask about transcription provider FIRST (determines which services are needed)
-    transcription_provider = select_transcription_provider()
+    transcription_provider = select_transcription_provider(config_yml)
 
     # Ask about streaming provider (if batch provider doesn't stream, or user wants a different one)
-    streaming_provider = select_streaming_provider(transcription_provider)
+    streaming_provider = select_streaming_provider(transcription_provider, config_yml)
+
+    # LLM Provider selection (asked once here, passed to init.py — avoids double-ask)
+    llm_provider = select_llm_provider(config_yml)
+
+    # Memory Provider selection (asked once here, passed to init.py — avoids double-ask)
+    memory_provider = select_memory_provider(config_yml)
 
     # Service Selection (pass transcription_provider so we skip asking about ASR when already chosen)
-    selected_services = select_services(transcription_provider)
+    selected_services = select_services(transcription_provider, config_yml, memory_provider)
 
     # Auto-add asr-services if any local ASR was chosen (batch or streaming)
     local_asr_providers = ("parakeet", "vibevoice", "qwen3-asr")
@@ -746,6 +943,13 @@ def main():
         console.print(f"[blue][INFO][/blue] Auto-adding ASR services for {reason} transcription")
         selected_services.append('asr-services')
 
+    # Auto-add openmemory-mcp service if openmemory_mcp was selected as memory provider
+    if memory_provider == "openmemory_mcp" and 'openmemory-mcp' not in selected_services:
+        exists, _ = check_service_exists('openmemory-mcp', SERVICES['extras']['openmemory-mcp'])
+        if exists:
+            console.print("[blue][INFO][/blue] Memory provider is OpenMemory MCP — auto-adding openmemory-mcp service")
+            selected_services.append('openmemory-mcp')
+
     if not selected_services:
         console.print("\n[yellow]No services selected. Exiting.[/yellow]")
         return
@@ -770,11 +974,15 @@ def main():
     console.print("\n🔒 [bold cyan]HTTPS Configuration[/bold cyan]")
     console.print("HTTPS enables microphone access in browsers and secure connections")
 
+    # Default to existing HTTPS_ENABLED setting
+    existing_https = read_env_value('backends/advanced/.env', 'HTTPS_ENABLED')
+    default_https = existing_https == "true"
+
     try:
-        https_enabled = Confirm.ask("Enable HTTPS for selected services?", default=False)
+        https_enabled = Confirm.ask("Enable HTTPS for selected services?", default=default_https)
     except EOFError:
-        console.print("Using default: No")
-        https_enabled = False
+        console.print(f"Using default: {'Yes' if default_https else 'No'}")
+        https_enabled = default_https
 
     if https_enabled:
         # Try to auto-detect Tailscale address
@@ -816,19 +1024,22 @@
     # Neo4j Configuration (always required - used by Knowledge Graph)
     neo4j_password = None
     obsidian_enabled = False
+    knowledge_graph_enabled = None
     if 'advanced' in selected_services:
         console.print("\n🗄️ [bold cyan]Neo4j Configuration[/bold cyan]")
         console.print("Neo4j is used for Knowledge Graph (entity/relationship extraction from conversations)")
         console.print()
 
-        # Always prompt for Neo4j password (masked input)
-        try:
-            console.print("Neo4j password (min 8 chars) [leave empty for default: neo4jpassword]")
-            neo4j_password = prompt_password("Neo4j password", min_length=8)
-        except (EOFError, KeyboardInterrupt):
-            neo4j_password = "neo4jpassword"
-            console.print("Using default password")
+        # Read existing Neo4j password and use as default (masked prompt)
+        existing_neo4j_pw = read_env_value('backends/advanced/.env', 'NEO4J_PASSWORD')
+        neo4j_password = prompt_with_existing_masked(
+            prompt_text="Neo4j password (min 8 chars)",
+            existing_value=existing_neo4j_pw,
+            placeholders=['neo4jpassword', 'your_neo4j_password', 'your-neo4j-password'],
+            is_password=True,
+            default="neo4jpassword"
+        )
 
         if not neo4j_password:
             neo4j_password = "neo4jpassword"
@@ -839,15 +1050,20 @@
         console.print("Enable graph-based knowledge management for Obsidian vault notes")
         console.print()
 
+        # Load existing obsidian enabled state from config.yml as default
+        existing_obsidian = config_yml.get("memory", {}).get("obsidian", {}).get("enabled", False)
         try:
-            obsidian_enabled = Confirm.ask("Enable Obsidian integration?", default=False)
+            obsidian_enabled = Confirm.ask("Enable Obsidian integration?", default=existing_obsidian)
         except EOFError:
-            console.print("Using default: No")
-            obsidian_enabled = False
+            console.print(f"Using default: {'Yes' if existing_obsidian else 'No'}")
+            obsidian_enabled = existing_obsidian
 
         if obsidian_enabled:
             console.print("[green]✅[/green] Obsidian integration will be configured")
 
+        # Knowledge Graph configuration (asked here once, passed to init.py)
+        knowledge_graph_enabled = select_knowledge_graph(config_yml)
+
     # Pure Delegation - Run Each Service Setup
     console.print(f"\n📋 [bold]Setting up {len(selected_services)} services...[/bold]")
 
@@ -882,7 +1098,9 @@
                              obsidian_enabled, neo4j_password, hf_token, transcription_provider,
                              admin_email=wizard_admin_email, admin_password=wizard_admin_password,
                              langfuse_public_key=langfuse_public_key, langfuse_secret_key=langfuse_secret_key,
-                             langfuse_host=langfuse_host, streaming_provider=streaming_provider):
+                             langfuse_host=langfuse_host, streaming_provider=streaming_provider,
+                             llm_provider=llm_provider, memory_provider=memory_provider,
+                             knowledge_graph_enabled=knowledge_graph_enabled):
             success_count += 1
 
     # After local langfuse setup, read generated API keys for backend
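
Reviewer note: the new test module imports `wizard.py` by file path (via `_load_wizard`) rather than as a package, because the repo root is not importable. A minimal, self-contained sketch of that `importlib` pattern; the temp file below is a hypothetical stand-in for `WIZARD_PATH`, not the actual path:

```python
import importlib.util
import os
import tempfile

# Write a throwaway module to disk; it plays the role of wizard.py here.
tmp = tempfile.NamedTemporaryFile("w", suffix=".py", delete=False)
tmp.write("ANSWER = 42\n")
tmp.close()

# Same three-step dance as _load_wizard(): spec -> module -> exec.
spec = importlib.util.spec_from_file_location("wizard_like", tmp.name)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)  # executes the module body exactly once

print(mod.ANSWER)  # 42
os.unlink(tmp.name)
```

Loading by path keeps the tests working no matter where pytest is invoked from, at the cost of bypassing the normal import cache.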
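
The EOFError-injection helpers (`_select_llm_with_eof` and friends) patch the prompt object so every `.ask` raises `EOFError`, which drives the wizard down its non-interactive fallback branch. A reduced sketch of the technique, using a stand-in `Prompt` class instead of the Rich one:

```python
from unittest.mock import patch

class Prompt:
    """Stand-in for rich.prompt.Prompt used by the real wizard."""
    @staticmethod
    def ask(message, default="1"):
        raise NotImplementedError("no real terminal in this sketch")

def pick_llm(default="openai"):
    """Mimics the wizard's try/except EOFError fallback pattern."""
    try:
        return Prompt.ask("LLM provider?", default=default)
    except EOFError:
        # Non-interactive run (e.g. piped stdin): fall back to the default.
        return default

# Make .ask raise EOFError, exactly as the new tests do with patch.object.
with patch.object(Prompt, "ask", side_effect=EOFError):
    print(pick_llm())  # openai
```

Because the mock raises before any terminal I/O happens, these tests stay fast and runnable in CI where stdin is closed.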
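
The recurring pattern in the new selectors is: read the stored provider id from `config.yml`, map it to a menu choice, and use that choice as the prompt default so re-runs pre-select the previous answer. A compact sketch of that resolution step, with tables mirroring `select_llm_provider` (illustrative names, not the patch's exact code):

```python
# Mapping tables mirroring select_llm_provider's wizard menu.
LLM_TO_CHOICE = {"openai-llm": "1", "local-llm": "2"}
CHOICE_TO_PROVIDER = {"1": "openai", "2": "ollama", "3": "none"}

def resolve_default_choice(config_yml=None):
    """Return the menu choice matching the existing config, else '1' (OpenAI)."""
    config_yml = config_yml or {}  # tolerate None, like the wizard helpers
    existing = config_yml.get("defaults", {}).get("llm", "")
    return LLM_TO_CHOICE.get(existing, "1")

print(resolve_default_choice({"defaults": {"llm": "local-llm"}}))  # 2
print(resolve_default_choice({}))  # 1
print(CHOICE_TO_PROVIDER[resolve_default_choice(None)])  # openai
```

Keeping the forward and reverse maps as data (rather than if/elif chains) is what lets the same tables serve both the "(current)" marker and the EOFError fallback return value.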