From 8f88790b20928805f0ee3714c06f750257b200d4 Mon Sep 17 00:00:00 2001
From: Arthur Bied-Charreton <136271426+winstonallo@users.noreply.github.com>
Date: Thu, 5 Jun 2025 08:01:08 +0200
Subject: [PATCH 1/3] Update README.md

---
 README.md | 91 ++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 64 insertions(+), 27 deletions(-)

diff --git a/README.md b/README.md
index e24e8f1..310f90a 100644
--- a/README.md
+++ b/README.md
@@ -1,33 +1,70 @@
 # taskmaster
-Job control daemon inspired by [supervisord](https://supervisord.org/index.html).
+A job control daemon, inspired by [supervisord](https://supervisord.org/index.html). Taskmaster manages background processes with configurable restart policies, healthchecks, and real-time monitoring.
 
-## Goal
-The idea is to build a daemon, configurable to manage background jobs reliably and with customizable options.
+## Overview
+Taskmaster consists of three main components:
+* **taskmaster** - The backend managing processes
+* **taskshell** - Interactive shell for sending commands to the backend
+* **taskboard** - Real-time TUI for monitoring process status
 
-Its key components are:
-- `taskmaster`, daemon managing the jobs.
-- `taskshell`, shell communicating commands to the daemon via UNIX sockets.
+The daemon reads TOML configuration files to spawn and monitor processes, automatically restarting them based on defined policies.
+Communication happens through Unix domain sockets using an adapted version of JSON-RPC 2.0.
 
-## Challenges
-### State Management
-The first challenge was managing the state of the processes efficiently.
The possible states of processes can be broken down to the following:
-```rust
-pub enum ProcessState {
-    Idle,
-    // Started attempt at <...>
-    HealthCheck(time::Instant),
-    Healthy,
-    // Previous state: <...>
-    Failed(Box),
-    // Retry at <...>
-    WaitingForRetry(time::Instant),
-    Completed,
-    Stopped,
-}
-```
----
-The states and their transition triggers can be represented as follows:
+## Features
+* **Process Management** - Start, stop, restart processes with configurable retry policies
+* **Health Checks** - Determine whether a process is healthy based on uptime, or a configured command (like in docker compose)
+* **Real-time Communication** - Reliable Inter Process Communication
+* **Hot-Reload** - Update process configurations without restarting the daemon
+* **Process Attachment** - Stream stdout/stderr from running processes in real-time
+* **Privilege Deescalation** - Deescalate into a different user when spawning processes
+* **JSON Logs** - taskmaster logs are easy to look up by process name, event type, log level, ...
+## Architecture
+The core of taskmaster is a finite state machine that models process lifecycles. Each process moves through well-defined states with clear transition rules.
 ![alt text](assets/state_diagram.png)
----
-This lays out a rough process for decision making during daemon execution. We can easily define those states and their transitioning rules in code.
+
+### Dual State Processing
+The state machine operates on two dimensions:
+1. **Monitor States** - React to external events (process exits, timeouts, health check results)
+2. **Desired States** - Handle user commands and policy decisions
+This separation allows to handle complex scenarios like a user requesting a restart while a process is failing health checks by applying the same rules as for monitoring, making the state machine fully self-contained.
+This ensures we processes cannot enter invalid states and proviedes predictable behavior.
+
+## Example Configuration
+```toml
+[processes.nginx]
+cmd = "/usr/sbin/nginx"
+user = "www" # Deescalate into www user
+workingdir = "/var/www"
+autostart = true # Spawn process automatically when taskmaster is started
+autorestart = "on-failure[:5]" # Retry 5 times before giving up
+stdout = "/var/log/nginx.stdout"
+stderr = "/var/log/nginx.stderr"
+
+[processes.nginx.healthcheck]
+cmd = "/usr/bin/curl"
+args = ["http://localhost/health"]
+timeout = 5 # Wait 5 seconds for one healthcheck before considering it failed
+retries = 3 # Retry health check 3 times
+```
+## Usage
+Start the daemon:
+```bash
+$ cargo ts engine start config.toml
+```
+Send commands to the daemon using the taskshell, either interactively:
+```bash
+$ cargo ts
+taskshell> status nginx
+nginx: healthcheck since 2 seconds
+taskshell> stop nginx
+stopping nginx
+```
+or with shell commands:
+```bash
+$ cargo ts status nginx
+nginx: stopping since 3 seconds
+$ cargo ts restart nginx
+restarting nginx
+```
+For a full explanation of the available commands, run `cargo ts help`.

From afc47b91ea258755be42005c459ce00a9e8ed6cf Mon Sep 17 00:00:00 2001
From: Arthur Bied-Charreton <136271426+winstonallo@users.noreply.github.com>
Date: Thu, 5 Jun 2025 08:02:21 +0200
Subject: [PATCH 2/3] Fix spelling mistake in README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 310f90a..96cb54e 100644
--- a/README.md
+++ b/README.md
@@ -28,7 +28,7 @@ The state machine operates on two dimensions:
 1. **Monitor States** - React to external events (process exits, timeouts, health check results)
 2. **Desired States** - Handle user commands and policy decisions
 This separation allows to handle complex scenarios like a user requesting a restart while a process is failing health checks by applying the same rules as for monitoring, making the state machine fully self-contained.
-This ensures we processes cannot enter invalid states and proviedes predictable behavior.
+This ensures we processes cannot enter invalid states and provides predictable behavior.
 
 ## Example Configuration
 ```toml

From 795b60141310e44f7bae5d6a835803d2d1c658b3 Mon Sep 17 00:00:00 2001
From: Arthur Bied-Charreton <136271426+winstonallo@users.noreply.github.com>
Date: Thu, 5 Jun 2025 08:04:12 +0200
Subject: [PATCH 3/3] Update README.md

---
 README.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 96cb54e..94d58f6 100644
--- a/README.md
+++ b/README.md
@@ -26,9 +26,10 @@ The core of taskmaster is a finite state machine that models process lifecycles.
 ### Dual State Processing
 The state machine operates on two dimensions:
 1. **Monitor States** - React to external events (process exits, timeouts, health check results)
-2. **Desired States** - Handle user commands and policy decisions
-This separation allows to handle complex scenarios like a user requesting a restart while a process is failing health checks by applying the same rules as for monitoring, making the state machine fully self-contained.
-This ensures we processes cannot enter invalid states and provides predictable behavior.
+2. **Desired States** - Handle user commands and policy decisions.
+
+The separation allows handling complex scenarios, like a user requesting a restart while a process is failing health checks, by applying the same rules as for monitoring, making the state machine fully self-contained.
+This makes sure processes cannot enter invalid states and provides predictable behavior.
 
 ## Example Configuration
 ```toml