24 changes: 17 additions & 7 deletions CLAUDE.md
@@ -324,9 +324,10 @@ devs start eamonn --env DEBUG=false --env NEW_VAR=test
**Container Pool**:
- `CONTAINER_POOL`: Comma-separated container names for Claude tasks (default: eamonn,harry,darren)
- `CI_CONTAINER_POOL`: Optional comma-separated container names for CI/test tasks only. If not specified, CI tasks use the main `CONTAINER_POOL`. If specified, the main `CONTAINER_POOL` is used only for Claude tasks, and this pool is used exclusively for tests. The pools can overlap (share container names) if desired.
- `CONTAINER_TIMEOUT_MINUTES`: Idle timeout for containers in minutes (default: 60)
- `CONTAINER_MAX_AGE_HOURS`: Maximum container age in hours - containers older than this are cleaned up when idle (default: 10)
- `CLEANUP_CHECK_INTERVAL_SECONDS`: How often to check for idle/old containers (default: 60)
- `STOP_CONTAINER_AFTER_TASK`: Stop container after each task completes (default: true). This ensures only one running container per dev name at any time, reducing RAM usage when multiple repos are in play.
- `CONTAINER_TIMEOUT_MINUTES`: Idle timeout for containers in minutes (default: 60). Only applies when `STOP_CONTAINER_AFTER_TASK` is false.
- `CONTAINER_MAX_AGE_HOURS`: Maximum container age in hours - containers older than this are cleaned up when idle (default: 10). Only applies when `STOP_CONTAINER_AFTER_TASK` is false.
- `CLEANUP_CHECK_INTERVAL_SECONDS`: How often to check for idle/old containers (default: 60). Only applies when `STOP_CONTAINER_AFTER_TASK` is false.
- `MAX_CONCURRENT_TASKS`: Maximum parallel tasks (default: 3)
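
The pool fallback described above can be sketched as follows. This is a minimal illustration using only the standard library, not the project's actual settings code; the helper names `parse_pool` and `resolve_pools` are hypothetical.

```python
from typing import List, Mapping, Optional, Tuple

DEFAULT_POOL = "eamonn,harry,darren"  # default taken from the documentation above


def parse_pool(raw: Optional[str]) -> List[str]:
    """Split a comma-separated pool string, dropping blanks and whitespace."""
    if not raw:
        return []
    return [name.strip() for name in raw.split(",") if name.strip()]


def resolve_pools(env: Mapping[str, str]) -> Tuple[List[str], List[str]]:
    """Return (claude_pool, ci_pool).

    Hypothetical helper: CI tasks fall back to the main pool when
    CI_CONTAINER_POOL is unset or empty; the two pools may overlap.
    """
    claude_pool = parse_pool(env.get("CONTAINER_POOL", DEFAULT_POOL))
    ci_pool = parse_pool(env.get("CI_CONTAINER_POOL")) or claude_pool
    return claude_pool, ci_pool
```

Passing `os.environ` as `env` would resolve the pools from the running process; passing a plain dict keeps the logic easy to test.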

**Access Control**:
@@ -487,14 +488,22 @@ devs-webhook-worker --container-name eamonn --task-json-stdin < task.json

### Container Lifecycle Management

Containers are automatically managed with cleanup based on idle time and age:
By default, containers are stopped immediately after each task completes (`STOP_CONTAINER_AFTER_TASK=true`). This ensures only one running container per dev name (queue) at any time, significantly reducing RAM usage when multiple repositories are in play.

1. **Idle Cleanup**: Containers idle for longer than `CONTAINER_TIMEOUT_MINUTES` (default: 60 min) are stopped and cleaned up
2. **Age-Based Cleanup**: Containers older than `CONTAINER_MAX_AGE_HOURS` (default: 10 hours) are cleaned up when they become idle
**Default Behavior (stop after task)**:
- Container is started when a task begins
- Container is stopped and cleaned up immediately after the task completes
- Next task on the same queue starts a fresh container
- Only one running container per dev name at any time

**Legacy Behavior** (`STOP_CONTAINER_AFTER_TASK=false`):
Containers remain running and are cleaned up based on idle time and age:
1. **Idle Cleanup**: Containers idle for longer than `CONTAINER_TIMEOUT_MINUTES` (default: 60 min) are stopped
2. **Age-Based Cleanup**: Containers older than `CONTAINER_MAX_AGE_HOURS` (default: 10 hours) are cleaned up when idle
3. **Graceful Shutdown**: On server shutdown (SIGTERM/SIGINT), all running containers are cleaned up
4. **Manual Stop**: Admin can force-stop containers via `POST /container/{name}/stop` endpoint

**Key behaviors**:
**Legacy key behaviors** (only when `STOP_CONTAINER_AFTER_TASK=false`):
- Containers currently processing tasks are never interrupted by age-based cleanup
- Age-based cleanup only triggers when a container is idle (not actively processing)
- The cleanup check runs every `CLEANUP_CHECK_INTERVAL_SECONDS` (default: 60 seconds)
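
The legacy cleanup rules above amount to a small decision function. The sketch below is an assumption about how such a check could look, not the webhook's real implementation; `ContainerState` and `should_clean_up` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ContainerState:
    started_at: datetime  # when the container was started
    last_used: datetime   # when the last task completed
    busy: bool            # True while a task is being processed


def should_clean_up(
    state: ContainerState,
    now: datetime,
    timeout_minutes: int = 60,  # CONTAINER_TIMEOUT_MINUTES default
    max_age_hours: int = 10,    # CONTAINER_MAX_AGE_HOURS default
) -> bool:
    """Decide whether an idle/old container should be cleaned up."""
    if state.busy:
        # Containers processing a task are never interrupted.
        return False
    idle_too_long = now - state.last_used > timedelta(minutes=timeout_minutes)
    too_old = now - state.started_at > timedelta(hours=max_age_hours)
    return idle_too_long or too_old
```

A background worker would call a check like this once per `CLEANUP_CHECK_INTERVAL_SECONDS` for each tracked container.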
@@ -504,6 +513,7 @@ Containers are automatically managed with cleanup based on idle time and age:
- `last_used`: Last task completion time
- `age_hours`: How long container has been running
- `idle_minutes`: How long since last task completed
- `stop_container_after_task`: Whether containers are stopped after each task

**Burst Mode Considerations**:
In SQS burst mode (`--burst`), the background cleanup worker is disabled since:
42 changes: 24 additions & 18 deletions packages/common/devs_common/core/container.py
@@ -382,47 +382,53 @@ def ensure_container_running(

raise ContainerError(f"Failed to ensure container running for {dev_name}: {e}")

def stop_container(self, dev_name: str) -> bool:
"""Stop and remove a container by labels (more reliable than names).
def stop_container(self, dev_name: str, remove: bool = True) -> bool:
"""Stop a container by labels, optionally removing it.

Args:
dev_name: Development environment name

remove: If True (default), also remove the container after stopping.
If False, only stop the container (it can be restarted later).

Returns:
True if container was stopped/removed
True if container was stopped (and removed if requested)
"""
project_labels = self._get_project_labels(dev_name)

try:
console.print(f" 🔍 Looking for containers with labels: {project_labels}")
existing_containers = self.docker.find_containers_by_labels(project_labels)
console.print(f" 📋 Found {len(existing_containers)} containers")

if existing_containers:
for container_info in existing_containers:
container_name = container_info['name']
container_status = container_info['status']

console.print(f" 🛑 Stopping container: {container_name} (status: {container_status})")
try:
stop_result = self.docker.stop_container(container_name)
console.print(f" 📋 Stop result: {stop_result}")
except DockerError as stop_e:
console.print(f" ⚠️ Stop failed for {container_name}: {stop_e}")

console.print(f" 🗑️ Removing container: {container_name}")
try:
remove_result = self.docker.remove_container(container_name)
console.print(f" 📋 Remove result: {remove_result}")
except DockerError as remove_e:
console.print(f" ⚠️ Remove failed for {container_name}: {remove_e}")

console.print(f" ✅ Stopped and removed: {dev_name}")

if remove:
console.print(f" 🗑️ Removing container: {container_name}")
try:
remove_result = self.docker.remove_container(container_name)
console.print(f" 📋 Remove result: {remove_result}")
except DockerError as remove_e:
console.print(f" ⚠️ Remove failed for {container_name}: {remove_e}")

if remove:
console.print(f" ✅ Stopped and removed: {dev_name}")
else:
console.print(f" ✅ Stopped: {dev_name}")
return True
else:
console.print(f" ⚠️ No containers found for {dev_name}")
return False

except DockerError as e:
console.print(f" ❌ Error stopping {dev_name}: {e}")
return False
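
The effect of the new `remove` flag can be illustrated in isolation. The sketch below is not the project's code: `FakeDocker` is a hypothetical stand-in that records calls instead of talking to Docker, `stop_by_labels` is a simplified version of the control flow in the diff (no logging or error handling), and the label key is made up.

```python
from typing import Dict, List


class FakeDocker:
    """Hypothetical stand-in for the Docker wrapper; records calls only."""

    def __init__(self, containers: List[Dict[str, str]]):
        self.containers = containers
        self.calls: List[str] = []

    def find_containers_by_labels(self, labels: Dict[str, str]) -> List[Dict[str, str]]:
        return list(self.containers)

    def stop_container(self, name: str) -> bool:
        self.calls.append(f"stop {name}")
        return True

    def remove_container(self, name: str) -> bool:
        self.calls.append(f"remove {name}")
        return True


def stop_by_labels(docker: FakeDocker, labels: Dict[str, str], remove: bool = True) -> bool:
    """Stop every matching container; remove it too only when asked."""
    found = docker.find_containers_by_labels(labels)
    if not found:
        return False
    for info in found:
        docker.stop_container(info["name"])
        if remove:
            docker.remove_container(info["name"])
    return True


docker = FakeDocker([{"name": "dev-eamonn", "status": "running"}])
stop_by_labels(docker, {"example.dev": "eamonn"}, remove=False)
print(docker.calls)  # only the stop call is recorded
```

With `remove=False` the container is left in a stopped state, so a later task can restart it instead of recreating it.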
42 changes: 33 additions & 9 deletions packages/common/devs_common/core/workspace.py
@@ -13,7 +13,7 @@
safe_remove_directory,
is_directory_empty
)
from ..utils.git_utils import get_tracked_files, is_devcontainer_gitignored
from ..utils.git_utils import get_tracked_files, is_devcontainer_gitignored, reset_git_state
from ..utils.devcontainer_template import get_template_dir
from ..utils.console import get_console
from .project import Project
@@ -417,23 +417,29 @@ def cleanup_unused_workspaces_all_projects(self, docker_client) -> int:
console.print(f"❌ Error during cross-project workspace cleanup: {e}")
return 0

def sync_workspace(self, dev_name: str, files_to_sync: Optional[List[str]] = None) -> bool:
def sync_workspace(self, dev_name: str, files_to_sync: Optional[List[str]] = None, clean_untracked: bool = True) -> bool:
"""Sync specific files from project to workspace.

Args:
dev_name: Development environment name
files_to_sync: List of files to sync, or None for git-tracked files

clean_untracked: If True (default), remove untracked files/dirs from workspace
before syncing. Important when reusing workspaces between tasks.

Returns:
True if sync was successful
"""
workspace_dir = self.get_workspace_dir(dev_name)

if not self.workspace_exists(dev_name):
console.print(f" ❌ Workspace for {dev_name} does not exist")
return False

try:
# Clean up workspace git state before syncing (important for reused workspaces)
if clean_untracked and self.project.info.is_git_repo:
self._reset_workspace_git_state(workspace_dir)

if files_to_sync is None:
# Sync git-tracked files
if self.project.info.is_git_repo:
@@ -456,10 +462,28 @@ def sync_workspace(self, dev_name: str, files_to_sync: Optional[List[str]] = Non
file_list=file_paths,
preserve_permissions=True
)

console.print(f" ✅ Synced workspace for {dev_name}")
return True

except Exception as e:
console.print(f" ❌ Failed to sync workspace for {dev_name}: {e}")
return False
return False

def _reset_workspace_git_state(self, workspace_dir: Path) -> None:
"""Reset workspace git state to clean state.

Removes untracked files and resets any uncommitted changes.
Important for reusing workspaces between tasks.

Args:
workspace_dir: Path to workspace directory
"""
git_dir = workspace_dir / ".git"
if not git_dir.exists():
return

if reset_git_state(workspace_dir):
console.print(f" 🧹 Reset workspace git state")
else:
console.print(f" ⚠️ Could not reset workspace git state")
36 changes: 36 additions & 0 deletions packages/common/devs_common/utils/git_utils.py
@@ -77,6 +77,42 @@ def get_git_root(directory: Path) -> Optional[Path]:
return None


def reset_git_state(repo_dir: Path, checkout_branch: Optional[str] = None) -> bool:
"""Reset git repository to clean state.

Discards uncommitted changes and removes untracked files.
Optionally checks out a specific branch.

Args:
repo_dir: Repository directory path
checkout_branch: Optional branch to checkout (with -f to discard changes)

Returns:
True if reset was successful, False otherwise
"""
try:
repo = Repo(repo_dir)

if checkout_branch:
# Force checkout branch (discards local modifications)
try:
repo.git.checkout('-f', checkout_branch)
except GitCommandError:
# Branch might not exist, that's OK
return False
else:
# Reset to HEAD (discard uncommitted changes to tracked files)
repo.git.reset('--hard', 'HEAD')

# Remove untracked files and directories (but not ignored files)
repo.git.clean('-fd')

return True

except (InvalidGitRepositoryError, GitCommandError) as e:
return False
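
To see what `reset --hard HEAD` followed by `clean -fd` does to a reused workspace, the sketch below runs the same two git commands in a throwaway repository. It assumes a `git` binary on `PATH` and calls it through `subprocess` rather than GitPython; the helper names and file names are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside repo and return its stdout."""
    result = subprocess.run(
        ["git", *args], cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout


def demo_reset() -> dict:
    """Dirty a temporary repo, reset it, and report what survived."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init", "-q")
        (repo / ".gitignore").write_text("cache/\n")
        (repo / "tracked.txt").write_text("original\n")
        git(repo, "add", ".")
        git(repo, "-c", "user.name=demo", "-c", "user.email=demo@example.com",
            "-c", "commit.gpgsign=false", "commit", "-q", "-m", "init")

        # Simulate leftovers from a previous task.
        (repo / "tracked.txt").write_text("edited\n")
        (repo / "scratch.txt").write_text("left over\n")
        (repo / "cache").mkdir()
        (repo / "cache" / "blob").write_text("ignored\n")

        git(repo, "reset", "--hard", "HEAD")  # discard edits to tracked files
        git(repo, "clean", "-fd")             # drop untracked files/dirs, keep ignored ones

        return {
            "tracked": (repo / "tracked.txt").read_text().strip(),
            "untracked_exists": (repo / "scratch.txt").exists(),
            "ignored_exists": (repo / "cache" / "blob").exists(),
        }
```

Because `clean` is run without `-x`, paths matched by `.gitignore` are left alone, so ignored build caches or dependency directories survive between tasks.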


def is_devcontainer_gitignored(repo_dir: Path) -> bool:
"""Check if .devcontainer/ folder is gitignored in the repository.

6 changes: 6 additions & 0 deletions packages/webhook/devs_webhook/config.py
@@ -84,6 +84,12 @@ def __init__(self, **kwargs):
default=60,
description="How often to check for idle/old containers (in seconds)"
)
stop_container_after_task: bool = Field(
default=True,
description="Stop container after each task completes. "
"This ensures only one running container per dev name at any time, "
"reducing RAM usage when multiple repos are in play."
)
max_concurrent_tasks: int = Field(default=3, description="Maximum concurrent tasks")

# Repository settings