feat: add structured logging system for agent debugging #113
Conversation
Add a comprehensive structured logging system that saves logs to SQLite
at {project_dir}/.autocoder/logs.db for debugging agent issues.
New files:
- structured_logging.py: SQLite-based logging with WAL mode for concurrency
- test_structured_logging.py: 28 unit tests covering all functionality
Integration points:
- agent.py: Log session lifecycle, tool use, errors, rate limits
- parallel_orchestrator.py: Log agent spawn/complete, failures, retries
- autonomous_agent_demo.py: Log entry point, interruptions, fatal errors
- client.py: Log configuration and API overrides
Features:
- WAL mode for concurrent access by parallel agents
- Query API with filtering by level, agent_id, feature_id, tool_name
- Full-text search in messages
- Export to JSON, JSONL, CSV formats
- Automatic cleanup of old entries
- Thread-safe concurrent writes
Access logs via:
- SQLite: {project}/.autocoder/logs.db
- API: GET /api/logs/{project_name}
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
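For quick inspection outside the Query API, the database can be read directly with stock `sqlite3`. The sketch below is illustrative only: the column names (`timestamp`, `level`, `agent_id`, `message`) follow the schema implied by the PR's CSV export code and may differ from the actual table definition.

```python
import sqlite3

def recent_errors(db_path, limit=20):
    """Fetch the most recent error-level entries from the agent log DB.

    Illustrative sketch: column names follow the schema implied by the
    PR's CSV export code and may not match the real table exactly.
    """
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row  # rows behave like dicts
    try:
        rows = conn.execute(
            "SELECT timestamp, level, agent_id, message FROM logs "
            "WHERE level = ? ORDER BY timestamp DESC LIMIT ?",
            ("error", limit),
        ).fetchall()
        return [dict(r) for r in rows]
    finally:
        conn.close()
```

For example, `recent_errors("myproject/.autocoder/logs.db")` would return the latest error rows as plain dicts.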
⚠️ Warning: Rate limit exceeded

⌛ How to resolve this issue? After the wait time has elapsed, a review can be triggered. We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.

📝 Walkthrough

Adds a new SQLite-backed structured logging subsystem and integrates it into the agent runtime, demo entrypoint, client creation, and the parallel orchestrator; includes comprehensive unit tests and propagates logger/context across session lifecycle and error paths.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Agent as Agent Workflow
    participant Demo as Demo/Orchestrator/Client
    participant Logger as StructuredLogger
    participant Handler as StructuredLogHandler
    participant DB as SQLite DB
    participant Query as LogQuery
    Demo->>Logger: info(startup, project_dir, agent_id)
    Agent->>Logger: info(session_start, feature_id, client_agent_id)
    Logger->>Handler: emit(LogRecord)
    Handler->>Handler: build StructuredLogEntry
    Handler->>DB: INSERT INTO logs (...)
    DB-->>Handler: commit (WAL)
    Handler->>Handler: enforce max_entries (DELETE oldest)
    Agent->>Logger: error(exception_type, message, extra)
    Logger->>Handler: emit(LogRecord)
    Query->>DB: SELECT ... (filters/paging/timeline)
    DB-->>Query: rows
    Query-->>Demo: results/export
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 3 passed checks
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@structured_logging.py`:
- Around line 436-439: Replace the two uses of datetime.utcnow() that set
default ranges for since/until with timezone-aware calls: use
datetime.now(timezone.utc) - timedelta(hours=24) for since and
datetime.now(timezone.utc) for until; update or ensure the module imports
timezone from datetime (consistent with the rest of the file) so these
assignments in the block that sets defaults for since and until become
timezone-aware and match the existing pattern used elsewhere (see the existing
usage at line 147).
- Around line 360-362: The LIKE pattern uses the raw search string which allows
% and _ to act as wildcards; escape those characters before building the
parameter and include an ESCAPE clause in the SQL. Locate the query-building
branch that appends "message LIKE ?" and the params.append(f"%{search}%") (the
code handling the search variable), replace the raw insertion with an escaped
version (e.g., replace "%" and "_" with escaped forms and also escape your
escape char), append the escaped pattern wrapped with % to params, and update
the SQL fragment to use "message LIKE ? ESCAPE '\\'" so the DB treats the escape
sequences literally.
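The escaping described above can be sketched as follows, assuming a `logs` table with a `message` column; the helper names are illustrative, not from the PR:

```python
import sqlite3

def escape_like(term, escape_char="\\"):
    """Escape LIKE wildcards so user input matches literally.

    Sketch of the fix described above. The escape character itself is
    doubled first, then % and _ are prefixed with it.
    """
    return (
        term.replace(escape_char, escape_char * 2)
            .replace("%", escape_char + "%")
            .replace("_", escape_char + "_")
    )

def search_messages(conn, search):
    """Full-text-ish search over the message column (illustrative)."""
    pattern = f"%{escape_like(search)}%"
    # The ESCAPE clause tells SQLite to treat "\%" and "\_" literally.
    return conn.execute(
        "SELECT message FROM logs WHERE message LIKE ? ESCAPE '\\'",
        (pattern,),
    ).fetchall()
```

With this in place, searching for "100%" matches only messages containing a literal percent sign, rather than every message starting with "100".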
In `@test_structured_logging.py`:
- Around line 151-159: The test_log_warn currently asserts the presence of
"warning" in the logged message which is fragile; instead, change the assertion
to verify the log entry's level field. After calling logger.warn("Test warning")
and retrieving logs via get_log_query(...).query(level="warning"), assert that
logs[0]["level"] == "warning" (or use self.assertEqual on logs[0]["level"],
"warning") so the test validates log level rather than message content; keep
references to test_log_warn, logger.warn, get_log_query, and query.query when
locating and updating the assertion.
🧹 Nitpick comments (3)
test_structured_logging.py (1)
385-422: Thread safety test may be flaky on fast machines.

The `test_concurrent_read_write` test has a race condition where the writer thread could complete before the reader thread has a chance to read any results. On a fast machine or under load, `read_results` could be empty if the writer finishes and sets `write_done` before the reader enters its loop. Consider adding a small synchronization to ensure the reader starts before the writer completes:

```diff
 def test_concurrent_read_write(self):
     """Test that reads and writes can happen concurrently."""
     logger = get_logger(self.project_dir, agent_id="writer", console_output=False)
     query = get_log_query(self.project_dir)

     # Pre-populate some logs
     for i in range(10):
         logger.info(f"Initial log {i}")

     read_results = []
     write_done = threading.Event()
+    reader_started = threading.Event()

     def writer():
+        reader_started.wait(timeout=5)  # Wait for reader to start
         for i in range(50):
             logger.info(f"Concurrent log {i}")
         write_done.set()

     def reader():
+        reader_started.set()
         while not write_done.is_set():
             count = query.count()
             read_results.append(count)
```

structured_logging.py (1)
157-194: Connection-per-emit pattern is safe but inefficient.

Opening and closing a SQLite connection on every `emit()` call introduces overhead. For high-volume logging, this could become a performance bottleneck. However, this is a reasonable trade-off for simplicity and thread safety with WAL mode. If performance becomes an issue in the future, consider connection pooling or a dedicated writer thread with a queue.
autonomous_agent_demo.py (1)
240-246: Consider logging the full traceback for fatal errors.

The fatal error is logged with `logger.error()`, which is correct, but the message truncation to 200 characters may lose important context for debugging complex errors. For fatal errors that terminate the script, consider logging the full traceback:

```diff
 except Exception as e:
     print(f"\nFatal error: {e}")
-    logger.error("Fatal error", error_type=type(e).__name__, message=str(e)[:200])
+    import traceback
+    logger.error("Fatal error", error_type=type(e).__name__, message=str(e)[:500], traceback=traceback.format_exc()[:1000])
     raise
```
structured_logging.py (outdated)

```python
if not since:
    since = datetime.utcnow() - timedelta(hours=24)
if not until:
    until = datetime.utcnow()
```
Use datetime.now(timezone.utc) instead of deprecated datetime.utcnow().
datetime.utcnow() is deprecated since Python 3.12 and will be removed in future versions. The rest of this file already uses the recommended datetime.now(timezone.utc) pattern (line 147).
```diff
 # Default to last 24 hours
 if not since:
-    since = datetime.utcnow() - timedelta(hours=24)
+    since = datetime.now(timezone.utc) - timedelta(hours=24)
 if not until:
-    until = datetime.utcnow()
+    until = datetime.now(timezone.utc)
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
if not since:
    since = datetime.now(timezone.utc) - timedelta(hours=24)
if not until:
    until = datetime.now(timezone.utc)
```
🤖 Prompt for AI Agents
In `@structured_logging.py` around lines 436 - 439, Replace the two uses of
datetime.utcnow() that set default ranges for since/until with timezone-aware
calls: use datetime.now(timezone.utc) - timedelta(hours=24) for since and
datetime.now(timezone.utc) for until; update or ensure the module imports
timezone from datetime (consistent with the rest of the file) so these
assignments in the block that sets defaults for since and until become
timezone-aware and match the existing pattern used elsewhere (see the existing
usage at line 147).
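In isolation, the suggested replacement behaves like this (the function name is illustrative, not from the PR):

```python
from datetime import datetime, timedelta, timezone

def default_range(since=None, until=None):
    """Default query window: the last 24 hours, timezone-aware.

    Sketch of the fix the review asks for; datetime.now(timezone.utc)
    replaces the deprecated datetime.utcnow().
    """
    if not since:
        since = datetime.now(timezone.utc) - timedelta(hours=24)
    if not until:
        until = datetime.now(timezone.utc)
    return since, until
```

Unlike `utcnow()`, both defaults now carry `tzinfo`, so comparisons against other timezone-aware timestamps no longer raise or silently misbehave.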
…gging for agent debugging
SUMMARY:
Fixed TypeScript build error in ProjectSetupRequired.tsx where startAgent
was being called with a boolean instead of an options object.
DETAILS:
- The startAgent API function signature was updated (in previous PR merges)
to accept an options object: { yoloMode?, parallelMode?, maxConcurrency?, testingAgentRatio? }
- ProjectSetupRequired.tsx was still calling it with the old signature:
startAgent(projectName, yoloMode) - passing boolean directly
- Changed to: startAgent(projectName, { yoloMode }) - wrapping in options object
This was the only remaining build error after merging 13+ PRs from upstream:
- PR leonvanzyl#112: Security vulnerabilities and race conditions
- PR leonvanzyl#89: Windows subprocess blocking fix
- PR leonvanzyl#109: Rate limit handling with exponential backoff
- PR leonvanzyl#88: MCP server config for ExpandChatSession
- PR leonvanzyl#100: Diagnostic warnings for config loading
- PR leonvanzyl#110: Quality gates (quality_gates.py)
- PR leonvanzyl#113: Structured logging (structured_logging.py)
- PR leonvanzyl#48: Knowledge files support (API, schemas, prompts)
- PR leonvanzyl#29: Feature editing/deletion (MCP tools)
- PR leonvanzyl#45: Chat persistence
- PR leonvanzyl#52: Refactoring feature guidance
- PR leonvanzyl#4: Project reset functionality
- PR leonvanzyl#95: UI polish, health checks, cross-platform fixes
Build now passes: npm run build succeeds with all 2245 modules transformed.
- Escape LIKE wildcards (% and _) in search parameter to prevent unexpected SQL query behavior
- Replace deprecated datetime.utcnow() with datetime.now(timezone.utc)
- Fix fragile test assertion to check level field instead of message

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@structured_logging.py`:
- Around line 360-365: The search branch escapes LIKE wildcards into
escaped_search and appends the param, but the SQL fragment in conditions
("message LIKE ?") lacks an ESCAPE clause so SQLite won't treat backslashes as
escape characters; update the condition to include an explicit ESCAPE clause
(e.g., "message LIKE ? ESCAPE '\\'") when adding to conditions, keep using
escaped_search for params, and ensure the chosen escape character in the ESCAPE
clause matches the character used in escaped_search.
🧹 Nitpick comments (1)
structured_logging.py (1)
540-548: Consider handling the empty-logs case for CSV export.

When `logs` is empty, the code correctly skips CSV creation (due to `if logs:`), but this leaves no file created. This is consistent with other formats, but consider creating an empty file with just headers for CSV consistency with tools that expect a file to exist.

Optional: Create CSV with headers even when empty

```diff
 elif format == "csv":
     import csv
-    if logs:
-        with open(output_path, "w", newline="") as f:
-            writer = csv.DictWriter(f, fieldnames=logs[0].keys())
-            writer.writeheader()
+    with open(output_path, "w", newline="") as f:
+        fieldnames = ["id", "timestamp", "level", "message", "agent_id", "feature_id", "tool_name", "duration_ms", "extra"]
+        if logs:
+            fieldnames = logs[0].keys()
+        writer = csv.DictWriter(f, fieldnames=fieldnames)
+        writer.writeheader()
+        if logs:
             writer.writerows(logs)
```
Resolved conflicts:
- agent.py: Keep RATE_LIMIT_PATTERNS and exponential backoff
- autonomous_agent_demo.py: Keep logger null checks

Includes CodeRabbit fixes:
- Escape LIKE wildcards in search
- Use datetime.now(timezone.utc) instead of utcnow()
- Fix fragile test assertion

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The LIKE pattern escapes % and _ characters, but SQLite won't interpret \% and \_ as literal characters unless the query includes an ESCAPE clause. Also escape backslashes to prevent injection.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@structured_logging.py`:
- Around line 147-149: The timestamp normalization is inconsistent: log records
use .isoformat().replace("+00:00", "Z") but the since/until range values use
plain .isoformat(), causing mismatches; update all timestamp generation sites in
structured_logging.py (the log record formatting block shown and the two other
occurrences around the range filter logic) to produce UTC Zulu timestamps
consistently — e.g., replace .isoformat() with either
.isoformat().replace("+00:00", "Z") or .strftime("%Y-%m-%dT%H:%M:%SZ") so that
the log formatter (where timestamp=datetime.now(timezone.utc)... in the
record-building code) and the functions that build since/until range filters
produce identical "Z" suffixed timestamps.
- Around line 35-36: LogLevel currently defines "warn" but persisted entries use
record.levelname.lower() which yields "warning", causing query(level="warn") to
miss matches; update the code paths that persist/serialize log levels (where
record.levelname.lower() is used) to normalize "warning" → "warn" before
storing, or alternatively expand the LogLevel Literal to include "warning" and
adjust query handling; ensure normalization is applied in the functions that
create stored entries and in any level comparison logic (e.g., the query
function) so filters using "warn" match stored records.
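The first of the two normalization options above could be sketched like this (the function name is illustrative; the PR's actual code applies `record.levelname.lower()` inline):

```python
def normalize_level(levelname):
    """Map Python logging level names onto the stored LogLevel values,
    normalizing "warning" to "warn" so query(level="warn") matches.

    Sketch of the normalization the review asks for.
    """
    level = levelname.lower()
    return "warn" if level == "warning" else level
```

Applying this at every persistence site keeps stored levels consistent with the `LogLevel` literal, so filters never silently miss warning records.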
🧹 Nitpick comments (1)
structured_logging.py (1)
96-140: Use context managers for SQLite connections to avoid leaks on exceptions.

If an insert or JSON serialization fails after the connection opens, the connection is never closed. Wrap connections with `with sqlite3.connect(...) as conn:` so they always close.

♻️ Suggested refactor

```diff
-        conn = sqlite3.connect(self.db_path)
-        cursor = conn.cursor()
+        with sqlite3.connect(self.db_path) as conn:
+            cursor = conn.cursor()
@@
-        conn.commit()
-        conn.close()
+            conn.commit()
@@
-        conn = sqlite3.connect(self.db_path)
-        cursor = conn.cursor()
+        with sqlite3.connect(self.db_path) as conn:
+            cursor = conn.cursor()
@@
-        conn.commit()
-        conn.close()
+            conn.commit()
```

Also applies to: 157-194
- Normalize log level "warning" to "warn" to match LogLevel type
- Use consistent timestamp format with "Z" suffix for query parameters
- Use context managers for SQLite connections to prevent leaks
- Update tests to use normalized "warn" level

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@structured_logging.py`:
- Around line 489-491: In get_agent_stats(), the timestamp pushed to params uses
since.isoformat() which can omit the required "Z" suffix; convert since to UTC
and format it with a trailing "Z" before appending to params (e.g., ensure since
is timezone-aware via since.astimezone(datetime.timezone.utc) and produce an ISO
string with 'Z' by replacing '+00:00' with 'Z' or using strftime), then replace
the existing since.isoformat() usage in the where_clause / params handling so
the stored timestamp is normalized with the "Z" suffix.
🧹 Nitpick comments (4)
test_structured_logging.py (2)
82-86: Consider moving the repeated `shutil` import to module level.

The `import shutil` statement is repeated in every `tearDown` method across 6 test classes. Moving it to the module-level imports would be cleaner.

Suggested improvement

```diff
 import json
 import sqlite3
+import shutil
 import tempfile
 import threading
```

Then remove the `import shutil` from each `tearDown` method.

Also applies to: 121-124, 213-216, 303-306, 361-364, 435-438
440-466: Test may be timing-sensitive due to cleanup occurring per-insert.

The cleanup logic in `StructuredLogHandler.emit()` deletes old entries after each insert when the count exceeds `max_entries`. With `max_entries=10` and 20 writes, the final count should be exactly 10 (not "at most 10"). Consider using `assertEqual` for a stricter assertion.

Suggested stricter assertion

```diff
 # Should have at most max_entries
-self.assertLessEqual(count, 10)
+self.assertEqual(count, 10)
```

structured_logging.py (2)
182-194: Cleanup query runs on every insert; consider batching or a threshold.

The cleanup logic executes `SELECT COUNT(*)` and potentially a `DELETE` on every single log insertion. Under high-throughput logging, this adds overhead. Consider running cleanup less frequently (e.g., every N inserts or when the count exceeds `max_entries * 1.1`).

Example optimization

```python
# Add instance counter
self._insert_count = 0
self._cleanup_interval = 100  # Check every 100 inserts

# In emit(), after INSERT:
self._insert_count += 1
if self._insert_count >= self._cleanup_interval:
    self._insert_count = 0
    cursor.execute("SELECT COUNT(*) FROM logs")
    # ... cleanup logic
```
305-309: Database connections not using context managers.

The `LogQuery` methods open connections with `_connect()` and close them manually with `conn.close()`. If an exception occurs between connect and close, the connection could leak. Consider using context managers consistently.

Example refactor for query()

```diff
-conn = self._connect()
-cursor = conn.cursor()
-# ... query logic ...
-cursor.execute(query, params)
-rows = cursor.fetchall()
-conn.close()
-return [dict(row) for row in rows]
+with self._connect() as conn:
+    cursor = conn.cursor()
+    # ... query logic ...
+    cursor.execute(query, params)
+    rows = cursor.fetchall()
+    return [dict(row) for row in rows]
```

Also applies to: 340-341, 392-393, 405-406, 426-427, 441-442, 466-467, 484-485, 511-512
structured_logging.py (outdated)

```python
if since:
    where_clause = "timestamp >= ?"
    params.append(since.isoformat())
```
Timestamp format inconsistency in get_agent_stats() method.
Same issue - since.isoformat() lacks the "Z" suffix normalization.
🤖 Prompt for AI Agents
In `@structured_logging.py` around lines 489 - 491, In get_agent_stats(), the
timestamp pushed to params uses since.isoformat() which can omit the required
"Z" suffix; convert since to UTC and format it with a trailing "Z" before
appending to params (e.g., ensure since is timezone-aware via
since.astimezone(datetime.timezone.utc) and produce an ISO string with 'Z' by
replacing '+00:00' with 'Z' or using strftime), then replace the existing
since.isoformat() usage in the where_clause / params handling so the stored
timestamp is normalized with the "Z" suffix.
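A helper along the lines the review suggests (the name is illustrative, not from the PR) would normalize any datetime to the "Z"-suffixed form used by the stored records:

```python
from datetime import timezone

def to_zulu(dt):
    """Normalize an aware datetime to a UTC ISO-8601 string with a 'Z'
    suffix, matching the .isoformat().replace("+00:00", "Z") pattern
    the log records use. Sketch of the normalization the review asks for.
    """
    return dt.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")
```

Using the same function at every timestamp-producing site (record formatting, `since`/`until` defaults, `get_agent_stats()`) keeps string comparisons in the WHERE clauses consistent.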
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
Add a comprehensive structured logging system that saves logs to SQLite at `{project_dir}/.autocoder/logs.db` for debugging agent issues.

- `structured_logging.py`: SQLite-based logging with WAL mode for better concurrency
- `test_structured_logging.py`: 28 unit tests covering all functionality
- Integration: `agent.py`, `parallel_orchestrator.py`, `autonomous_agent_demo.py`, `client.py`

Features
Logged Events
Access Logs
Test Plan
- `python -m pytest test_structured_logging.py -v` (28 tests pass)
- `sqlite3 logs.db "PRAGMA journal_mode"` returns `wal`

🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes
Tests