Merged
27 changes: 13 additions & 14 deletions content/en/post/series/agentic_ai/ai-coding-agent/index.md
@@ -1,7 +1,7 @@
+++
author = "Smaine Kahlouch"
title = "`Agentic Coding`: concepts and hands-on Platform Engineering use cases"
-date = "2026-01-29"
+date = "2026-02-06"
summary = "Exploring **agentic coding** through `Claude Code`: from fundamentals (tokens, MCPs, skills) to real-world use cases, with an enthusiastic yet honest take on this new way of working."
featured = true
codeMaxLines = 30
@@ -62,7 +62,7 @@ The cycle is simple: **reason → act → observe → repeat**. The agent calls

A coding agent combines several components:

-* **LLM**: The "brain" that reasons (Claude Opus 4.5, Gemini 3 Pro, Devstral 2...)
+* **LLM**: The "brain" that reasons (Claude Opus 4.6, Gemini 3 Pro, Devstral 2...)
* **Tools**: Available actions (read/write files, execute commands, search the web...)
* **Memory**: Preserved context (`CLAUDE.md`, `AGENTS.md`, `GEMINI.md`... depending on the tool, plus conversation history)
* **Planning**: The ability to break down a complex task into sub-steps
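How these components fit together can be sketched in a few lines of Python. This is a toy illustration only: the "LLM" and the tool set are stubbed out (all names here are made up, no real API is involved), but the reason → act → observe → repeat loop is the real shape of the thing.

```python
# Toy sketch of an agent loop; every name here is illustrative.
# A real agent would call an LLM API and expose many more tools.

def fake_llm(goal, observations):
    """Stub "brain": picks the next action from what it has seen so far."""
    if not observations:
        return ("read_file", "main.py")        # no context yet: gather some
    return ("done", f"summary of {goal}")      # enough context: finish

TOOLS = {"read_file": lambda path: f"<contents of {path}>"}

def run_agent(goal, max_steps=5):
    observations = []                          # the agent's working memory
    for _ in range(max_steps):                 # repeat
        action, arg = fake_llm(goal, observations)  # reason
        if action == "done":
            return arg
        result = TOOLS[action](arg)            # act
        observations.append(result)            # observe
    return "step budget exhausted"
```

The `max_steps` budget is the one non-obvious detail: without it, a confused model can loop forever.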
@@ -73,16 +73,16 @@ New models and versions appear at a breakneck pace. However, you need to be care

The [**SWE-bench Verified**](https://www.swebench.com/) benchmark has become the reference for evaluating model capabilities in software development. It measures the ability to solve real bugs from GitHub repositories and helps guide our choices.

-{{< img src="swe-bench-leaderboard.png" width="900" >}}
+{{< img src="swe-bench-leaderboard.png" width="750" >}}

{{% notice warning "These numbers change fast!" %}}
-Check [swebench.com](https://www.swebench.com/) for the latest results. At the time of writing, Claude Opus 4.5 leads with **74.4%**, closely followed by Gemini 3 Pro (**74.2%**).
+Check [vals.ai](https://www.vals.ai/benchmarks/swebench) for the latest independent results. At the time of writing, Claude Opus 4.6 leads with **79.2%**, closely followed by Gemini 3 Flash (**76.2%**) and GPT-5.2 (**75.4%**).
{{% /notice %}}

In practice, today's top models are all capable enough for most _Platform Engineering_ tasks.

{{% notice info "Why model choice matters" %}}
-Boris Cherny, creator of Claude Code, shared his take on model selection:
+Boris Cherny, creator of Claude Code, shared his take on model selection (about Opus 4.5 — the reasoning still holds):

{{< img src="boris-opus4.5.png" width="600" >}}

@@ -96,7 +96,7 @@ There are many coding agent options out there. Here are a few examples:

| Tool | Type | Strengths |
|------|------|-----------|
-| [**Claude Code**](https://docs.anthropic.com/en/docs/claude-code) | Terminal | 200K context, high SWE-bench score, hooks & MCP |
+| [**Claude Code**](https://code.claude.com/docs/en/overview) | Terminal | 200K context (1M in beta), high SWE-bench score, hooks & MCP |
| [**opencode**](https://opencode.ai/) | Terminal | **Open source**, multi-provider, local models (Ollama) |
| [**Cursor**](https://cursor.sh/) | IDE | Visual workflow, Composer mode |
| [**Antigravity**](https://antigravity.google/) | IDE | Parallel agents, Manager view |
@@ -109,15 +109,15 @@ I started with Cursor, then switched to Claude Code — probably because of my *

## :books: Essential Claude Code concepts

-This section cuts straight to the point: **tokens, MCPs, Skills, and Tasks**. I'll skip the initial setup (the [official docs](https://docs.anthropic.com/en/docs/claude-code) cover that well) and subagents — that's internal plumbing; what matters is what you can *build* with them. Most of these concepts **also apply to other coding agents**.
+This section cuts straight to the point: **tokens, MCPs, Skills, and Tasks**. I'll skip the initial setup (the [official docs](https://code.claude.com/docs/en/overview) cover that well) and subagents — that's internal plumbing; what matters is what you can *build* with them. Most of these concepts **also apply to other coding agents**.

### Tokens and context window

#### The essentials about tokens

A **token** is the basic unit the model processes — roughly 4 characters in English, 2-3 in French. Why does this matter? Because **everything costs tokens**: input, output, and context.
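To make the rule of thumb concrete, here is a tiny back-of-the-envelope helper (illustrative only; real tokenizers use BPE and will disagree at the margins):

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Ballpark token count via the ~4 chars/token rule of thumb for English.
    For French, pass chars_per_token=2.5. Real tokenizers will differ."""
    return max(1, round(len(text) / chars_per_token))

# A 2,000-character English prompt is roughly 500 tokens.
estimate_tokens("x" * 2000)
```

Handy for a quick sanity check before pasting a large file into a session.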

The **context window** (200K tokens for Claude) represents the model's "working memory". The `/context` command lets you see how this space is used:
The **context window** (200K tokens for Claude, up to 1M in beta) represents the model's "working memory". The `/context` command lets you see how this space is used:

```console
/context
```

@@ -332,7 +332,7 @@ For those steeped in Kubernetes, here's an analogy 😉: the spec defines the **
| Framework | Key strength | Ideal use case |
|-----------|-------------|----------------|
| **[GitHub Spec Kit](https://github.com/github/spec-kit)** | Native GitHub/Copilot integration | Greenfield projects, structured workflow |
-| **[BMAD](https://github.com/bmad-sim/bmad-method)** | Multi-agent teams (PM, Architect, Dev) | Complex multi-repo systems |
+| **[BMAD](https://github.com/bmad-code-org/BMAD-METHOD)** | Multi-agent teams (PM, Architect, Dev) | Complex multi-repo systems |
| **[OpenSpec](https://github.com/Fission-AI/OpenSpec)** | Lightweight, change-focused | Brownfield projects, rapid iteration |
{{% /notice %}}

@@ -516,7 +516,7 @@ If you work with sensitive or proprietary code:
- Request the **Zero-Data-Retention** (ZDR) option if needed
- **Never** use the Free/Pro plan for confidential code

-See the [privacy documentation](https://www.anthropic.com/policies/privacy) for more details.
+See the [privacy documentation](https://www.anthropic.com/legal/privacy) for more details.
{{% /notice %}}

### :bulb: Getting the most out of it
@@ -527,22 +527,21 @@ Tips and workflows I've picked up along the way (CLAUDE.md, hooks, context manag

### My next steps

-This is a concern I share with many developers: **what happens if Anthropic changes the rules of the game?** This fear actually materialized in early January 2026, when Anthropic [blocked without warning](https://venturebeat.com/technology/anthropic-cracks-down-on-unauthorized-claude-usage-by-third-party-harnesses) access to Claude through third-party tools like [OpenCode](https://github.com/opencode-ai/opencode).
+This is a concern I share with many developers: **what happens if Anthropic changes the rules of the game?** This fear actually materialized in early January 2026, when Anthropic [blocked without warning](https://venturebeat.com/technology/anthropic-cracks-down-on-unauthorized-claude-usage-by-third-party-harnesses) access to Claude through third-party tools like [OpenCode](https://github.com/charmbracelet/crush).

-Given my affinity for open source, I'm looking at exploring open alternatives: **[Mistral Vibe](https://mistral.ai/news/devstral-2-vibe-cli)** with Devstral 2 (72.2% SWE-bench) and **[OpenCode](https://opencode.ai/)** (multi-provider, local models via Ollama) for instance.
+Given my affinity for open source, I'm looking to explore open alternatives, for instance **[Mistral Vibe](https://mistral.ai/news/devstral-2-vibe-cli)** with Devstral 2 (72.2% SWE-bench) and **[Crush](https://github.com/charmbracelet/crush)** (formerly OpenCode; multi-provider, local models via Ollama).

---

## :bookmark: References

### Guides and best practices
- [Claude Code Best Practices](https://www.anthropic.com/engineering/claude-code-best-practices) — Anthropic Engineering
- [How I Use Every Claude Code Feature](https://blog.sshh.io/p/how-i-use-every-claude-code-feature) — Comprehensive guide by sshh

### Spec-Driven Development
- [GitHub Spec Kit](https://github.com/github/spec-kit) — GitHub's SDD toolkit
- [OpenSpec](https://github.com/Fission-AI/OpenSpec) — Lightweight SDD for brownfield projects
-- [BMAD Method](https://github.com/bmad-sim/bmad-method) — Multi-agent SDD
+- [BMAD Method](https://github.com/bmad-code-org/BMAD-METHOD) — Multi-agent SDD

### Plugins, Skills and MCPs
- [Code-Simplifier](https://github.com/anthropics/claude-plugins-official/tree/main/plugins/code-simplifier) — AI code cleanup
47 changes: 43 additions & 4 deletions content/en/post/series/agentic_ai/ai-coding-tips/index.md
@@ -1,7 +1,7 @@
+++
author = "Smaine Kahlouch"
title = "A few months with `Claude Code`: tips and workflows that helped me"
-date = "2026-01-29"
+date = "2026-02-08"
summary = "CLAUDE.md, hooks, context management, worktrees, plugins, anti-patterns: everything I wish I'd known from the start."
featured = true
codeMaxLines = 30
@@ -143,7 +143,7 @@ I won't detail every variant here — the [official hooks documentation](https:/

## :brain: Mastering the context window

-The context window (200K tokens) is **the most critical resource**. Once saturated, old information gets compressed and quality degrades. This is THE topic that makes the difference between an efficient user and someone who "loses" Claude after 20 minutes.
+The context window (200K tokens, up to 1M in beta) is **the most critical resource**. Once saturated, old information gets compressed and quality degrades. This is THE topic that makes the difference between an efficient user and someone who "loses" Claude after 20 minutes.

### `/compact` with custom instructions

@@ -274,6 +274,46 @@ wait

Each instance has its own context. This is ideal for independent tasks that don't require interaction.
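The fan-out pattern behind `claude -p ... &` plus `wait` translates directly to any language. Here it is sketched in Python; `echo` stands in for the real `claude -p "<prompt>"` invocation so the sketch runs anywhere:

```python
import subprocess

# Launch independent workers in parallel; each real `claude -p` run would
# get its own fresh context. `echo` is a stand-in for the actual CLI call.
prompts = ["review module A", "review module B"]
procs = [
    subprocess.Popen(["echo", p], stdout=subprocess.PIPE, text=True)
    for p in prompts
]

# Equivalent of the shell `wait`: block until every worker has finished,
# then collect each worker's output.
results = [proc.communicate()[0].strip() for proc in procs]
```

The key property is the same as in the shell version: the workers share nothing, so they can't pollute each other's context.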

### Teams: letting agents work together

The `-p` approach works great for independent tasks, but sometimes you need agents that **talk to each other**. That's what teams are for — Claude spawns **multiple agents that share a task list**, exchange messages, and can wait on each other's results.

This is probably the killer feature of `Opus 4.6`: you parallelize work, get better performance, and each agent has its own context instead of cramming everything into one session.

{{% notice note "Experimental feature" %}}
Agent teams are still in **research preview**. To enable them, add this to your `settings.json`:

```json
{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  },
  "teammateMode": "tmux"
}
```

The `teammateMode` controls how agents are displayed: `"tmux"` opens each agent in a separate pane (requires tmux), `"in-process"` runs them all in the same terminal, and `"auto"` (default) picks automatically based on your environment.
{{% /notice %}}

To illustrate the concept, I tried this on a deliberately simple case: analyzing a Crossplane composition. The example is basic, but real-world use cases are plentiful: troubleshooting an incident by parallelizing log, metrics, and config analysis; creating a new service by splitting code, tests, and documentation across agents; or running a multi-component security audit.

Here, instead of reading the code, then the examples, then writing a summary (all sequentially in the same session), let's create a team as follows:

```
Create a team of 3 agents to analyze the Crossplane App composition.
1. code-reader: Read main.k and summarize what resources it creates
2. example-reader: Read all example files and list the configuration options
3. doc-writer: Wait for the other two, then write a combined summary
```

Here's what it looks like in practice: three tmux panes running side by side — the two readers analyzing the code and examples in parallel, while the doc-writer waits for their output before producing the final summary.

{{< img src="teams.png" alt="Claude Code Teams - 3 agents working in tmux panes" width="1200" >}}
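Under the hood this is a classic fork-join: two workers in parallel, one consumer blocking on both. A minimal sketch with stubbed agents (the function names and return strings are illustrative, not what the real agents produce):

```python
from concurrent.futures import ThreadPoolExecutor

def code_reader():
    # Stand-in for the agent reading main.k
    return "main.k defines an App composition"

def example_reader():
    # Stand-in for the agent reading the example files
    return "examples expose image and replicas options"

with ThreadPoolExecutor(max_workers=2) as pool:
    code_summary = pool.submit(code_reader)         # runs in parallel
    examples_summary = pool.submit(example_reader)  # runs in parallel
    # doc-writer: .result() blocks until each reader has finished
    final_summary = f"{code_summary.result()}; {examples_summary.result()}"
```

The doc-writer never starts writing with partial input, which is exactly the "wait for the other two" constraint from the prompt above.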

{{% notice tip "When teams are worth it" %}}
Teams shine when you have **independent work that needs to come together** — read/analyze/summarize, build/test/document, that kind of thing. For purely sequential tasks, a single session with `/clear` between steps is simpler.
{{% /notice %}}

---

## :desktop_computer: Hybrid IDE + Claude Code workflow
@@ -350,8 +390,7 @@ These questions, as well as the methods that help me stay in control, are covere
## :bookmark: References

### Official documentation
-- [Claude Code Best Practices](https://www.anthropic.com/engineering/claude-code-best-practices) — Anthropic Engineering
-- [Claude Code Documentation](https://docs.anthropic.com/en/docs/claude-code) — Official guide
+- [Claude Code Documentation](https://code.claude.com/docs/en/overview) — Official guide
- [Hooks Documentation](https://docs.anthropic.com/en/docs/claude-code/hooks) — Hooks configuration

### Community guides