07 May 2026 · 3 min
Eval suites for codebase-specific agent use
Most AI rollouts skip evals because they feel like overhead. A small, codebase-specific eval suite, built in an afternoon, is the cheapest way to keep model and prompt changes from becoming a vibes call.
Codex · Claude Code · OpenAI
04 May 2026 · 3 min
Copilot rollout exclusion list
Most Copilot rollouts skip the content exclusion work because it feels boring. The teams that skip it discover the problem the first time something sensitive ends up in a suggestion.
GitHub Copilot · Microsoft Purview DLP
02 May 2026 · 3 min
First MCP server tool design
Building an MCP server is mostly an API design problem with one extra constraint: the caller is a model, not a person. Naming and arguments matter more than transport.
MCP · Anthropic · Claude Code
29 Apr 2026 · 3 min
Claude Code hooks that actually save time
Claude Code hooks are easy to over-engineer. The right four save real time and prevent the failure modes you actually hit in week one.
Claude Code · Anthropic
22 Apr 2026 · 3 min
Codex on a real repo
Codex is a repo-aware coding agent. Used carelessly, it generates churn the team has to clean up. Used with a defined scope and a real review gate, it ships work.
Codex · OpenAI