What this is
Tool focus: OpenAI Codex
Hands-on training on OpenAI Codex for repo-aware agent work: bounded scope, review gates, eval suites and Microsoft 365 guardrails.
Codex is a repo-aware coding agent. Used well, it ships real diffs against your codebase. Used poorly, it floods the team with churn and hard-to-review changes.
Codex review gate · access graduates from read-only to merge.
Codex suits multi-file refactors with test coverage, deterministic migrations, internal tool builds, and tightly scoped feature work where the diff and the tests are the deliverable. It is less useful for ad-hoc exploration and works better when the work has a clear shape.
Read-only access to the repository first. Write access only inside a working branch with explicit gates. The agent is given a small task and a small context, never the whole codebase as a sandbox.
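The graduated access described above can be sketched as a small policy check. This is a minimal illustration, not a Codex feature: the `Access` levels, the `Action` record, and the `is_allowed` function are hypothetical names, and a real setup would enforce the same rules through repository permissions and branch protection.

```python
from dataclasses import dataclass
from enum import IntEnum


class Access(IntEnum):
    """Graduated access levels, lowest to highest (illustrative names)."""
    READ_ONLY = 1
    BRANCH_WRITE = 2
    MERGE_WITH_REVIEW = 3


@dataclass(frozen=True)
class Action:
    kind: str                    # "read", "write", or "merge"
    branch: str                  # branch the action targets
    review_passed: bool = False  # True once the review gate has signed off


def is_allowed(level: Access, action: Action, working_branch: str) -> bool:
    """Return True if the agent may perform the action at this access level."""
    if action.kind == "read":
        return True  # every level can read
    if action.kind == "write":
        # Writes are confined to the agreed working branch.
        return level >= Access.BRANCH_WRITE and action.branch == working_branch
    if action.kind == "merge":
        # Merging needs the top level and a passed review gate.
        return level >= Access.MERGE_WITH_REVIEW and action.review_passed
    return False  # unknown action kinds are denied by default
```

Denying unknown action kinds by default keeps the policy conservative: anything not explicitly granted stays out of reach until someone widens the scope on purpose.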
A written review checklist per change category. Tests run before any merge. The diff, not the prompt, is the artefact under review. PR templates name the gates so reviewers do not skim past them.
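One way to make a per-category checklist enforceable is to keep it as data and block the merge while any item is open. The sketch below assumes two example categories and invented checklist items; `CHECKLISTS`, `Change`, `missing_gates`, and `can_merge` are hypothetical names, and a real team would substitute its own written checklist.

```python
from dataclasses import dataclass, field

# Hypothetical categories and items; the team's written checklist goes here.
CHECKLISTS = {
    "refactor": ["tests pass", "no public API change", "diff reviewed"],
    "migration": ["tests pass", "rollback script present", "diff reviewed"],
}


@dataclass
class Change:
    category: str
    completed: set = field(default_factory=set)  # checklist items ticked off


def missing_gates(change: Change) -> list:
    """List the checklist items still open, in checklist order."""
    if change.category not in CHECKLISTS:
        # A change with no written checklist cannot be reviewed at all.
        raise ValueError(f"no checklist for category {change.category!r}")
    return [item for item in CHECKLISTS[change.category]
            if item not in change.completed]


def can_merge(change: Change) -> bool:
    """A change may merge only when every gate for its category is closed."""
    return not missing_gates(change)
```

Raising on an unknown category, rather than letting the change through, mirrors the idea that every change category needs a written checklist before the agent works on it.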
A small, codebase-specific evaluation set is built early: three to five tasks with known good outputs. Every model or workflow change is run against it before it touches production code.
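A minimal sketch of such an evaluation set follows, assuming each task is a prompt paired with a known good output and that the agent can be called as a plain function. `EvalTask`, `run_suite`, `safe_to_adopt`, and the `run_agent` callable are hypothetical names, not part of any Codex API, and exact string matching is a simplification; real suites often compare by running tests against the produced diff.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class EvalTask:
    name: str      # short identifier for the task
    prompt: str    # instruction given to the agent
    expected: str  # known good output for this codebase


def run_suite(tasks: Iterable[EvalTask],
              run_agent: Callable[[str], str]) -> Dict[str, bool]:
    """Run every task and record pass/fail by whitespace-trimmed exact match."""
    return {
        task.name: run_agent(task.prompt).strip() == task.expected.strip()
        for task in tasks
    }


def safe_to_adopt(results: Dict[str, bool]) -> bool:
    """A model or workflow change is adopted only if every task passes."""
    return all(results.values())
```

Passing the agent in as a callable means the same suite can be rerun unchanged against a new model or a revised workflow, which is what makes the before-and-after comparison meaningful.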
Where the team is on a regulated estate, the rollout pairs with the M365 side: identity boundary, conditional access on the agent host, DLP on copy-paste, and a written acceptable-use note that matches the tooling rather than fighting it.
Day 0: Read-only review
Repository, current AI use, and CI shape are mapped. We agree the first three Codex-suited tasks and the review gates for them.
Day 1: Workflow design
Half day on workflow design, prompt patterns, tool short list, scope boundaries, and the written review checklist for each change category.
Day 2: On the real repo
Half day inside the actual codebase. Real Codex runs on the agreed tasks. Diffs reviewed, tests run, branches merged or rolled back as the gates dictate.
Weeks 2 to 4: Rollout window
Light coaching, eval suite carried forward by the team, written rules of engagement, and a follow-up check-in to confirm the workflow is sticking.
Yes, with named scope and access boundaries. The training treats access as graduated: read-only first, then a working branch, then merge with a review gate. The repository is never handed over wholesale.
Copilot lives inside an editor and helps with line-level work. Codex is a repo-level agent that makes multi-file changes, often without a human at every keystroke, so the discipline around review and scope matters more.
A written workflow, a tool short list, a review checklist tied to change categories, and a small codebase-specific eval suite. The eval suite is the highest-value durable artefact.