This is how I use OpenAI Codex Agent

Codex is ChatGPT’s new cloud-based coding agent for running tasks in the background. You connect a GitHub repository to it and define coding tasks for the agent.

It checks out the code, installs dependencies, edits the right files, runs tests, and shows me everything it did. You get full diffs and terminal outputs. If the tests pass, you can open a pull request (and merge it).

Each task runs in an isolated container with no internet access, which keeps things clean and safe. And I get full logs so I can review every step before merging.

Setup

Codex cuts off internet access once a task starts, so everything the task needs has to be configured beforehand.

To set up your environment, click Environment in the top bar.

Select the environment you want to configure, then use its Setup script section.

Example: Use pnpm as package manager in Codex

Example: Use vitest with Codex

When I started my initial smoke tests, Codex couldn't find the Vitest package. I had to add it directly to the install script.

1. Add clear names to guide Codex

The agent scans files with plain text search. A unique function or file name points it to the right place at once.
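
To see why a unique name helps, here is a minimal sketch of a plain-text search over a small in-memory file map. The file contents and the `findMatches` helper are hypothetical, and this only illustrates the idea. It is not how Codex itself is implemented.

```typescript
// Hypothetical in-memory repo: path -> file contents.
const repo: Record<string, string> = {
  "src/bots/chatRouter.ts": "export function routeChatMessage(msg: string) { return handle(msg); }",
  "src/chat/chatManager.ts": "function handle(input: string) { return input.trim(); }",
  "src/api/chat.ts": "function handle(res: unknown) { return res; }",
};

// Return every path whose contents contain the search term (plain substring match).
function findMatches(files: Record<string, string>, term: string): string[] {
  return Object.keys(files).filter((path) => files[path].includes(term));
}

// A generic name matches every file in this toy repo.
const genericHits = findMatches(repo, "handle");
// A unique name matches a single file.
const uniqueHits = findMatches(repo, "routeChatMessage");

console.log(genericHits.length, uniqueHits);
```

The generic name leaves three candidate files to sort through, while the unique one points at a single file.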

2. Show the exact spot for the task

I limit each job to one folder, file, or even a single function. A short prompt with that path removes guessing.

Refactor src/bots/chatRouter.ts and simplify routeChatMessage.

3. Paste the full stack trace

For bugs I paste every line of the error so Codex lands on the failing line.

TypeError: Cannot read properties of undefined (reading 'id')
    at Object.formatAuthor (src/chat/chatManager.ts:45:22)
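
The value of the full trace is that each frame carries a file, line, and column. The sketch below pulls those out of the trace above. `StackFrame` and `parseFrames` are hypothetical helper names, and the regular expression only covers the common `at fn (file:line:column)` frame shape.

```typescript
// One parsed frame of a V8-style stack trace.
interface StackFrame {
  fn: string;
  file: string;
  line: number;
  column: number;
}

// Extract "at <fn> (<file>:<line>:<column>)" frames from a pasted trace.
function parseFrames(trace: string): StackFrame[] {
  const frameRe = /^\s*at (.+?) \((.+):(\d+):(\d+)\)\s*$/;
  const frames: StackFrame[] = [];
  for (const row of trace.split("\n")) {
    const m = frameRe.exec(row);
    if (m) {
      frames.push({ fn: m[1], file: m[2], line: Number(m[3]), column: Number(m[4]) });
    }
  }
  return frames;
}

const trace = [
  "TypeError: Cannot read properties of undefined (reading 'id')",
  "    at Object.formatAuthor (src/chat/chatManager.ts:45:22)",
].join("\n");

const frames = parseFrames(trace);
console.log(frames[0]);
```

With only the first line of the error, the file and line would be missing, and the agent would have to search for the failing code.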

4. Fire off several tasks in parallel

Jobs that touch different files can run at the same time. The interface shows separate logs, and I jump between them to watch progress.

Task A: Add Vitest tests for src/chat/context.ts
Task B: Remove ESLint warnings in components/chat/
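
Before firing off tasks in parallel, it can help to check that their scopes do not overlap. This is a minimal sketch of such a check. The `Task` shape and both helper functions are assumptions for illustration, not part of Codex.

```typescript
// A task plus the path prefixes it is expected to touch (hypothetical shape).
interface Task {
  name: string;
  paths: string[];
}

// True when one path equals the other or is a directory prefix of it.
function pathsOverlap(a: string, b: string): boolean {
  const dir = (p: string) => (p.endsWith("/") ? p : p + "/");
  return a === b || b.startsWith(dir(a)) || a.startsWith(dir(b));
}

// Return the names of task pairs whose path scopes overlap.
function findConflicts(tasks: Task[]): Array<[string, string]> {
  const conflicts: Array<[string, string]> = [];
  for (let i = 0; i < tasks.length; i++) {
    for (let j = i + 1; j < tasks.length; j++) {
      const clash = tasks[i].paths.some((a) =>
        tasks[j].paths.some((b) => pathsOverlap(a, b)),
      );
      if (clash) conflicts.push([tasks[i].name, tasks[j].name]);
    }
  }
  return conflicts;
}

const tasks: Task[] = [
  { name: "Task A", paths: ["src/chat/context.ts"] },
  { name: "Task B", paths: ["components/chat/"] },
];

// An empty result means the two tasks touch different files.
console.log(findConflicts(tasks));
```

If a third task covered all of `src/chat/`, it would be reported as conflicting with Task A and is better run afterwards.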

5. Give a clear pass or fail check

Tests, a linter, or the TypeScript compiler give Codex a finish line. The agent stops only when the check is green.

Implement countTokens in src/utils/tokenizer.ts. Every test in tokenizer.test.ts must pass.
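
To make the finish line concrete, here is a small self-contained sketch. The whitespace-based `countTokens` and the three cases are assumptions that stand in for the real implementation and for tokenizer.test.ts, which would normally be a Vitest file.

```typescript
// Hypothetical implementation: counts whitespace-separated tokens.
// The real countTokens in src/utils/tokenizer.ts may use a model-specific tokenizer.
function countTokens(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Hypothetical cases standing in for tokenizer.test.ts.
const cases: Array<{ input: string; expected: number }> = [
  { input: "", expected: 0 },
  { input: "hello", expected: 1 },
  { input: "  hello   world  ", expected: 2 },
];

// Run all cases and collect failures; an empty list is the "green" signal.
function runChecks(): string[] {
  return cases
    .filter((c) => countTokens(c.input) !== c.expected)
    .map((c) => `countTokens(${JSON.stringify(c.input)}) should be ${c.expected}`);
}

const failures = runChecks();
console.log(failures.length === 0 ? "PASS" : failures.join("\n"));
```

The point is the shape of the check: the expected values are written down before the agent starts, so "done" is not open to interpretation.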

6. Break large work into steps

Big features turn into a string of small pull requests. Each step has its own tests.

Add interface ChatMessage to src/types.ts
Migrate chatStore.ts to use ChatMessage
Update API calls in src/api/chat.ts
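
The first step could be as small as the sketch below. The field names on `ChatMessage` and the `isChatMessage` guard are illustrative assumptions, since the real shape depends on the project.

```typescript
// Step 1: a shared message type. Field names are illustrative assumptions.
interface ChatMessage {
  id: string;
  role: "user" | "assistant" | "system";
  content: string;
  createdAt: number; // Unix epoch milliseconds
}

// Runtime guard so later steps (store, API calls) can validate untyped data.
function isChatMessage(value: unknown): value is ChatMessage {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["id"] === "string" &&
    (v["role"] === "user" || v["role"] === "assistant" || v["role"] === "system") &&
    typeof v["content"] === "string" &&
    typeof v["createdAt"] === "number"
  );
}

const sample: unknown = { id: "m1", role: "user", content: "Hi", createdAt: 0 };
console.log(isChatMessage(sample));
```

Because this step only adds a type and a guard, its pull request is easy to review, and the later migration steps can build on it.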

7. Hand off blockers

When a problem stalls me, I open a branch, describe the issue, and ask Codex for fixes or new ideas.

Optimise summariseConversation in src/chat/summary.ts for lower latency. Remove every any type.

8. Launch slow jobs first thing

I kick off a full test run or a repo-wide linter pass at the start of the day. They finish while I check my mail, and then I review the diff.

Run npm run lint:fix across the repo. Commit the formatted code.

9. Feed project rules to AGENTS.md

Like Cursor Rules, AGENTS.md gives Codex the rules and coding guidelines for our project. I list our logger, preferred patterns, and testing rules. The agent reads it before writing code, so new files follow our style without reminders.

Codex merges up to three AGENTS.md layers, in this order:
1 ~/.codex/instructions.md – personal guidance
2 AGENTS.md at the repo root – shared project rules
3 AGENTS.md in the current folder – feature-specific tweaks
Later files override earlier ones.

Keep the file short, relevant, and version-controlled. A good template:

10. Mock every external call in tests

Codex often writes code that fetches live data. In tests I replace those calls with Mock Service Worker or vi.fn. No internet means deterministic results and faster CI.

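// Uses the MSW v1 API (`rest`); MSW v2 replaced it with `http` and `HttpResponse`.
import { afterAll, afterEach, beforeAll } from 'vitest';
import { setupServer } from 'msw/node';
import { rest } from 'msw';

// Intercept the external endpoint and return a fixed payload.
const server = setupServer(
  rest.get('https://api.example.com/data', (_req, res, ctx) =>
    res(ctx.json({ id: 123, value: 'test data' })),
  ),
);

// Fail the test on any request that has no handler.
beforeAll(() => server.listen({ onUnhandledRequest: 'error' }));
afterEach(() => server.resetHandlers());
afterAll(() => server.close());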

11. Let pull request workflows gate merges

A GitHub Action runs tests and linting on every pull request. Nothing reaches main unless the workflow is green.

name: PR CI
on: pull_request
jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      # Assumes a "lint" script exists in package.json.
      - run: npm run lint
      - run: npm test

Stay the senior in the loop

Codex writes drafts. I review design, security, and edge cases. Strong types, linters, and automated tests are the guardrails, but final responsibility is mine.

In my opinion, Codex works best when it's used for support tasks alongside your main work. It handles background jobs well, especially things like small improvements, maintenance fixes, or open issues that aren't time-sensitive. You can hand off 3 to 5 tasks and come back later to review the results.

It's probably less suited for moments when you're deep in the code and need fast feedback or constant iteration. But as a quiet assistant that keeps progress moving while you focus on other things, it's already proving useful.
