Tests first. Always. A systematic approach to AI-assisted development that enforces test-driven development and maintains clean git history through delegation and autonomous execution.
There are 3 of us: You (the human), Droid Assistant (planner), and Droid Exec (code executor). Each reads specific documents to know what to do. This separation of concerns is what makes TDD enforcement possible.
This workflow ensures tests are written first, implementation follows tests, and git history stays clean and reviewable. Here's how it works in practice:
Start by talking to Droid Assistant about what you want to build. Be conversational: it's designed to ask clarifying questions and understand context.
Droid Assistant creates a detailed task spec in tasks/ and commits it
before any code is written. This ensures clean git history.
# Droid creates the spec
vim tasks/add-rate-limiting.md
# Commits the spec FIRST (separate from implementation)
git add tasks/add-rate-limiting.md
git commit -m "spec: add rate limiting task"
Droid Assistant never writes code directly. It delegates to Droid Exec, which is configured to always write tests first.
# Droid Assistant runs this command:
droid exec --auto medium -f tasks/add-rate-limiting.md
# Droid Exec starts executing the spec...
This is the key to TDD enforcement. Droid Exec's guidelines in AGENTS.md mandate test-first development. Tests are written, they fail (no implementation yet), then code is written to make them pass.
// Droid Exec creates tests first
// server/tests/rateLimit.test.ts
describe('Rate Limiting', () => {
  test('allows requests within limit', async () => {
    // Test implementation...
  });
  test('blocks requests exceeding limit', async () => {
    // Test implementation...
  });
});
// Runs tests: npm run check
// Expected: Tests FAIL (no implementation yet)

Only after tests are written and failing does Droid Exec write the actual implementation. Tests guide the code structure.
// Now Droid Exec implements to make tests pass
// server/middleware/rateLimit.ts
export const rateLimit = (options) => {
  // Implementation that satisfies the tests...
};
// Runs tests again: npm run check
// Expected: Tests PASS (implementation correct)

Droid Assistant reviews what Droid Exec produced, summarizes the changes, and waits for your feedback. If revisions are needed, the spec is updated and Droid Exec runs again.
# Droid Assistant reviews
git log --oneline -3
cat server/middleware/rateLimit.ts
npm run check  # Verify tests pass

# Summarizes for you
"Rate limiting implemented with tests:
- Added middleware: server/middleware/rateLimit.ts
- Tests: server/tests/rateLimit.test.ts (4 tests, all passing)
- Applied to routes: /api/v1/*
- Tests confirm 100 req/min limit enforced

Ready for your review!"
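To make the red/green cycle above concrete, here is a minimal, self-contained sketch in TypeScript. It is not the project's actual middleware: the `createRateLimiter` name, the fixed-window strategy, and the injected clock are assumptions chosen so the two behaviours named in the test skeleton (allow within the limit, block above it) can be checked without a server or a test runner.

```typescript
// Hypothetical fixed-window limiter; names and strategy are illustrative assumptions.
type Clock = () => number; // returns the current time in milliseconds

interface RateLimiterOptions {
  limit: number;    // maximum requests allowed per window
  windowMs: number; // window length in milliseconds
  now?: Clock;      // injectable clock so checks can control time
}

interface RateLimiter {
  allow(key: string): boolean;
}

function createRateLimiter(options: RateLimiterOptions): RateLimiter {
  const now: Clock = options.now ?? (() => Date.now());
  const windows = new Map<string, { start: number; count: number }>();

  return {
    allow(key: string): boolean {
      const t = now();
      let entry = windows.get(key);
      // Open a fresh window on first sight of a key or once the old window has expired.
      if (entry === undefined || t - entry.start >= options.windowMs) {
        entry = { start: t, count: 0 };
        windows.set(key, entry);
      }
      if (entry.count >= options.limit) {
        return false; // over the limit for this window
      }
      entry.count += 1;
      return true;
    },
  };
}

// The two behaviours from the test skeleton, written as plain boolean checks.
function allowsRequestsWithinLimit(): boolean {
  const limiter = createRateLimiter({ limit: 3, windowMs: 60000, now: () => 0 });
  return [1, 2, 3].every(() => limiter.allow("client-a"));
}

function blocksRequestsExceedingLimit(): boolean {
  let time = 0;
  const limiter = createRateLimiter({ limit: 3, windowMs: 60000, now: () => time });
  for (let i = 0; i < 3; i++) {
    limiter.allow("client-a");
  }
  const blocked = !limiter.allow("client-a"); // fourth request in the same window
  time = 60000;                               // move the clock into the next window
  const allowedAgain = limiter.allow("client-a");
  return blocked && allowedAgain;
}
```

In a test-first run, the two check functions would be written and executed before `createRateLimiter` has a working body, so they fail first and pass only once the implementation is filled in.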
The separation of roles (Human as manager, Droid as planner, Droid Exec as executor) creates a system where TDD is structurally enforced, not just encouraged.
Droid Exec's guidelines mandate tests first. It's not optional; it's baked into the workflow. Tests fail, then code makes them pass. Always.
Three-commit pattern: (1) Spec, (2) Implementation (tests + code), (3) Completion. Easy to review, easy to revert, easy to understand what changed and why.
Droid Exec's work is in separate commits with co-authorship attribution. You can see exactly what AI produced vs manual changes. No hidden modifications.
Feedback loop is instant. You test, spot issues, provide feedback. Droid updates the spec and re-delegates. Iterate until perfect.
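The feedback loop described above can be sketched as a small TypeScript function. This illustrates the control flow only, not how Droid is implemented: `runFeedbackLoop`, the `Spec` shape, and the injected `execute` and `review` callbacks are hypothetical stand-ins for Droid Exec and for your own review.

```typescript
// Hypothetical model of the iterate-until-satisfied loop; all names are assumptions.
interface Spec {
  task: string;
  revisions: string[]; // feedback folded into the spec so far
}

interface Review {
  approved: boolean;
  feedback?: string; // only meaningful when approved is false
}

interface LoopResult {
  spec: Spec;
  output: string;
  rounds: number;
  approved: boolean;
}

function runFeedbackLoop(
  initial: Spec,
  execute: (spec: Spec) => string,    // stands in for Droid Exec
  review: (output: string) => Review, // stands in for the human reviewer
  maxRounds: number
): LoopResult {
  let spec = initial;
  let output = "";
  for (let round = 1; round <= maxRounds; round++) {
    output = execute(spec);
    const verdict = review(output);
    if (verdict.approved) {
      return { spec, output, rounds: round, approved: true };
    }
    // Revise the spec instead of patching the output directly, then re-delegate.
    spec = { ...spec, revisions: [...spec.revisions, verdict.feedback ?? ""] };
  }
  return { spec, output, rounds: maxRounds, approved: false };
}

// Example: the reviewer asks for one change, then approves the second run.
const result = runFeedbackLoop(
  { task: "showcase website", revisions: [] },
  (spec) => `site built with ${spec.revisions.length} revision(s)`,
  (output) =>
    output.indexOf("1 revision") !== -1
      ? { approved: true }
      : { approved: false, feedback: "make the buttons equal width" },
  5
);
```

The `maxRounds` cap is a design choice of this sketch so the loop always terminates; the workflow itself simply repeats until you are satisfied.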
Tests written first, always. No exceptions. Droid Exec follows AGENTS.md guidelines that mandate TDD. Red (fail) → Green (pass) → Refactor.
Three commits per feature: (1) Spec, (2) Implementation, (3) Completion. Commits before delegation ensure clean separation. Easy to review, easy to revert.
You test → Provide feedback → Droid revises spec → Droid Exec re-executes → Repeat until satisfied. Continuous improvement through iteration.
Droid commits spec before delegating to Droid Exec. This ensures clean git history and separates planning from execution. Reviewable at every step.
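The commit-before-delegate rule lends itself to a mechanical check. The sketch below is a hypothetical TypeScript helper, not part of Droid: it assumes commit subjects use the `spec:` and `feat:` prefixes seen in the examples on this page, and it takes the subjects oldest-first as plain strings rather than calling git itself. The `feat: add rate limiting` subject is invented for the failing case.

```typescript
// Hypothetical check: every "feat:" commit needs an earlier, not-yet-used "spec:" commit.
// Assumes subjects are listed oldest-first and use the prefixes shown on this page.
interface OrderCheck {
  ok: boolean;
  problems: string[];
}

function checkSpecBeforeImplementation(subjects: string[]): OrderCheck {
  const problems: string[] = [];
  let pendingSpecs = 0; // specs committed but not yet implemented

  for (const subject of subjects) {
    if (subject.indexOf("spec:") === 0) {
      pendingSpecs += 1;
    } else if (subject.indexOf("feat:") === 0) {
      if (pendingSpecs === 0) {
        problems.push(`implementation without a prior spec: "${subject}"`);
      } else {
        pendingSpecs -= 1;
      }
    }
    // Any other commit type is ignored by this sketch.
  }
  return { ok: problems.length === 0, problems };
}

// Spec first, then implementation: passes.
const cleanHistory = checkSpecBeforeImplementation([
  "spec: add showcase website task",
  "feat: add showcase website (via droid exec)",
]);

// Implementation committed before its spec: flagged.
const outOfOrder = checkSpecBeforeImplementation([
  "feat: add rate limiting",
  "spec: add rate limiting task",
]);
```

The input list could come from `git log --reverse --format=%s`, which prints commit subjects oldest-first.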
Two key files enable this pattern: AGENTS.md (tells Droid how to work) and task specs (tell Droid Exec what to build). Here's how to set them up:
This file tells Droid Assistant its role (planner who delegates) and tells Droid Exec its role (executor who writes tests first). Critical for the pattern to work.
# Create project structure
mkdir tasks
touch AGENTS.md PROJECT_STATUS.md

# AGENTS.md must document:
# - 3 entities (Human, Droid, Droid Exec)
# - Delegation rules (>20 lines = delegate)
# - TDD requirement (tests first, always)
# - Git workflow (commit before delegate)
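One way to keep AGENTS.md honest is to check it against the four items listed above. The following TypeScript sketch is an assumption-heavy illustration: `missingTopics`, `shouldDelegate`, and the keyword patterns are hypothetical, and a keyword match only shows that a topic is mentioned, not that it is documented well.

```typescript
// Hypothetical checklist for AGENTS.md; the patterns are illustrative guesses, not a standard.
const REQUIRED_TOPICS: { name: string; pattern: RegExp }[] = [
  { name: "3 entities", pattern: /droid exec/i },
  { name: "delegation rules", pattern: /delegat/i },
  { name: "TDD requirement", pattern: /tests? first|test-driven|\bTDD\b/i },
  { name: "git workflow", pattern: /commit/i },
];

// Returns the names of required topics that the file never mentions.
function missingTopics(agentsMd: string): string[] {
  return REQUIRED_TOPICS
    .filter((topic) => !topic.pattern.test(agentsMd))
    .map((topic) => topic.name);
}

// The delegation rule from the list above: more than 20 lines means delegate.
const DELEGATION_THRESHOLD_LINES = 20;

function shouldDelegate(estimatedLines: number): boolean {
  return estimatedLines > DELEGATION_THRESHOLD_LINES;
}

// A draft that forgets to describe the git workflow.
const draft = [
  "# AGENTS.md",
  "There are 3 entities: Human, Droid, Droid Exec.",
  "Delegate any change over 20 lines to Droid Exec.",
  "Tests first, always.",
].join("\n");

const gaps = missingTopics(draft); // ["git workflow"]
```

A check like this catches an omitted section early, but the conversational verification step remains the real test of whether Droid understood the file.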
Need a template? Check the AGENTS.md template on GitHub.
Once AGENTS.md is set up, test that Droid Assistant understands the workflow. Start a droid session and ask this question:
droid
# Then ask:
> "we are 3 of us here right? who are they and what are their roles?"
Expected response: Droid should explain the 3 entities (You, Droid, Droid Exec) and their roles. If it doesn't understand, AGENTS.md needs refinement.
Now you're ready! Just talk to Droid about what you want to build. It will handle spec creation, delegation, and TDD enforcement.
# Example conversation:
You: "Add user authentication with JWT tokens"

Droid: "I'll create a spec. Should I include:
- Login/logout endpoints
- Token refresh mechanism
- Password hashing with bcrypt
- Tests for all auth flows?"

You: "Yes, all of that"

Droid:
1. Creates tasks/add-auth.md
2. Commits spec first
3. Runs: droid exec --auto medium -f tasks/add-auth.md
4. Droid Exec writes tests first
5. Droid Exec implements to pass tests
6. Droid reviews & reports back

You: "Perfect! Now let's add password reset..."
This website was built using the TDD workflow pattern. This is not a theoretical example; it is actual work completed through delegation and test-first development.
━━━ Initial Conversation ━━━
Human: I want to showcase my droid workflow on a website using Cloudflare Pages
Droid: I'll create a spec for this. Let me verify wrangler is installed first.
$ wrangler --version
# ✓ wrangler 4.42.2

━━━ Step 1: Droid Creates Spec ━━━
Droid: Creating task spec...
$ vim tasks/showcase_website.md
# Part of tasks/showcase_website.md:
# Create single-page HTML site to showcase human, droid, and droid exec collaboration
# - Factory.ai design theme (black bg, orange #ff6b35)
# - Sections: Workflow diagram, Getting Started, Benefits
# - Deploy to Cloudflare Pages via wrangler
#
# Success criteria:
# - Site deploys successfully to Cloudflare Pages
# - All sections render correctly
# - Responsive design works on mobile

━━━ Step 2: Droid Commits Spec First ━━━
Droid: Committing spec before delegation...
$ git add tasks/showcase_website.md
$ git commit -m "spec: add showcase website task"

━━━ Step 3: Droid Delegates to Droid Exec ━━━
Droid: Delegating to Droid Exec...
$ droid exec --auto medium -f tasks/showcase_website.md
# Droid Exec: Creates site/index.html → Builds HTML/CSS → Finishes

━━━ Step 4: Droid Reviews Output ━━━
Droid: Reviewing what Droid Exec created...
$ cat site/index.html
# ✓ HTML looks good - Factory.ai theme, all sections present

━━━ Step 5: Droid Tests Deployment ━━━
Droid: Testing deployment...
$ wrangler pages deploy site/ --project-name droid-showcase
# ✨ Deployment complete! https://abc123.droid-showcase.pages.dev
Droid: Verifying site is live...
$ curl -I https://abc123.droid-showcase.pages.dev
# HTTP/2 200 OK ✓

━━━ Step 6: Droid Commits Work ━━━
Droid: Everything looks good. Committing...
$ git add site/
$ git commit -m "feat: add showcase website (via droid exec)"

━━━ Step 7: Human Tests & Iteration ━━━
Human: The site looks great! Can you make the buttons equal width?
Droid: Sure! I'll update the spec and re-delegate to Droid Exec...
Get started with Factory.ai and experience test-driven development that's structurally enforced, not just encouraged.