Building a Terminal SWE Agent from Scratch: A 10-Iteration Guide

2026-05-14 · MDX POC · en

There’s a category of software that feels magical until you understand how it works, and then it feels inevitable. AI coding agents are like that. You watch Claude Code or similar tools read files, run commands, and edit code based on a natural language description — and it looks like sorcery.

It’s not. Under the hood, it’s a loop: call an LLM, check if it wants to use tools, execute those tools, feed the results back, and repeat until the model is satisfied. The complexity is in the details — streaming, tool definitions, permissions, state management, and the terminal UI that stitches it all together.

This guide walks through building a terminal-based SWE agent in 10 iterations. You’ll go from bun init to a working agent with streaming AI responses, tool execution, slash commands, permissions, and state management. The target audience is React engineers who haven’t built AI agents before.

The full implementation runs about 15,000-20,000 lines of TypeScript and takes roughly 2-3 months of full-time work.

The project is called my-swe-agent. It runs in the terminal using React 19 + Ink 5 for the UI, connects to LLMs through the Anthropic Messages API (or a proxy), and executes tools like file reads, shell commands, and code search in a feedback loop with the model.

The architecture follows a simple loop:

User input → API call → Stream response → Tool execution → Feed back → Repeat

Prerequisites

You should be comfortable with React (hooks, function components), TypeScript (generics, utility types), and async programming (Promise, async/await, for await...of). You don’t need prior experience with Ink, the Anthropic API, CLI design patterns, or AI agent architectures; we’ll cover those along the way.

Iteration 1: Minimal REPL Shell

The first step is getting characters on the screen. We’ll build a minimal Ink app that captures keyboard input and displays it.

#!/usr/bin/env bun
import React, { useState } from "react";
import { render, Text, Box, useInput, useApp } from "ink";

function REPL() {
  const { exit } = useApp();
  const [input, setInput] = useState("");
  const [history, setHistory] = useState<string[]>([]);

  useInput((_input, key) => {
    if (key.return && input.trim()) {
      setHistory((h) => [...h, `> ${input}`, `[Echo] ${input}`]);
      setInput("");
    } else if (key.escape) {
      exit();
    } else if (key.backspace || key.delete) {
      setInput((i) => i.slice(0, -1));
    } else if (_input) {
      setInput((i) => i + _input);
    }
  });

  return (
    <Box flexDirection="column" height="100%">
      <Box flexDirection="column" flexGrow={1}>
        {history.map((line, i) => (
          <Text key={i}>{line}</Text>
        ))}
      </Box>
      <Box>
        <Text>&gt; {input}</Text>
      </Box>
    </Box>
  );
}

render(<REPL />);

The Ink framework is React for the terminal. <Text> is like a <span>, <Box> is like a <div> with flexbox. useInput captures keyboard events, and useApp().exit() shuts down the application.

Iteration 2: Streaming API Integration

Next, we connect to an LLM. The guide uses a local proxy (cc-switcher) that bridges the Anthropic Messages API format to various backends:

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: "sk-proxy-managed",
  baseURL: process.env.ANTHROPIC_BASE_URL || "http://127.0.0.1:5001/proxy/opencode-go",
});

export async function* queryModel(params: {
  system: string;
  messages: Array<{ role: "user" | "assistant"; content: string }>;
}): AsyncGenerator<string> {
  const stream = await client.messages.create({
    model: process.env.SWE_MODEL || "deepseek-v4-flash",
    max_tokens: 8192,
    system: params.system,
    messages: params.messages,
    stream: true,
  });

  for await (const event of stream) {
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      yield event.delta.text;
    }
  }
}

The REPL is updated to handle streaming responses, so text appears incrementally, one delta chunk at a time, as the model responds:

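const handleSubmit = async () => {
  // Build the new user turn from the current input (state from Iteration 1), then clear the prompt.
  const userMsg = { role: "user" as const, content: input };
  setInput("");
  setMessages((m) => [...m, userMsg]);
  let fullResponse = "";
  setStreamingText("...");

  const stream = queryModel({
    system: "You are a helpful coding assistant.",
    // `messages` is the state captured when this handler ran, so append userMsg explicitly.
    messages: [...messages, userMsg],
  });

  // Re-render with the accumulated text after every chunk.
  for await (const chunk of stream) {
    fullResponse += chunk;
    setStreamingText(fullResponse);
  }

  setMessages((m) => [...m, { role: "assistant", content: fullResponse }]);
  setStreamingText("");
};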

Key insight: conversation context is managed manually. Each query sends the full message history so the model knows what’s been said. This means the message array grows with every turn.

Iteration 3: BuildTool Factory and Basic Tools

Tools are how the LLM interacts with the outside world. Every tool follows the same factory pattern:

export function buildTool<D extends ToolDef>(def: D): Tool<D> {
  return {
    ...TOOL_DEFAULTS,
    ...def,
  } as Tool<D>;
}

const TOOL_DEFAULTS = {
  description: async () => "",
  prompt: async () => "",
  isReadOnly: () => false,
  isConcurrencySafe: () => false,
  isEnabled: () => true,
};

This means adding a new tool requires implementing only the parts that differ — name, input schema, execution logic, and optionally permission checks and concurrency flags. Everything else has sensible defaults.
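
To make the effect of the defaults concrete, here is a minimal, dependency-free sketch of the same spread-based merge. The simplified `ToolDef` and `Tool` types and the `EchoTool` are assumptions made for illustration; the project's real definitions also carry a Zod input schema and a `prompt()` member.

```typescript
// Simplified stand-ins for the project's types (illustrative assumptions).
interface ToolDef<I, O> {
  name: string;
  call(input: I): Promise<{ data: O }>;
  description?: () => Promise<string>;
  isReadOnly?: () => boolean;
  isConcurrencySafe?: () => boolean;
  isEnabled?: () => boolean;
}

// A built tool has every optional member filled in.
type Tool<I, O> = Required<ToolDef<I, O>>;

// Conservative defaults: a tool is assumed to mutate state and to be unsafe
// to run in parallel unless its definition says otherwise.
const TOOL_DEFAULTS = {
  description: async () => "",
  isReadOnly: () => false,
  isConcurrencySafe: () => false,
  isEnabled: () => true,
};

function buildTool<I, O>(def: ToolDef<I, O>): Tool<I, O> {
  // Later spreads win, so anything the definition provides overrides a default.
  return { ...TOOL_DEFAULTS, ...def };
}

// Hypothetical tool that overrides a single default.
const EchoTool = buildTool({
  name: "Echo",
  isReadOnly: () => true,
  async call(input: { text: string }) {
    return { data: input.text };
  },
});
```

Here `EchoTool.isReadOnly()` comes from the definition, while `isConcurrencySafe()` and `isEnabled()` fall through to the defaults.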

BashTool

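import { execa } from "execa";
import { z } from "zod";

export const BashTool = buildTool({
  name: "Bash",
  inputSchema: z.object({
    command: z.string().describe("The shell command to execute"),
    description: z.string().optional(),
    timeout: z.number().optional(),
  }),
  isConcurrencySafe: () => false,
  async call({ command, timeout }) {
    // reject: false returns failures as a result object instead of throwing.
    const result = await execa("bash", ["-c", command], {
      timeout: timeout ?? 120_000,
      reject: false,
    });
    return {
      data: {
        stdout: result.stdout ?? "",
        stderr: result.stderr ?? "",
        // exitCode is undefined when the process timed out or was killed by a signal;
        // report that as a failure rather than as a clean exit.
        exitCode: result.exitCode ?? 1,
      },
    };
  },
});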

FileReadTool

import { readFile } from "node:fs/promises";
import { z } from "zod";

export const FileReadTool = buildTool({
  name: "Read",
  inputSchema: z.object({
    file_path: z.string().describe("Absolute path to the file"),
    offset: z.number().optional(),
    limit: z.number().optional(),
  }),
  isReadOnly: () => true,
  isConcurrencySafe: () => true,
  async call({ file_path, offset, limit }) {
    const content = await readFile(file_path, "utf-8");
    const lines = content.split("\n");
    // offset is a zero-based line index; limit is a maximum number of lines.
    const start = offset ?? 0;
    // Compare against undefined so that an explicit limit of 0 is respected.
    const end = limit !== undefined ? start + limit : lines.length;
    return { data: { content: lines.slice(start, end).join("\n"), totalLines: lines.length } };
  },
});

The tool registry collects all tools into a single list:

export function getAllTools(): Tool[] {
  return [BashTool, FileReadTool, FileWriteTool, GlobTool, GrepTool];
}

Iteration 4: Tool Execution Loop

This is the heart of the agent. The queryLoop function is a while(true) loop that orchestrates the LLM-tool feedback cycle:

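export async function* queryLoop(params: QueryParams): AsyncGenerator<SDKMessage | StreamEvent> {
  // Assumes QueryParams carries these four fields.
  const { model, system, messages, tools } = params;
  while (true) {
    // Phase 1: API call
    const toolUseBlocks: ToolUseBlock[] = [];
    // Tool calls that are still streaming, keyed by content block index.
    const pending = new Map<number, { id: string; name: string; json: string }>();
    let currentText = "";

    const stream = await anthropic.messages.create({
      model,
      max_tokens: 8192, // required by the Messages API
      system,
      messages,
      tools: tools.map(formatTool),
      stream: true,
    });

    // Phase 2: Process stream events
    for await (const event of stream) {
      if (event.type === "content_block_start" && event.content_block.type === "tool_use") {
        // input is empty at this point; the arguments arrive later as input_json_delta chunks.
        pending.set(event.index, { id: event.content_block.id, name: event.content_block.name, json: "" });
      }
      if (event.type === "content_block_delta" && event.delta.type === "input_json_delta") {
        const call = pending.get(event.index);
        if (call) call.json += event.delta.partial_json;
      }
      if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
        currentText += event.delta.text;
        yield { type: "text", text: event.delta.text };
      }
      if (event.type === "content_block_stop" && pending.has(event.index)) {
        // The block is complete, so the accumulated JSON can be parsed.
        const call = pending.get(event.index)!;
        pending.delete(event.index);
        toolUseBlocks.push({
          id: call.id,
          name: call.name,
          input: JSON.parse(call.json || "{}") as Record<string, unknown>,
        });
      }
    }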

    // Phase 3: No tool calls → done
    if (toolUseBlocks.length === 0) {
      messages.push({ role: "assistant", content: currentText });
      return;
    }

    // Phase 4: Execute tools
    // ... build assistant message with tool_use blocks ...
    // ... execute each tool, yield results ...
    // ... push tool results as new user message ...

    // Phase 5: Feed results back → loop continues
    messages.push({ role: "user", content: toolResults });
    // Back to while(true)
  }
}

The critical design insight: tool results are pushed as a user message with tool_result content blocks. This makes the model “see” what happened when it ran the tool. If it’s satisfied, it responds with text. If it wants to do more, it issues new tool calls.
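
Phases 4 and 5 are elided in the listing above. As a rough sketch of the two messages involved, the block shapes below follow the Messages API (`tool_use` carries `id`, `name`, `input`; `tool_result` carries `tool_use_id`, `content`, and an optional `is_error`). The helper names and the `ToolOutcome` type are hypothetical, not taken from the project.

```typescript
// Local copies of the content-block shapes used by the Messages API.
type TextBlock = { type: "text"; text: string };
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: Record<string, unknown> };
type ToolResultBlock = { type: "tool_result"; tool_use_id: string; content: string; is_error?: boolean };
type Message =
  | { role: "assistant"; content: Array<TextBlock | ToolUseBlock> }
  | { role: "user"; content: string | ToolResultBlock[] };

// Hypothetical result shape returned by whatever executes a tool.
type ToolOutcome = { toolUseId: string; output: string; isError: boolean };

// The assistant turn replays the text and every tool_use block the model produced.
function buildAssistantMessage(text: string, toolUses: ToolUseBlock[]): Message {
  const content: Array<TextBlock | ToolUseBlock> = [];
  if (text) content.push({ type: "text", text });
  content.push(...toolUses);
  return { role: "assistant", content };
}

// Every tool_use id gets exactly one tool_result in the user turn that follows.
function buildToolResultMessage(toolUses: ToolUseBlock[], outcomes: ToolOutcome[]): Message {
  const byId = new Map(outcomes.map((o) => [o.toolUseId, o]));
  const content = toolUses.map((tu): ToolResultBlock => {
    const outcome = byId.get(tu.id);
    if (!outcome) {
      // A missing result is reported as an error instead of being dropped.
      return { type: "tool_result", tool_use_id: tu.id, content: "Tool did not run", is_error: true };
    }
    return { type: "tool_result", tool_use_id: tu.id, content: outcome.output, is_error: outcome.isError };
  });
  return { role: "user", content };
}
```

Pairing by `tool_use_id` rather than by array position means the results can be produced in any order, which matters once some tools run in parallel.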

Parallel Tool Execution

For tools marked isConcurrencySafe, the engine can run them in parallel:

async function executeToolsConcurrently(toolCalls: ToolUseBlock[]) {
  const [safe, unsafe] = partition(toolCalls, (tb) =>
    findTool(tb.name)?.isConcurrencySafe() ?? false,
  );
  const safeResults = await Promise.all(safe.map(executeSingleTool));
  const unsafeResults = [];
  for (const tb of unsafe) unsafeResults.push(await executeSingleTool(tb));
  return [...safeResults, ...unsafeResults];
}
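
One caveat with the partition approach: every concurrency-safe call runs before any unsafe one, so calls can execute in a different order from the one the model requested (for example, a Read that was meant to follow a Write). A possible alternative, sketched here with simplified types, runs consecutive safe calls as a parallel batch and treats each unsafe call as a serial barrier. `isSafe` and `run` are hypothetical stand-ins for the `isConcurrencySafe()` lookup and `executeSingleTool`.

```typescript
type ToolCall = { id: string; name: string };

async function executeInOrder<R>(
  calls: ToolCall[],
  isSafe: (call: ToolCall) => boolean,
  run: (call: ToolCall) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  let i = 0;
  while (i < calls.length) {
    if (!isSafe(calls[i])) {
      // Unsafe call: run it alone, so it acts as a barrier.
      results.push(await run(calls[i]));
      i += 1;
      continue;
    }
    // Find the end of the run of consecutive safe calls starting at i.
    let j = i;
    while (j < calls.length && isSafe(calls[j])) j += 1;
    // Promise.all preserves input order in its result array.
    results.push(...(await Promise.all(calls.slice(i, j).map(run))));
    i = j;
  }
  return results;
}
```

The trade-off is less parallelism when safe and unsafe calls alternate, in exchange for results and side effects that follow the requested order.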

Iteration 5: Input Processing and Command System

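Not everything goes to the LLM. Slash commands (/clear, /help, /review) are intercepted locally before anything is sent to the model. Some run entirely in-process, while others expand into a prompt that is then sent to the LLM: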

type Command =
  | PromptCommand & { type: "prompt"; getPromptForCommand(args: string): ContentBlockParam }
  | LocalCommand & { type: "local"; call(args: string): Promise<string | undefined> }
  | LocalJSXCommand & { type: "local-jsx"; render(onDone: () => void): React.ReactNode };

  • PromptCommand: Generates a prompt, sends to LLM (e.g., /commit generates a commit message and asks the model to create the commit)
  • LocalCommand: Runs in-process, returns text (e.g., /clear, /version)
  • LocalJSXCommand: Returns a React/Ink component for interactive UI (e.g., /config renders a settings editor)
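
The routing decision itself can be sketched without any UI. The version below is an illustration under simplifying assumptions: it leaves out the local-jsx variant, has `getPromptForCommand` return a plain string rather than a content block, and uses hypothetical names (`parseSlashCommand`, `routeInput`, `Routed`).

```typescript
type PromptCommand = { type: "prompt"; name: string; getPromptForCommand(args: string): string };
type LocalCommand = { type: "local"; name: string; call(args: string): Promise<string | undefined> };
type Command = PromptCommand | LocalCommand;

type Routed =
  | { kind: "llm"; prompt: string }
  | { kind: "local"; output: string | undefined }
  | { kind: "unknown"; name: string };

// Split "/name rest of line" into a command name and its argument string.
// Returns null for anything that does not look like a slash command.
function parseSlashCommand(input: string): { name: string; args: string } | null {
  const match = /^\/([\w-]+)(?:\s+([\s\S]*))?$/.exec(input.trim());
  if (!match) return null;
  return { name: match[1], args: (match[2] ?? "").trim() };
}

async function routeInput(input: string, commands: Command[]): Promise<Routed> {
  const parsed = parseSlashCommand(input);
  // Ordinary text goes to the model unchanged.
  if (!parsed) return { kind: "llm", prompt: input };
  const cmd = commands.find((c) => c.name === parsed.name);
  if (!cmd) return { kind: "unknown", name: parsed.name };
  // Prompt commands expand into text for the model; local commands run in-process.
  if (cmd.type === "prompt") return { kind: "llm", prompt: cmd.getPromptForCommand(parsed.args) };
  return { kind: "local", output: await cmd.call(parsed.args) };
}
```

Returning a tagged result instead of performing the side effect keeps the router easy to test; the REPL decides what to do with each `kind`.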

Iteration 6: System Prompt Construction

The system prompt is dynamically assembled from multiple sources — tool descriptions, environment info, project memory (CLAUDE.md), and behavior rules. This isn’t a static string; it’s rebuilt on each turn to reflect the current context.

export function buildSystemPrompt(context: PromptContext): string {
  return [
    getIntroduction(),
    getBehaviorRules(),
    getTaskGuidelines(),
    getToolDescriptions(context.tools),
    getEnvironmentInfo(),
    getMemoryContent(context.cwd),
  ].join("\n\n");
}

Each tool contributes its own prompt() output, which is a natural-language description of what the tool does and when to use it. This is also sent as the description field in the API’s tool definition, so the model sees it both in the system prompt and as part of the function-calling schema.

Iteration 7: Permission System

Every tool invocation goes through a permission check chain. The system supports multiple modes:

Mode               Behavior
default            Prompt the user for each destructive operation
acceptEdits        Auto-allow file edits in the current directory
bypassPermissions  Auto-approve everything
dontAsk            Treat all ASK results as DENY
plan               Pause all tool execution

Permission rules use glob-like patterns: Bash(git *) matches git commands, FileEdit(/src/*) matches edits in the src/ directory. Rules can be configured at multiple levels — global (~/.claude/settings.json), project (.claude/settings.json), or per-session.

function hasPermissionsToUseTool(tool, input, context): PermissionResult {
  // 1. Global deny rules → immediate DENY
  // 2. Tool-specific permissions → tool.checkPermissions()
  // 3. Mode check → bypass/plan/etc
  // 4. Always-allow rules → ALLOW
  // 5. Default → ASK (prompt user)
}
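
The check chain above is only outlined in comments. A minimal sketch of rule matching plus the ordering might look like the following. It makes several assumptions: rules are written as `Tool(pattern)` with `*` as the only wildcard, each invocation is reduced to a single target string (the command or file path), and the tool-specific check (step 2) and the acceptEdits mode are left out. All names are hypothetical.

```typescript
type Decision = "allow" | "deny" | "ask";
type Mode = "default" | "bypassPermissions" | "dontAsk" | "plan";
type Rule = { tool: string; pattern: string };

// Parse "Bash(git *)" into { tool: "Bash", pattern: "git *" }.
// A bare tool name such as "Bash" matches any input for that tool.
function parseRule(rule: string): Rule {
  const m = /^(\w+)(?:\((.*)\))?$/.exec(rule.trim());
  if (!m) throw new Error(`Invalid permission rule: ${rule}`);
  return { tool: m[1], pattern: m[2] ?? "*" };
}

// Translate the glob-like pattern into an anchored RegExp; only "*" is special.
function matchesRule(rule: Rule, tool: string, target: string): boolean {
  if (rule.tool !== tool) return false;
  const escaped = rule.pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`, "s").test(target);
}

function checkPermission(
  tool: string,
  target: string,
  mode: Mode,
  deny: string[],
  allow: string[],
): Decision {
  const hit = (rules: string[]) => rules.some((r) => matchesRule(parseRule(r), tool, target));
  if (hit(deny)) return "deny"; // 1. deny rules always win
  if (mode === "plan") return "deny"; // 3. plan mode pauses tool execution
  if (mode === "bypassPermissions") return "allow";
  if (hit(allow)) return "allow"; // 4. always-allow rules
  return mode === "dontAsk" ? "deny" : "ask"; // 5. default: ask, unless asking is disabled
}
```

Note that prefix matching on a shell string is not a security boundary by itself: a command such as `git status && rm -rf /` also matches `Bash(git *)`, so a real implementation would likely need to parse the command before matching.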

Iteration 8: State Management and UI Enhancement

State management uses a minimal custom Store (no Redux, no Zustand — just a simple pub/sub with useSyncExternalStore):

type Listener = () => void;

export function createStore<T>(initialState: T) {
  let state = initialState;
  const listeners = new Set<Listener>();
  return {
    getState: () => state,
    setState: (updater: (prev: T) => T) => {
      const next = updater(state);
      if (Object.is(next, state)) return;
      state = next;
      listeners.forEach(l => l());
    },
    subscribe: (listener: Listener) => {
      listeners.add(listener);
      return () => listeners.delete(listener);
    },
  };
}

The UI components are familiar to any React developer: MessageBubble for user/assistant messages, ToolCallDisplay for showing tool invocations, Spinner for loading states, StatusBar for the bottom status line showing provider/model info. All connect to the store through the useAppState selector hook.
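
The useAppState hook is not shown here, but the idea behind a selector can be sketched without React: a subscriber is notified only when the slice it selected actually changes. The `subscribeSelected` helper below is hypothetical; in React, a hook could presumably get a similar effect by passing `() => selector(store.getState())` as the snapshot function to useSyncExternalStore, provided the selector returns stable references.

```typescript
type Listener = () => void;

function createStore<T>(initialState: T) {
  let state = initialState;
  const listeners = new Set<Listener>();
  return {
    getState: () => state,
    setState: (updater: (prev: T) => T) => {
      const next = updater(state);
      if (Object.is(next, state)) return; // returning the same object is a no-op
      state = next;
      listeners.forEach((l) => l());
    },
    subscribe: (listener: Listener) => {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

// Hypothetical helper: call onChange only when the selected slice changes.
function subscribeSelected<T, S>(
  store: { getState(): T; subscribe(l: Listener): () => void },
  selector: (state: T) => S,
  onChange: (slice: S) => void,
): () => void {
  let last = selector(store.getState());
  return store.subscribe(() => {
    const next = selector(store.getState());
    if (Object.is(next, last)) return; // unrelated part of the state changed
    last = next;
    onChange(next);
  });
}
```

With this shape, a status bar that selects only the model name is not notified when, say, a loading flag toggles.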

Iteration 9: File Editing and Project Tools

The FileEditTool implements precise string replacement — the most common edit operation in code assistants:

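async call({ file_path, old_string, new_string }) {
  if (old_string === "") return { isError: true, data: { message: `old_string must not be empty` } };
  const content = await readFile(file_path, "utf-8");
  const count = content.split(old_string).length - 1;
  if (count === 0)  return { isError: true, data: { message: `String not found in ${file_path}. Read the file again and copy the text exactly.` } };
  if (count > 1)    return { isError: true, data: { message: `Found ${count} occurrences. Include more surrounding context so the match is unique.` } };
  // A replacer function inserts new_string literally; a plain string would expand "$&", "$1", and "$$".
  await writeFile(file_path, content.replace(old_string, () => new_string), "utf-8");
  return { data: { message: `Applied edit to ${file_path}` } };
}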

The strict “match exactly once” requirement prevents accidental replacements. If the model gets it wrong (string not found, or matches multiple times), the error message guides it to read the file again and try with more context.

Iteration 10: Advanced Features

The final iteration adds:

  • Token counting and cost tracking — Logging input/output tokens and estimating cost per session
  • Context compression — When conversations get long, auto-summarize older turns using a cheaper model
  • Sub-agents — The ability to spawn independent agent instances for parallel tasks
  • MCP integration — Connecting external tool servers through the Model Context Protocol
  • Skill system — Loadable markdown files that inject specialized prompts
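
As one illustration of the context compression item, the sketch below replaces older turns with a summary once an estimated token budget is exceeded. Everything here is an assumption made for the example: the 4-characters-per-token estimate is a crude heuristic, `summarize` stands in for a call to a cheaper model, and a real implementation would also have to avoid cutting between a tool call and its result.

```typescript
type Msg = { role: "user" | "assistant"; content: string };

// Very rough heuristic (assumption): about 4 characters per token.
const estimateTokens = (msgs: Msg[]): number =>
  Math.ceil(msgs.reduce((n, m) => n + m.content.length, 0) / 4);

async function compactIfNeeded(
  messages: Msg[],
  budgetTokens: number,
  keepRecent: number,
  summarize: (older: Msg[]) => Promise<string>, // stand-in for a cheaper model call
): Promise<Msg[]> {
  // Under budget, or nothing old enough to summarize: leave the history untouched.
  if (estimateTokens(messages) <= budgetTokens || messages.length <= keepRecent) return messages;
  const older = messages.slice(0, messages.length - keepRecent);
  const recent = messages.slice(messages.length - keepRecent);
  const summary = await summarize(older);
  // The summary takes the place of the older turns at the start of the history.
  return [{ role: "user", content: `Summary of earlier conversation:\n${summary}` }, ...recent];
}
```

Keeping the most recent turns verbatim preserves the details the model is most likely to need next, while the summary keeps earlier decisions within reach.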

Testing Strategy

The reference implementation for this project (Claude Code) has no test suite; it was leaked source that came without tests. This project should be different. Key testing areas:

  1. Unit tests for tools — Mock the filesystem, verify BashTool’s permission checks
  2. Agent loop tests — Mock the LLM to return known tool call sequences, verify the engine executes them correctly
  3. ProviderTransform tests — Verify message normalization for each provider’s quirks
  4. Permission system tests — Verify rule matching, priority ordering, and edge cases

// Example: BashTool security test
describe('BashTool', () => {
  it('blocks sudo commands', async () => {
    const result = await BashTool.checkPermissions(
      { command: 'sudo rm -rf /', description: '', timeout: 5000 },
      mockContext
    )
    expect(result.allowed).toBe(false)
  })
})
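
For the agent loop tests (item 2), one option is to inject the model as a function so a test can script its replies. The sketch below uses a deliberately simplified loop rather than the real queryLoop; `runAgent`, `scriptedModel`, the string transcript, and the `maxTurns` guard are all hypothetical and exist only to show the testing pattern.

```typescript
type ToolCall = { id: string; name: string; input: Record<string, unknown> };
type ModelTurn = { text: string; toolCalls: ToolCall[] };
type Model = (transcript: string[]) => Promise<ModelTurn>;
type Tools = Record<string, (input: Record<string, unknown>) => Promise<string>>;

// Simplified loop with an injected model, so a test can script its replies.
async function runAgent(prompt: string, model: Model, tools: Tools, maxTurns = 10): Promise<string[]> {
  const transcript = [`user: ${prompt}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await model(transcript);
    transcript.push(`assistant: ${reply.text}`);
    if (reply.toolCalls.length === 0) return transcript; // no tool calls: done
    for (const call of reply.toolCalls) {
      const tool = tools[call.name];
      const result = tool ? await tool(call.input) : `Unknown tool: ${call.name}`;
      transcript.push(`tool_result(${call.id}): ${result}`);
    }
  }
  // Guard against a model that never stops requesting tools.
  throw new Error(`Agent did not finish within ${maxTurns} turns`);
}

// Fake model that replays a fixed script, one entry per call.
function scriptedModel(script: ModelTurn[]): Model {
  let i = 0;
  return async () => {
    const next = script[i++];
    if (!next) throw new Error("Script exhausted");
    return next;
  };
}
```

A test then scripts one turn that requests a Read and a second turn that answers with text, and asserts on both the tool invocations and the resulting transcript.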

What I Learned Building This

The agent loop looks simple in pseudocode — call LLM → execute tools → repeat — but the implementation complexity sneaks in through the details. Streaming event handling, message format normalization, permission chains, state synchronization between the engine and the UI, and the sheer number of edge cases in tool execution all add up.

Three things stand out as the most important architectural decisions:

  1. The buildTool factory pattern pays compounding returns. Every new tool you add follows the same structure, and the defaults handle most of the boilerplate.

  2. The permission system needs to be designed early, not bolted on later. Every tool invocation goes through the same check chain, and retrofitting permissions onto existing tools is more work than building it in from the start.

  3. State management between the engine loop and the UI is the main source of bugs. The store pattern (pub/sub + useSyncExternalStore) handles this cleanly, but it needs to be designed carefully — the engine yields events, the UI subscribes to state changes, and keeping them in sync requires discipline.

The full source for this project is available in the my-swe-agent/ directory. Each iteration builds on the previous one, so you can follow along commit by commit.