Designing a Real-Time Preview System for Agent Development

Watching an AI agent write code is like watching someone work in a black box. You see the final output, but not the process. Did it try three approaches before settling on this one? Did it hit errors? Is it stuck?

For Colony, this was unacceptable. We’re building a system where multiple agents work simultaneously on different parts of a project. Users need to see what’s happening in real time, not just review git diffs after the fact.

This is how we built Colony’s preview system: a plugin-based architecture that shows web UIs, terminal output, and agent events — all updating live as agents work.

The Observability Problem

Traditional dev tools assume one developer in one terminal. You run npm run dev and the output appears in that window. You open localhost:3000 in a browser. Simple.

Multi-agent development breaks this. Five agents working on five services simultaneously? You can’t track them with five terminal windows. You need:

Aggregated logs across all agents and services
Per-service previews (not just one localhost:3000, but Agent A’s port 3000, Agent B’s port 3001, etc.)
Live terminal output from each agent’s shell
Streaming events showing what agents are doing

Colony’s preview system treats each service as a first-class preview target. Agent starts a web server? Colony automatically creates a preview tab for it. Agent opens a terminal? Colony streams the output to a dedicated terminal preview.

Service-Driven Preview Tabs

The key insight: Colony knows what services exist before agents start them. Every colony has a colony.toml configuration:

[colony]
name = "hello-colony"
description = "Demo project with web + API services"

[[service]]
name = "web"
port = 4001
start_command = "npm run dev"

[[service]]
name = "api"
port = 4002
start_command = "node server.js"

When you open a colony in Bloom (our web dashboard), Colony parses this and creates preview tabs automatically:

Web tab: iframe pointed at web-4001.colony.local
API tab: iframe pointed at api-4002.colony.local
Agent tab: streaming events from the agent executor
Terminal tab: live xterm.js terminal

You don’t configure tabs or remember which ports agents are using. The tabs appear as soon as the colony is created, ready to display output as soon as services start.
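As a rough illustration of that mapping, here is a minimal sketch that turns already-parsed colony.toml services into a tab list. The PreviewTab shape, the buildPreviewTabs name, and the http:// scheme are assumptions made for this example, not Bloom's actual types.

```typescript
// Shape of one [[service]] entry from colony.toml, after TOML parsing.
interface ServiceConfig {
  name: string;
  port: number;
  start_command: string;
}

// Hypothetical tab descriptor; Bloom's real type is not shown in this post.
interface PreviewTab {
  id: string;
  label: string;
  kind: "web" | "agent" | "terminal";
  url?: string;
}

function buildPreviewTabs(services: ServiceConfig[]): PreviewTab[] {
  // One iframe tab per declared service, addressed as {name}-{port}.colony.local.
  const serviceTabs: PreviewTab[] = services.map((s) => ({
    id: `service:${s.name}`,
    label: s.name,
    kind: "web",
    url: `http://${s.name}-${s.port}.colony.local`, // scheme is an assumption
  }));

  // The agent and terminal tabs are added for every colony.
  return [
    ...serviceTabs,
    { id: "agent", label: "Agent", kind: "agent" },
    { id: "terminal", label: "Terminal", kind: "terminal" },
  ];
}

const tabs = buildPreviewTabs([
  { name: "web", port: 4001, start_command: "npm run dev" },
  { name: "api", port: 4002, start_command: "node server.js" },
]);
console.log(tabs.map((t) => t.url ?? t.kind));
```

Because the list depends only on the configuration, it can be computed before any service has started, which is what lets the tabs appear immediately.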

Web Preview Architecture

The web preview is an iframe that displays a running service, but it involves several non-obvious design decisions.

Sandboxed Iframes

We use sandboxed iframes without allow-same-origin:

<iframe
  src={previewUrl}
  sandbox="allow-scripts allow-forms allow-modals"
  class="w-full h-full border-0"
/>

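Why no allow-same-origin? Because the iframe loads code written by AI agents, and we don’t fully trust it. Without allow-same-origin, the browser gives the iframe an opaque origin, so the previewed page:

Can’t read Bloom’s DOM, localStorage, or cookies
Can’t use localStorage or cookies of its own, since storage access is blocked for opaque origins
Can’t make same-origin requests to anything; every request it sends is cross-origin, with Origin: null
Can still call postMessage, but its messages arrive with origin “null”, so the parent has to validate them before acting on anything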

This prevents malicious or buggy agent code from interfering with Bloom. The tradeoff: some legitimate features (like OAuth flows relying on localStorage) won’t work in the preview. For those cases, users click “Open in New Tab” to see the service outside the sandbox.
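A sandboxed frame's opaque origin serializes as the string "null" in message events, so if the parent page listens for message events at all, a defensive check is cheap to add. The sketch below is an assumption about how such a guard could look, not Bloom's code; isFromPreview and iframeEl are hypothetical names.

```typescript
// Minimal structural type so the check can also run outside a browser.
interface IncomingMessage {
  origin: string;
  source: unknown;
  data: unknown;
}

// Accept a message only if it came from the preview iframe's own window
// and carries the opaque origin ("null") that a sandboxed frame reports.
function isFromPreview(event: IncomingMessage, previewWindow: unknown): boolean {
  return (
    previewWindow !== null &&
    previewWindow !== undefined &&
    event.source === previewWindow &&
    event.origin === "null"
  );
}

// Browser usage (iframeEl is a hypothetical reference to the preview iframe):
//
// window.addEventListener("message", (e) => {
//   if (!isFromPreview(e, iframeEl.contentWindow)) return;
//   // Treat e.data as untrusted input from agent-written code.
// });
```

Comparing event.source against the iframe's contentWindow matters more than the origin string, because any sandboxed frame on the page reports the same "null" origin.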

URL Pattern: Subdomain Routing

Each service gets a unique subdomain:

{service-name}-{port}.colony.local

For example:

web-4001.colony.local
api-4002.colony.local

This lets us route traffic using Caddy’s dynamic configuration. When Colony starts a service on port 4001, it registers a Caddy route that proxies web-4001.colony.local to the service’s network namespace on port 4001.

The benefit: no port conflicts. Ten colonies can all run services on internal port 3000 as long as the hostname also carries a colony identifier, for example web-3000-colonyA.colony.local, web-3000-colonyB.colony.local, etc.
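To make the routing concrete, here is a sketch of the hostname scheme and of a route object in Caddy's JSON config format (a host matcher plus a reverse_proxy handler). How Colony actually pushes routes to Caddy isn't covered here; RouteTarget, previewHostname, caddyRoute, and the upstream address are assumptions for illustration.

```typescript
// Hypothetical inputs. colonyId is optional to cover both hostname shapes
// shown above (with and without a colony suffix).
interface RouteTarget {
  service: string;
  port: number;
  colonyId?: string;
  upstreamHost: string; // address Caddy can reach, e.g. a namespace IP
}

function previewHostname(t: RouteTarget): string {
  const base = `${t.service}-${t.port}`;
  const label = t.colonyId ? `${base}-${t.colonyId}` : base;
  return `${label}.colony.local`;
}

// Builds one route in Caddy's JSON format: match on the Host header,
// then hand the request to the reverse_proxy handler.
function caddyRoute(t: RouteTarget) {
  return {
    match: [{ host: [previewHostname(t)] }],
    handle: [
      {
        handler: "reverse_proxy",
        upstreams: [{ dial: `${t.upstreamHost}:${t.port}` }],
      },
    ],
  };
}

console.log(
  JSON.stringify(
    caddyRoute({ service: "web", port: 4001, upstreamHost: "10.0.0.2" }),
    null,
    2,
  ),
);
```

A route like this could be appended at runtime through Caddy's admin API by POSTing it to the routes array of an HTTP server under /config/, which avoids reloading the whole configuration for each new service.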

The preview includes a floating toolbar:

Reload: Refreshes the iframe (useful when agents rebuild)
Open in Tab: Opens service in new browser tab (outside sandbox)
Copy URL: Copies the *.colony.local URL to clipboard

These are floating buttons overlaid on the iframe, positioned top-right. The toolbar is part of Bloom (not inside the iframe), so it’s always accessible even if the previewed service crashes.

Terminal Preview: xterm.js + Binary WebSocket

The terminal preview is more complex. We’re streaming live shell output from a PTY (pseudo-terminal) running inside the colony’s network namespace.

Architecture

PTY Session (Erlang): Each colony has an OTP actor (pty_session) that spawns a shell using Erlang’s open_port with PTY support.
Binary WebSocket: PTY output is binary (includes ANSI escape codes, control characters), so we use a binary WebSocket (not text/JSON).
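xterm.js v5: The frontend renders the terminal with xterm.js, configured with a Nerd Font so icon glyphs render. (Font ligatures would additionally need xterm.js’s ligatures addon, which the setup below doesn’t load.)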

Here’s the xterm.js setup:

import { Terminal } from '@xterm/xterm';
import { FitAddon } from '@xterm/addon-fit';
import { WebLinksAddon } from '@xterm/addon-web-links';

const terminal = new Terminal({
  fontFamily: "'FiraCode Nerd Font Mono', monospace",
  fontSize: 13,
  cursorBlink: true,
  theme: {
    background: '#1e1e1e',
    foreground: '#d4d4d4',
  },
});

const fitAddon = new FitAddon();
terminal.loadAddon(fitAddon);
terminal.loadAddon(new WebLinksAddon());

terminal.open(containerElement);
fitAddon.fit();

WebSocket Protocol

The protocol is simple:

Server → Client: Raw PTY output (binary frames)
Client → Server: User input (keypresses as binary)

When you type in the terminal, xterm.js captures the keypress and sends it over the WebSocket. The PTY session receives it, writes it to the shell, and the shell’s output comes back over the same WebSocket.

This creates a fully interactive terminal in the browser. You can run vim, htop, or any terminal UI, and it works like a native terminal.
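The two directions of that protocol can be sketched in a few lines. The TerminalLike and SocketLike interfaces below are narrowed stand-ins for the xterm.js Terminal and the browser WebSocket, declared locally so the example runs without either dependency; attachTerminal is a hypothetical helper name, and in real code a small adapter or cast may be needed to pass the actual objects.

```typescript
// Structural subsets of the xterm.js Terminal and browser WebSocket APIs.
interface TerminalLike {
  write(data: Uint8Array): void;
  onData(listener: (data: string) => void): unknown;
}

interface SocketLike {
  binaryType: string;
  onmessage: ((ev: { data: unknown }) => void) | null;
  send(data: Uint8Array): void;
}

function attachTerminal(terminal: TerminalLike, socket: SocketLike): void {
  // Ask for ArrayBuffer frames instead of Blobs so they can be written synchronously.
  socket.binaryType = "arraybuffer";

  // Server -> client: raw PTY bytes go straight into the terminal, which
  // interprets ANSI escape sequences itself.
  socket.onmessage = (ev) => {
    if (ev.data instanceof ArrayBuffer) {
      terminal.write(new Uint8Array(ev.data));
    }
  };

  // Client -> server: xterm.js reports input as a string, so encode it
  // as UTF-8 bytes before sending a binary frame.
  const encoder = new TextEncoder();
  terminal.onData((input) => socket.send(encoder.encode(input)));
}
```

Nothing here interprets the bytes: the browser side is a pipe between the socket and the terminal emulator, which is why full-screen programs work without special handling.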

Nerd Font Rendering

We use FiraCode Nerd Font Mono to support icons. Many CLI tools (exa, starship, powerlevel10k) emit Nerd Font icons. Without proper font support, these appear as boxes or question marks.

By bundling the font with Bloom and configuring xterm.js to use it, icons render correctly. This makes the terminal preview feel native, not degraded.

Agent Panel: Streaming Events

The agent panel shows what the AI agent is doing in real time. Unlike logs (raw command output), agent events are structured updates about state:

{
  "type": "status",
  "status": "running",
  "message": "Installing dependencies..."
}

{
  "type": "command",
  "command": "npm install",
  "started_at": "2026-01-08T14:32:10Z"
}

{
  "type": "output",
  "stream": "stdout",
  "data": "added 342 packages in 8.2s"
}

{
  "type": "status",
  "status": "idle",
  "message": "Waiting for next task"
}

These stream over a dedicated WebSocket. The frontend (SolidJS) subscribes and updates reactively:

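import { For } from 'solid-js';
import { createStore } from 'solid-js/store';

const [events, setEvents] = createStore<AgentEvent[]>([]);

websocket.onmessage = (msg) => {
  const event: AgentEvent = JSON.parse(msg.data);
  // Append via the store's path setter instead of copying the whole array
  setEvents(events.length, event);
};

// Only status events carry a message; command and output events
// are labeled by their own fields so they don't render as empty rows
const label = (event: AgentEvent) => {
  switch (event.type) {
    case 'command': return event.command;
    case 'output': return event.data;
    default: return event.message;
  }
};

return (
  <div class="space-y-2">
    <For each={events}>
      {(event) => (
        <div class={`event event-${event.type}`}>
          {label(event)}
        </div>
      )}
    </For>
  </div>
);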

This gives users a high-level view of agent activity without drowning them in raw logs. If they need details, they switch to the terminal preview and see full output.
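The event shapes shown above can be written down as a discriminated union, which also gives the frontend a place to reject malformed messages before they reach the UI. This is a sketch inferred from the sample events only; the real schema may have more fields or event types, and parseAgentEvent and summarize are hypothetical helpers.

```typescript
// Inferred from the sample events; status is optional because not every
// status event in this post carries it.
type AgentEvent =
  | { type: "status"; status?: string; message: string }
  | { type: "command"; command: string; started_at: string }
  | { type: "output"; stream: string; data: string };

// Returns a typed event, or null if the payload is not one of the known shapes.
function parseAgentEvent(raw: string): AgentEvent | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof value !== "object" || value === null) return null;

  const v = value as Record<string, unknown>;
  const str = (x: unknown): boolean => typeof x === "string";
  const ok =
    (v.type === "status" && str(v.message)) ||
    (v.type === "command" && str(v.command) && str(v.started_at)) ||
    (v.type === "output" && str(v.stream) && str(v.data));

  return ok ? (value as AgentEvent) : null;
}

// One-line text for the agent panel, chosen per event type.
function summarize(event: AgentEvent): string {
  switch (event.type) {
    case "status":
      return event.message;
    case "command":
      return `$ ${event.command}`;
    case "output":
      return event.data;
  }
}

const parsed = parseAgentEvent('{"type":"command","command":"npm install","started_at":"2026-01-08T14:32:10Z"}');
if (parsed) console.log(summarize(parsed));
```

Validating at the socket boundary keeps the rendering code free of defensive checks, since everything past parseAgentEvent is one of three known shapes.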

Plugin Architecture: Dynamic Registry

The preview system is extensible. New preview types (database admin UI, API docs viewer, etc.) can be added without modifying core code.

Preview Plugin Interface

A preview plugin implements:

export interface PreviewPlugin {
  id: string;
  name: string;
  icon: () => JSX.Element;
  match: (service: Service) => boolean;
  render: (props: PreviewProps) => JSX.Element;
}

Example web preview plugin:

export const webPreview: PreviewPlugin = {
  id: 'web',
  name: 'Web Preview',
  icon: () => <IconWorld />,
  match: (service) => service.type === 'web',
  render: (props) => <WebPreview url={props.url} />,
};

Registry

Plugins register in a global registry:

const previewRegistry = new Map<string, PreviewPlugin>();

export function registerPreviewPlugin(plugin: PreviewPlugin) {
  previewRegistry.set(plugin.id, plugin);
}

registerPreviewPlugin(webPreview);
registerPreviewPlugin(terminalPreview);
registerPreviewPlugin(agentPreview);

When Bloom renders a colony’s preview tabs:

Reads the colony.toml services
For each service, finds matching plugin via plugin.match(service)
Renders the plugin’s component
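The lookup in the steps above can be sketched as follows. To keep the example runnable without a UI framework, PluginLike is a trimmed version of PreviewPlugin with no icon or render, and the rule that the first registered match wins is an assumption; the post doesn't say how ties are resolved.

```typescript
interface Service {
  name: string;
  port: number;
  type?: string;
}

// Trimmed stand-in for PreviewPlugin (icon and render omitted).
interface PluginLike {
  id: string;
  name: string;
  match: (service: Service) => boolean;
}

interface ResolvedTab {
  service: Service;
  plugin: PluginLike;
}

function resolveTabs(
  services: Service[],
  registry: Map<string, PluginLike>,
): ResolvedTab[] {
  // A Map iterates in insertion order, so earlier registrations take priority.
  const plugins = Array.from(registry.values());
  const tabs: ResolvedTab[] = [];
  for (const service of services) {
    const plugin = plugins.find((p) => p.match(service));
    // Services with no matching plugin simply get no tab.
    if (plugin) tabs.push({ service, plugin });
  }
  return tabs;
}

const demoRegistry = new Map<string, PluginLike>();
demoRegistry.set("web", { id: "web", name: "Web Preview", match: (s) => s.type === "web" });

const resolved = resolveTabs(
  [
    { name: "web", port: 4001, type: "web" },
    { name: "worker", port: 4003, type: "queue" },
  ],
  demoRegistry,
);
console.log(resolved.map((t) => `${t.service.name} -> ${t.plugin.id}`));
```

Note that match here keys off a type field on the service, as in the web preview plugin example; the colony.toml sample earlier doesn't declare one, so in practice it would have to be set in the config or inferred.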

Adding a new preview type takes two steps and no changes to core Bloom code:

Implement PreviewPlugin
Call registerPreviewPlugin

We’re planning plugins for:

Database browser (for colonies running Postgres/MySQL)
API docs (auto-generated from OpenAPI specs)
Test results (visual test runner output)
Metrics dashboard (Prometheus/Grafana-style charts)

Responsive Layout: @corvu/resizable

The preview system uses @corvu/resizable for split-pane layouts. Users resize preview panels by dragging dividers:

import Resizable from '@corvu/resizable';

<Resizable orientation="horizontal">
  <Resizable.Panel initialSize={0.6}>
    <WebPreview />
  </Resizable.Panel>
  <Resizable.Handle />
  <Resizable.Panel initialSize={0.4}>
    <TerminalPreview />
  </Resizable.Panel>
</Resizable>

This lets users customize layout:

Full-width web preview with hidden terminal
Split view with logs on right
Terminal-only view (hide web preview)

Layout state saves to localStorage, so it persists across sessions.
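Persisting the layout mostly comes down to being careful when reading it back, since stored values can be stale or corrupted. Here is a minimal sketch, assuming panel sizes are stored as an array of fractions; the storage key, the defaults, and the function names are hypothetical, and StorageLike is a subset of the browser Storage interface so the code can be exercised without a browser.

```typescript
// Subset of the browser Storage interface; window.localStorage satisfies it.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const LAYOUT_KEY = "bloom.preview.layout"; // hypothetical key name
const DEFAULT_SIZES: number[] = [0.6, 0.4];

function saveLayout(storage: StorageLike, sizes: number[]): void {
  storage.setItem(LAYOUT_KEY, JSON.stringify(sizes));
}

// Falls back to the defaults if nothing is stored or the stored value is unusable.
function loadLayout(storage: StorageLike): number[] {
  const raw = storage.getItem(LAYOUT_KEY);
  if (raw === null) return DEFAULT_SIZES.slice();
  try {
    const parsed: unknown = JSON.parse(raw);
    const valid =
      Array.isArray(parsed) &&
      parsed.length === DEFAULT_SIZES.length &&
      parsed.every((n) => typeof n === "number" && n >= 0 && n <= 1);
    return valid ? (parsed as number[]) : DEFAULT_SIZES.slice();
  } catch {
    return DEFAULT_SIZES.slice(); // stored value was not valid JSON
  }
}
```

In the browser, loadLayout(window.localStorage) would supply the initial panel sizes, and saveLayout would be called whenever the user finishes dragging a divider.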

Real-Time Update Flow

Let’s trace a complete flow: an agent edits a React component and the web preview updates.

Agent edits file: Agent executor writes to src/App.tsx in the colony’s workspace
Hot reload triggers: Dev server (Vite, Next.js) detects file change and rebuilds
Iframe updates: iframe’s src doesn’t change, but dev server pushes new code via its own WebSocket
Agent event: Agent executor emits {"type": "status", "message": "File edited: src/App.tsx"}
Agent panel updates: Bloom’s WebSocket receives event and displays in agent panel

The user sees:

Web preview refreshing automatically
Event in agent panel saying “File edited: src/App.tsx”
No manual reload required

This near-instant feedback loop is critical for understanding what agents are doing. Without it, users would blindly trust agents and only see results after the fact.

Performance Considerations

Streaming real-time data from multiple colonies is resource-intensive. We optimize in four ways:

Binary WebSockets for terminals. PTY output isn’t guaranteed to be valid UTF-8, so sending it over text frames would require base64 encoding, which inflates payloads by about a third. Binary frames avoid that overhead.

Event throttling. If an agent emits hundreds of events per second (progress bar printing every 10ms), we throttle to max 10 events/second for display. Full stream is still logged.

Lazy loading. Preview panels aren’t rendered until you switch to their tab. Keeps initial page load fast even with 10+ tabs.

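Iframe sandboxing. An iframe sandboxed without allow-same-origin has an opaque origin, which lets browsers that isolate cross-origin frames run it in a separate process. Where that applies, heavy JavaScript in a previewed service is less likely to slow down the parent page.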
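The event throttling described above can be sketched as a small rate limiter that shows at most one event per 100 ms and remembers the latest suppressed one. This is one possible approach under those assumptions, not necessarily the strategy Colony uses; DisplayThrottle is a hypothetical name, and timestamps are passed in explicitly so the behavior is deterministic.

```typescript
class DisplayThrottle<T> {
  private readonly minIntervalMs: number;
  private lastShownMs = -Infinity;
  private pending: T | undefined = undefined;

  // 100 ms between displayed events gives a ceiling of 10 events per second.
  constructor(minIntervalMs = 100) {
    this.minIntervalMs = minIntervalMs;
  }

  // Returns the event if it should be displayed now, otherwise holds it
  // as the latest pending event and returns undefined.
  offer(event: T, nowMs: number): T | undefined {
    if (nowMs - this.lastShownMs >= this.minIntervalMs) {
      this.lastShownMs = nowMs;
      this.pending = undefined;
      return event;
    }
    this.pending = event;
    return undefined;
  }

  // Call when the stream goes quiet so the final suppressed event
  // (for example the last progress update) still reaches the display.
  flush(nowMs: number): T | undefined {
    const event = this.pending;
    this.pending = undefined;
    if (event !== undefined) this.lastShownMs = nowMs;
    return event;
  }
}

const throttle = new DisplayThrottle<string>();
const shown: string[] = [];
// Simulate a progress bar emitting an event every 10 ms for half a second.
for (let t = 0; t < 500; t += 10) {
  const event = throttle.offer(`progress at ${t}ms`, t);
  if (event !== undefined) shown.push(event);
}
console.log(shown.length);
```

Only the display path goes through the throttle; logging the full stream, as described above, would happen before offer is called.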

What’s Next

The preview system is functional but early. Upcoming features:

Multi-colony grid view: See 4 colonies’ web previews simultaneously (2x2 grid)
Replay mode: Scrub through a timeline of agent actions and see UI state at each step
Screenshot diffing: Auto-capture screenshots before/after agent changes and show visual diffs
Collaborative cursors: Multiple users watching the same colony see each other’s mouse cursors (like Figma)

We’re also exploring agent-driven previews: letting agents programmatically open preview tabs, take screenshots, and assert that UI elements appear correctly. This would enable agents to test their own work visually, not just with unit tests.

The Black Box Is Opened

The preview system turns agent development from a black box into a transparent process. Users see what agents see, in the same formats (web UIs, terminals, structured logs). This builds trust and makes debugging easier.

When an agent breaks something, you don’t just see the error in a log — you see the broken UI in the web preview, the failed command in the terminal, and the agent’s status (“Retrying build…”) in the agent panel. It’s a complete observability story.

Want to watch AI agents work in real time? Join the waitlist for Colony.