CAO Adapter

The CAO adapter integrates CLI Agent Orchestrator (CAO), from AWS Labs, into Binex as a first-class adapter. CAO manages multi-agent systems through terminal sessions (tmux). Binex wraps CAO execution with full observability: trace, debug, run diff, replay, and cost/time tracking.

v1 scope: Handoff pattern only (synchronous execution). Assign and Send Message patterns are deferred.

Quick Start

Prerequisites

Install and start the CAO server:

pip install cli-agent-orchestrator
cao-server start  # starts on http://localhost:9889

Make sure at least one agent profile is installed in your agent store:

# profiles are .md files in ~/.aws/cli-agent-orchestrator/agent-store/
ls ~/.aws/cli-agent-orchestrator/agent-store/

Minimal Workflow

name: cao-demo
nodes:
  review:
    agent: "cao://code_supervisor"
    outputs: [result]

Run it:

binex run cao-demo.yaml

YAML Configuration

Minimal (all defaults)

nodes:
  my_node:
    agent: "cao://profile_name"
    outputs: [result]

Full Configuration

nodes:
  my_node:
    agent: "cao://profile_name"
    cao:
      mode: handoff               # only "handoff" is supported
      provider: claude_code       # optional — CLI provider hint
      output_format: json         # auto (default) | json | text
      output_field: "$.result"    # JSONPath — only valid with output_format: json
      timeout_minutes: 60         # integer >= 1, default 60
    outputs: [result]
    depends_on: [previous_node]

Multi-Node Workflow with CAO

name: research-pipeline
nodes:
  gather:
    agent: "cao://research_agent"
    cao:
      output_format: json
      output_field: "$.findings"
      timeout_minutes: 30
    outputs: [findings]

  review:
    agent: "cao://code_supervisor"
    depends_on: [gather]
    outputs: [review_result]

Configuration Reference

`CaoConfig` Fields

CaoConfig is a nested block set under a node's cao: key, much as LoopSpec is set under loop:. It has the following fields:

Field	Type	Default	Description
`mode`	`handoff`	`handoff`	CAO orchestration pattern. Only `handoff` is supported — use Binex DAG parallelism for fan-out/fan-in patterns instead of CAO assign.
`provider`	string or null	`null`	CLI provider hint passed to the CAO server. When omitted, defaults to `claude_code` at runtime. See Providers below.
`output_format`	`auto` / `json` / `text`	`auto`	How to parse agent stdout. `auto` tries JSON first, falls back to text. `json` requires valid JSON (fails otherwise). `text` returns raw stdout.
`output_field`	string or null	`null`	JSONPath expression (must start with `$.`) to extract a specific field from JSON output. Only valid when `output_format` is `json`.
`timeout_minutes`	integer	`60`	Maximum execution time in minutes before `CAOTimeoutError`. Must be >= 1.
`max_human_prompts`	integer	`3`	Maximum number of human-in-the-loop prompts per node execution. Prevents infinite prompt loops when an agent repeatedly asks for input.

Validation Rules

output_field requires output_format: json (raises ValueError at load time otherwise)
output_field must start with $. (raises ValueError at load time otherwise)
timeout_minutes must be >= 1 (raises ValueError at load time otherwise)
mode accepts only handoff (any other value is rejected by Pydantic validation)
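
The rules above can be illustrated with a plain dataclass. This is a stdlib-only sketch, not the adapter's actual Pydantic model: the class name CaoConfigSketch, the helper is_valid, and the error messages are assumptions made for illustration, and a ValueError stands in for Pydantic's rejection of an invalid mode.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaoConfigSketch:
    """Hypothetical stand-in for CaoConfig; mirrors only the documented rules."""

    mode: str = "handoff"
    provider: Optional[str] = None
    output_format: str = "auto"
    output_field: Optional[str] = None
    timeout_minutes: int = 60
    max_human_prompts: int = 3

    def __post_init__(self) -> None:
        # Only the handoff pattern is supported in v1.
        if self.mode != "handoff":
            raise ValueError(f"unsupported mode: {self.mode!r}")
        if self.output_format not in ("auto", "json", "text"):
            raise ValueError(f"unknown output_format: {self.output_format!r}")
        if self.output_field is not None:
            # output_field is only meaningful for strict JSON output.
            if self.output_format != "json":
                raise ValueError("output_field requires output_format: json")
            if not self.output_field.startswith("$."):
                raise ValueError("output_field must start with '$.'")
        if self.timeout_minutes < 1:
            raise ValueError("timeout_minutes must be >= 1")


def is_valid(**fields) -> bool:
    """Return True when the given fields pass every rule above."""
    try:
        CaoConfigSketch(**fields)
        return True
    except ValueError:
        return False
```

For example, is_valid(output_field="$.result") is False because output_format is still auto, while adding output_format="json" makes the same call pass.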

Providers

CAO supports 3 CLI providers:

Provider	Value	Description
Claude Code	`claude_code`	Anthropic's CLI agent
Kiro CLI	`kiro_cli`	AWS Kiro CLI
Q CLI	`q_cli`	Amazon Q Developer CLI

When provider is omitted, the CAO server uses whatever provider is configured in the agent profile.

Output Formats

`auto` (default)

The adapter attempts to parse stdout as JSON. If successful, the artifact type is json and the content is the parsed dict. If parsing fails, the content is stored as a raw text string.

`json` (strict)

Requires valid JSON output. If parsing fails, raises CAOOutputParseError. When combined with output_field, uses JSONPath to extract a specific value.

`text` (raw)

Returns the raw stdout string without any parsing attempt.
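
The three formats can be sketched as a single parsing function. This is an illustration under assumptions, not the adapter's code: parse_output and extract_field are hypothetical names, the exception class is a local stand-in for the adapter's CAOOutputParseError, and the path resolver handles only simple dotted paths such as $.a.b, which may be narrower than the JSONPath support the adapter provides. Raising a parse error for a missing field is also an assumption.

```python
import json
from typing import Any, Optional


class CAOOutputParseError(Exception):
    """Local stand-in for the adapter's error of the same name."""


def extract_field(data: Any, path: str) -> Any:
    """Resolve a simple dotted path like '$.a.b' (no wildcards or indexes)."""
    current = data
    for key in path[2:].split("."):
        if not isinstance(current, dict) or key not in current:
            raise CAOOutputParseError(f"field {path!r} not found in output")
        current = current[key]
    return current


def parse_output(
    stdout: str,
    output_format: str = "auto",
    output_field: Optional[str] = None,
) -> Any:
    # text: hand back raw stdout without attempting to parse it.
    if output_format == "text":
        return stdout
    try:
        parsed = json.loads(stdout)
    except json.JSONDecodeError as exc:
        # auto: fall back to the raw string; json: fail loudly.
        if output_format == "auto":
            return stdout
        raise CAOOutputParseError(str(exc)) from exc
    # output_field is only honoured in strict json mode.
    if output_format == "json" and output_field:
        return extract_field(parsed, output_field)
    return parsed
```

With this sketch, parse_output('{"findings": {"n": 2}}', "json", "$.findings.n") extracts the nested value, while the same call on non-JSON input raises the parse error instead of silently returning text.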

Artifacts

Each CAO execution produces two artifacts:

Artifact	Type	Description
`{node_id}_cao_raw`	`cao_raw_output`	Complete raw terminal stdout (for debugging)
`{node_id}_cao_output`	`cao_output` or `json`	Parsed output passed to downstream nodes

Only the {node_id}_cao_output artifact is forwarded to dependent nodes. Both artifacts are visible in the debug view.
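
As a rough illustration of the artifact pair, the sketch below builds both records for one node and filters out the raw one before forwarding. The dict shape (id, type, content), the function names, and the parsed_as_json flag are all assumptions; only the artifact names and types come from the table above.

```python
from typing import Any, Dict, List

Artifact = Dict[str, Any]


def build_artifacts(
    node_id: str, raw_stdout: str, parsed: Any, parsed_as_json: bool
) -> List[Artifact]:
    """Build the raw and parsed artifacts for one CAO node (hypothetical shape)."""
    raw = {
        "id": f"{node_id}_cao_raw",
        "type": "cao_raw_output",
        "content": raw_stdout,
    }
    output = {
        "id": f"{node_id}_cao_output",
        # Assumed rule: type is json when parsing succeeded, else cao_output.
        "type": "json" if parsed_as_json else "cao_output",
        "content": parsed,
    }
    return [raw, output]


def forwarded(artifacts: List[Artifact]) -> List[Artifact]:
    """Keep only what downstream nodes receive: everything except raw output."""
    return [a for a in artifacts if a["type"] != "cao_raw_output"]
```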

Error Types

Error	Cause	Debug View Shows
`CAOServerUnavailableError`	CAO server not reachable	Server URL + instructions to start it
`CAOProfileNotFoundError`	Agent profile `.md` not found in agent store	Profile name + install command
`CAOTimeoutError`	Execution exceeded `timeout_minutes`	Elapsed time, timeout threshold, raw output captured so far
`CAOAgentError`	CAO terminal reported `error` status	Raw terminal output
`CAOOutputParseError`	Failed to parse output as JSON (when `output_format: json`)	Raw output + parse error message

Environment Variables

Variable	Default	Description
`BINEX_CAO_SERVER_URL`	`http://localhost:9889`	Base URL of the CAO REST server
`BINEX_CAO_AGENT_STORE`	`~/.aws/cli-agent-orchestrator/agent-store`	Directory containing agent profile `.md` files
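
A minimal sketch of how these two variables might be resolved follows. The function names are hypothetical, stripping a trailing slash from the URL is an assumed normalisation, and mapping a cao://name URI to name.md in the agent store is inferred from the profile layout described above rather than confirmed by the adapter.

```python
import os
from pathlib import Path
from typing import Mapping, Tuple

DEFAULT_SERVER_URL = "http://localhost:9889"
DEFAULT_AGENT_STORE = "~/.aws/cli-agent-orchestrator/agent-store"


def resolve_cao_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, Path]:
    """Return (server_url, agent_store), applying the documented defaults."""
    server_url = env.get("BINEX_CAO_SERVER_URL", DEFAULT_SERVER_URL).rstrip("/")
    agent_store = Path(env.get("BINEX_CAO_AGENT_STORE", DEFAULT_AGENT_STORE))
    # Expand a leading ~ so the path can be used directly.
    return server_url, agent_store.expanduser()


def profile_path(agent_uri: str, agent_store: Path) -> Path:
    """Map 'cao://code_supervisor' to '<agent_store>/code_supervisor.md' (assumed)."""
    prefix = "cao://"
    if not agent_uri.startswith(prefix):
        raise ValueError(f"not a CAO agent URI: {agent_uri!r}")
    return agent_store / f"{agent_uri[len(prefix):]}.md"
```

Passing the environment as a mapping keeps the function easy to exercise without touching the real process environment.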

Session Naming

Each workflow run gets a single CAO session named binex-{run_id}. The CAO server prefixes this with cao-, so the actual tmux session name is cao-binex-{run_id}.

One session per workflow run
Multiple terminals within that session (one per CAO node)
The session name is derived deterministically from run_id, so the session can be identified again during crash recovery

Shared Sessions

When a workflow has multiple CAO nodes, they share a single session:

The first CAO node to execute creates the session via POST /sessions
Subsequent CAO nodes in the same run add terminals to the existing session
Coordination is handled by CAOAdapter._run_sessions: ClassVar[dict] — a class-level dict mapping run_id to session_name

nodes:
  gather:
    agent: "cao://research_agent"    # creates session binex-{run_id}
    outputs: [findings]

  review:
    agent: "cao://code_supervisor"   # reuses same session
    depends_on: [gather]
    outputs: [review_result]
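
The create-once, reuse-afterwards behaviour can be sketched with a class-level dict. SessionRegistrySketch is a hypothetical stand-in for CAOAdapter; only the _run_sessions attribute and the naming scheme come from the text above, and the create callback stands in for the POST /sessions request. If CAO nodes can reach this code concurrently, the real adapter would presumably need a lock around the check, which this sketch omits.

```python
from typing import Callable, ClassVar, Dict


class SessionRegistrySketch:
    """Hypothetical stand-in for CAOAdapter's session coordination."""

    # Shared across all instances: run_id -> session_name.
    _run_sessions: ClassVar[Dict[str, str]] = {}

    @classmethod
    def session_for(cls, run_id: str, create: Callable[[str], None]) -> str:
        """Return the run's session, calling create() only for the first node."""
        if run_id not in cls._run_sessions:
            name = f"binex-{run_id}"
            create(name)  # stands in for POST /sessions
            cls._run_sessions[run_id] = name
        return cls._run_sessions[run_id]


def tmux_session_name(session_name: str) -> str:
    """The CAO server prefixes session names with 'cao-'."""
    return f"cao-{session_name}"
```

Two nodes in the same run calling session_for with the same run_id get the same name back, and create runs only once.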

Human-in-the-Loop

When a CAO agent reaches waiting_user_answer status, Binex prompts the user for input:

CLI: falls back to click.prompt() for interactive input
Web UI: emits a cao:waiting_input SSE event, which opens CaoInputModal in the browser
Limit: max_human_prompts (default 3) per node execution — prevents infinite prompt loops
API: POST /api/v1/cao/terminals/{id}/input forwards the user's response to the CAO terminal

After receiving input, the adapter resumes polling until the agent completes or asks again (up to the limit).
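
The poll, prompt, and resume cycle can be sketched as a loop over terminal statuses. Several details here are assumptions: the function and exception names are hypothetical, the "completed" status string is a guess (only waiting_user_answer and error appear in this document), and raising an error once the limit is reached is one plausible behaviour that the text above does not specify.

```python
from typing import Callable, Iterable


class CAOAgentError(Exception):
    """Local stand-in for the adapter's error of the same name."""


class HumanPromptLimitError(Exception):
    """Hypothetical: raised when the agent asks for input too many times."""


def run_until_done(
    statuses: Iterable[str],
    ask_user: Callable[[], str],
    send_input: Callable[[str], None],
    max_human_prompts: int = 3,
) -> int:
    """Consume polled statuses until completion; return the number of prompts used."""
    prompts = 0
    for status in statuses:
        if status == "completed":  # assumed status name
            return prompts
        if status == "error":
            raise CAOAgentError("CAO terminal reported error status")
        if status == "waiting_user_answer":
            if prompts >= max_human_prompts:
                raise HumanPromptLimitError(
                    f"agent asked for input more than {max_human_prompts} times"
                )
            # Forward the user's answer to the terminal, then resume polling.
            send_input(ask_user())
            prompts += 1
    raise RuntimeError("status stream ended before the agent completed")
```

In the CLI, ask_user would correspond to click.prompt(); in the Web UI, to the response submitted through CaoInputModal.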

Cleanup Lifecycle

Terminal and session cleanup follows a predictable pattern:

Happy path: POST /terminals/{id}/exit — gracefully exits the terminal
Error path: POST /terminals/{id}/exit + DELETE /terminals/{id} — exit then force-delete
Workflow end: close() exits any remaining terminals, then DELETE /sessions/cao-{name} removes the session entirely and clears the run's entry from _run_sessions
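
The three cleanup paths can be written down as the sequence of REST calls each one issues. The sketch only builds (method, path) pairs and sends nothing; the function names are hypothetical, and the endpoints are the ones listed above.

```python
from typing import List, Tuple

Request = Tuple[str, str]  # (HTTP method, path)


def terminal_cleanup(terminal_id: str, failed: bool) -> List[Request]:
    """Happy path: graceful exit. Error path: exit, then force-delete."""
    requests = [("POST", f"/terminals/{terminal_id}/exit")]
    if failed:
        requests.append(("DELETE", f"/terminals/{terminal_id}"))
    return requests


def workflow_cleanup(session_name: str, remaining_terminals: List[str]) -> List[Request]:
    """Workflow end: exit leftover terminals, then delete the whole session."""
    requests: List[Request] = []
    for terminal_id in remaining_terminals:
        requests.extend(terminal_cleanup(terminal_id, failed=False))
    # The server-side session name carries the cao- prefix.
    requests.append(("DELETE", f"/sessions/cao-{session_name}"))
    return requests
```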

Session Registry

Binex tracks active CAO terminal sessions in a cao_sessions SQLite table. This enables:

Crash recovery: If Binex exits unexpectedly, orphaned sessions are detected on the next startup and shown in a dashboard banner with a "Clean up" action.
Graceful shutdown: On Ctrl+C, all active CAO terminals are terminated automatically.
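
A registry of this kind could look roughly like the following. Only the table name cao_sessions comes from this document; the columns, the status values, and the function names are assumptions chosen to show how orphans might be detected at startup.

```python
import sqlite3
from typing import List


def open_registry(path: str = ":memory:") -> sqlite3.Connection:
    """Open the registry and create the table if needed (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cao_sessions ("
        " terminal_id TEXT PRIMARY KEY,"
        " run_id TEXT NOT NULL,"
        " session_name TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'active')"
    )
    return conn


def register(conn: sqlite3.Connection, run_id: str, terminal_id: str) -> None:
    """Record a terminal as active when it is created."""
    conn.execute(
        "INSERT INTO cao_sessions (terminal_id, run_id, session_name) VALUES (?, ?, ?)",
        (terminal_id, run_id, f"binex-{run_id}"),
    )
    conn.commit()


def mark_closed(conn: sqlite3.Connection, terminal_id: str) -> None:
    """Record a terminal as closed after normal cleanup."""
    conn.execute(
        "UPDATE cao_sessions SET status = 'closed' WHERE terminal_id = ?",
        (terminal_id,),
    )
    conn.commit()


def find_orphans(conn: sqlite3.Connection) -> List[str]:
    """Sessions that still have active terminals, e.g. after a crash."""
    rows = conn.execute(
        "SELECT DISTINCT session_name FROM cao_sessions"
        " WHERE status = 'active' ORDER BY session_name"
    )
    return [row[0] for row in rows]
```

On startup, anything returned by find_orphans would be a candidate for the "Clean up" banner.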

Cost Tracking

CAO providers are subscription-based — per-token cost is not available. Cost views show $0.000 (subscription-based) with elapsed time as the primary resource indicator.

Known Limitations

Output capped at 200 lines: CAO uses tmux with TMUX_HISTORY_LINES=200. Very long agent outputs may be truncated in cao_raw_output.
Handoff only: CAO's assign and send_message are not exposed — use Binex DAG parallelism instead (see Parallel Workers below).
No authentication: CAO server access is unauthenticated in v1.
No nested CAO: A CAO node cannot invoke another CAO node.
Local server only: Binex does not start or manage the cao-server process.

Parallel Workers (Recommended Pattern)

Instead of CAO's assign pattern, use Binex DAG parallelism for supervisor-worker workflows:

name: cao-parallel-workers
nodes:
  supervisor:
    agent: cao://code_supervisor
    system_prompt: "Break this into subtasks and output JSON with task_a and task_b fields"
    cao:
      output_format: json
    outputs: [tasks]

  worker_a:
    agent: cao://developer
    depends_on: [supervisor]
    inputs:
      task: ${supervisor.tasks}
    cao:
      provider: claude_code
      timeout_minutes: 30
    outputs: [result_a]

  worker_b:
    agent: cao://reviewer
    depends_on: [supervisor]
    inputs:
      task: ${supervisor.tasks}
    cao:
      provider: q_cli
      timeout_minutes: 30
    outputs: [result_b]

  collector:
    agent: llm://gpt-4o
    depends_on: [worker_a, worker_b]
    inputs:
      a: ${worker_a.result_a}
      b: ${worker_b.result_b}
    system_prompt: "Combine the results into a final summary"
    outputs: [summary]

worker_a and worker_b execute concurrently (both depend only on supervisor). The collector waits for both to complete. This gives you the same fan-out/fan-in behavior as CAO's assign — with full observability, error handling, cost tracking, and retry support.
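
To see why the two workers can run at the same time, the depends_on edges of the workflow above can be grouped into levels with the standard library's graphlib (Python 3.9+). This is an illustration of the dependency structure only, not Binex's scheduler; execution_levels is a hypothetical helper.

```python
from graphlib import TopologicalSorter
from typing import Dict, List

# depends_on edges copied from the cao-parallel-workers workflow.
WORKFLOW: Dict[str, List[str]] = {
    "supervisor": [],
    "worker_a": ["supervisor"],
    "worker_b": ["supervisor"],
    "collector": ["worker_a", "worker_b"],
}


def execution_levels(depends_on: Dict[str, List[str]]) -> List[List[str]]:
    """Group nodes into levels; nodes within a level have no mutual dependencies."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    levels: List[List[str]] = []
    while sorter.is_active():
        # Everything ready at once could be started concurrently.
        ready = sorted(sorter.get_ready())
        levels.append(ready)
        sorter.done(*ready)
    return levels


print(execution_levels(WORKFLOW))
# [['supervisor'], ['worker_a', 'worker_b'], ['collector']]
```

The middle level contains both workers, which is the fan-out; the collector alone in the last level is the fan-in.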