CAO Adapter
Integrate CLI Agent Orchestrator (CAO) by AWS Labs as a first-class Binex adapter. CAO manages multi-agent systems through terminal sessions (tmux). Binex wraps CAO execution with full observability: trace, debug, run diff, replay, and cost/time tracking.
v1 scope: Handoff pattern only (synchronous execution). Assign and Send Message patterns are deferred.
Quick Start
Prerequisites
- Install and start the CAO server:
  pip install cli-agent-orchestrator
  cao-server start  # starts on http://localhost:9889
- Install at least one agent profile in your agent store:
  # profiles are .md files in ~/.aws/cli-agent-orchestrator/agent-store/
  ls ~/.aws/cli-agent-orchestrator/agent-store/
Minimal Workflow
name: cao-demo
nodes:
  review:
    agent: "cao://code_supervisor"
    outputs: [result]
Run it:
binex run cao-demo.yaml
YAML Configuration
Minimal (all defaults)
nodes:
  my_node:
    agent: "cao://profile_name"
    outputs: [result]
Full Configuration
nodes:
  my_node:
    agent: "cao://profile_name"
    cao:
      mode: handoff              # only "handoff" is supported
      provider: claude_code      # optional: CLI provider hint
      output_format: json        # auto (default) | json | text
      output_field: "$.result"   # JSONPath; only valid with output_format: json
      timeout_minutes: 60        # integer >= 1, default 60
    outputs: [result]
    depends_on: [previous_node]
Multi-Node Workflow with CAO
name: research-pipeline
nodes:
  gather:
    agent: "cao://research_agent"
    cao:
      output_format: json
      output_field: "$.findings"
      timeout_minutes: 30
    outputs: [findings]
  review:
    agent: "cao://code_supervisor"
    depends_on: [gather]
    outputs: [review_result]
Configuration Reference
CaoConfig Fields
CaoConfig is a nested block under cao: on a node (similar to LoopSpec on loop:):
| Field | Type | Default | Description |
|---|---|---|---|
| mode | handoff | handoff | CAO orchestration pattern. Only handoff is supported; use Binex DAG parallelism for fan-out/fan-in patterns instead of CAO assign. |
| provider | string or null | null | CLI provider hint passed to the CAO server. When omitted, defaults to claude_code at runtime. See Providers below. |
| output_format | auto / json / text | auto | How to parse agent stdout. auto tries JSON first, falls back to text. json requires valid JSON (fails otherwise). text returns raw stdout. |
| output_field | string or null | null | JSONPath expression (must start with $.) to extract a specific field from JSON output. Only valid when output_format is json. |
| timeout_minutes | integer | 60 | Maximum execution time in minutes before CAOTimeoutError. Must be >= 1. |
| max_human_prompts | integer | 3 | Maximum number of human-in-the-loop prompts per node execution. Prevents infinite prompt loops when agent repeatedly asks for input. |
Validation Rules
- output_field requires output_format: json (raises ValueError at load time)
- output_field must start with $. (raises ValueError at load time)
- timeout_minutes must be >= 1 (raises ValueError at load time)
- mode only accepts handoff (invalid modes are rejected by Pydantic)
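The validation rules above can be sketched with a standard-library dataclass. This is only an illustration under stated assumptions: the document says the real CaoConfig is validated by Pydantic, and the class name CaoConfigSketch and the error messages here are hypothetical. Field names and defaults follow the CaoConfig Fields table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaoConfigSketch:
    """Hypothetical stand-in for CaoConfig; not the actual Binex model."""

    mode: str = "handoff"
    provider: Optional[str] = None
    output_format: str = "auto"
    output_field: Optional[str] = None
    timeout_minutes: int = 60
    max_human_prompts: int = 3

    def __post_init__(self) -> None:
        # Only the handoff pattern is supported.
        if self.mode != "handoff":
            raise ValueError(f"unsupported mode: {self.mode!r}")
        if self.output_format not in ("auto", "json", "text"):
            raise ValueError(f"unsupported output_format: {self.output_format!r}")
        if self.output_field is not None:
            # output_field is only meaningful for strict JSON output.
            if self.output_format != "json":
                raise ValueError("output_field requires output_format: json")
            if not self.output_field.startswith("$."):
                raise ValueError("output_field must start with '$.'")
        if self.timeout_minutes < 1:
            raise ValueError("timeout_minutes must be >= 1")
```

With this sketch, CaoConfigSketch(output_format="json", output_field="$.result") is accepted, while CaoConfigSketch(output_field="$.result") raises ValueError because output_format is still auto.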
Providers
CAO supports 3 CLI providers:
| Provider | Value | Description |
|---|---|---|
| Claude Code | claude_code | Anthropic's CLI agent |
| Kiro CLI | kiro_cli | AWS Kiro CLI |
| Q CLI | q_cli | Amazon Q Developer CLI |
When provider is omitted, the CAO server uses whatever provider is configured in the agent profile.
Output Formats
auto (default)
The adapter attempts to parse stdout as JSON. If successful, the artifact type is json and the content is the parsed dict. If parsing fails, the content is stored as a raw text string.
json (strict)
Requires valid JSON output. If parsing fails, raises CAOOutputParseError. When combined with output_field, uses JSONPath to extract a specific value.
text (raw)
Returns the raw stdout string without any parsing attempt.
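The three output formats can be illustrated with a small parsing function. This is a minimal sketch, not the adapter's code: parse_output and extract_field are hypothetical names, CAOOutputParseError is redefined locally so the example is self-contained, and the path resolver handles only simple dotted paths such as $.a.b rather than full JSONPath. Raising CAOOutputParseError for a missing field is also an assumption.

```python
import json
from typing import Any, Optional


class CAOOutputParseError(Exception):
    """Local stand-in for the error type named in this document."""


def extract_field(data: Any, path: str) -> Any:
    """Resolve a simple dotted path like '$.a.b' (no wildcards or indexes)."""
    current = data
    for key in path[2:].split("."):
        if not isinstance(current, dict) or key not in current:
            raise CAOOutputParseError(f"field {path!r} not found in output")
        current = current[key]
    return current


def parse_output(
    stdout: str,
    output_format: str = "auto",
    output_field: Optional[str] = None,
) -> Any:
    """Apply the auto / json / text rules to raw agent stdout."""
    if output_format == "text":
        return stdout  # raw string, no parsing attempt
    try:
        parsed = json.loads(stdout)
    except json.JSONDecodeError as exc:
        if output_format == "json":
            raise CAOOutputParseError(str(exc)) from exc
        return stdout  # auto: fall back to the raw text
    if output_format == "json" and output_field:
        return extract_field(parsed, output_field)
    return parsed
```

For example, parse_output('{"result": {"ok": true}}', "json", "$.result") yields the nested dict, while parse_output("plain text") in auto mode returns the string unchanged.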
Artifacts
Each CAO execution produces two artifacts:
| Artifact | Type | Description |
|---|---|---|
| {node_id}_cao_raw | cao_raw_output | Complete raw terminal stdout (for debugging) |
| {node_id}_cao_output | cao_output or json | Parsed output passed to downstream nodes |
Only the {node_id}_cao_output artifact is forwarded to dependent nodes. Both artifacts are visible in the debug view.
Error Types
| Error | Cause | Debug View Shows |
|---|---|---|
| CAOServerUnavailableError | CAO server not reachable | Server URL + instructions to start it |
| CAOProfileNotFoundError | Agent profile .md not found in agent store | Profile name + install command |
| CAOTimeoutError | Execution exceeded timeout_minutes | Elapsed time, timeout threshold, raw output captured so far |
| CAOAgentError | CAO terminal reported error status | Raw terminal output |
| CAOOutputParseError | Failed to parse output as JSON (when output_format: json) | Raw output + parse error message |
Environment Variables
| Variable | Default | Description |
|---|---|---|
| BINEX_CAO_SERVER_URL | http://localhost:9889 | Base URL of the CAO REST server |
| BINEX_CAO_AGENT_STORE | ~/.aws/cli-agent-orchestrator/agent-store | Directory containing agent profile .md files |
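As an illustration of how these two variables and their defaults might be resolved, here is a minimal sketch. The function names resolve_cao_settings and profile_path are hypothetical, stripping a trailing slash from the URL is an added normalization, and mapping cao://name to name.md in the agent store is an assumption based on the statement that profiles are .md files in that directory.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple

DEFAULT_SERVER_URL = "http://localhost:9889"
DEFAULT_AGENT_STORE = "~/.aws/cli-agent-orchestrator/agent-store"


def resolve_cao_settings(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, Path]:
    """Return (server_url, agent_store), falling back to the documented defaults."""
    env = os.environ if env is None else env
    # Trailing-slash removal is an assumption made for cleaner URL joining.
    server_url = env.get("BINEX_CAO_SERVER_URL", DEFAULT_SERVER_URL).rstrip("/")
    agent_store = Path(
        env.get("BINEX_CAO_AGENT_STORE", DEFAULT_AGENT_STORE)
    ).expanduser()
    return server_url, agent_store


def profile_path(agent_uri: str, agent_store: Path) -> Path:
    """Map 'cao://code_supervisor' to '<agent_store>/code_supervisor.md' (assumed layout)."""
    prefix = "cao://"
    if not agent_uri.startswith(prefix):
        raise ValueError(f"not a CAO agent URI: {agent_uri!r}")
    return agent_store / f"{agent_uri[len(prefix):]}.md"
```

Passing an explicit mapping instead of reading os.environ directly keeps the lookup easy to exercise in tests.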
Session Naming
Each workflow run gets a single CAO session named binex-{run_id}. The CAO server prefixes this with cao-, so the actual tmux session name is cao-binex-{run_id}.
- One session per workflow run
- Multiple terminals within that session (one per CAO node)
- Session name is deterministic from run_id for crash recovery
Shared Sessions
When a workflow has multiple CAO nodes, they share a single session:
- The first CAO node to execute creates the session via POST /sessions
- Subsequent CAO nodes in the same run add terminals to the existing session
- Coordination is handled by CAOAdapter._run_sessions: ClassVar[dict], a class-level dict mapping run_id to session_name
nodes:
  gather:
    agent: "cao://research_agent"    # creates session binex-{run_id}
    outputs: [findings]
  review:
    agent: "cao://code_supervisor"   # reuses same session
    depends_on: [gather]
    outputs: [review_result]
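The create-once, reuse-afterwards behavior can be sketched with a class-level dict. This is an assumption-laden illustration, not the CAOAdapter source: SessionRegistrySketch, session_for, release, and tmux_session_name are hypothetical names, and the create_session callback stands in for the POST /sessions call.

```python
from typing import Callable, ClassVar, Dict, Optional


class SessionRegistrySketch:
    """Hypothetical stand-in for the run_id -> session_name bookkeeping."""

    # Shared by all instances, mirroring the ClassVar[dict] described above.
    _run_sessions: ClassVar[Dict[str, str]] = {}

    @classmethod
    def session_for(cls, run_id: str, create_session: Callable[[str], None]) -> str:
        """Return the session for a run, creating it only on first use."""
        name = cls._run_sessions.get(run_id)
        if name is None:
            name = f"binex-{run_id}"
            create_session(name)  # stands in for POST /sessions
            cls._run_sessions[run_id] = name
        return name

    @classmethod
    def release(cls, run_id: str) -> Optional[str]:
        """Forget a run's session once the workflow has ended."""
        return cls._run_sessions.pop(run_id, None)


def tmux_session_name(session_name: str) -> str:
    """The CAO server prefixes session names with 'cao-'."""
    return f"cao-{session_name}"
```

Calling session_for twice with the same run_id invokes create_session only once; the second call returns the cached name, which is how a later CAO node would attach its terminal to the existing session.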
Human-in-the-Loop
When a CAO agent reaches waiting_user_answer status, Binex prompts the user for input:
- CLI: falls back to click.prompt() for interactive input
- Web UI: emits a cao:waiting_input SSE event, which opens CaoInputModal in the browser
- Limit: max_human_prompts (default 3) per node execution; prevents infinite prompt loops
- API: POST /api/v1/cao/terminals/{id}/input forwards the user's response to the CAO terminal
After receiving input, the adapter resumes polling until the agent completes or asks again (up to the limit).
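The prompt-and-resume cycle can be sketched as a polling loop with a prompt budget. Everything here except the waiting_user_answer status and the max_human_prompts default is an assumption: the function name, the "completed" status string, the HumanPromptLimitExceeded error, and the choice to raise once the budget is spent are all hypothetical. A real loop would also sleep between polls and enforce timeout_minutes.

```python
from typing import Callable


class HumanPromptLimitExceeded(Exception):
    """Hypothetical error raised when the agent asks for input too many times."""


def run_until_done(
    poll_status: Callable[[], str],
    ask_user: Callable[[], str],
    send_input: Callable[[str], None],
    max_human_prompts: int = 3,
) -> int:
    """Poll until completion, forwarding user answers; return prompts used."""
    prompts_used = 0
    while True:
        status = poll_status()
        if status == "waiting_user_answer":
            if prompts_used >= max_human_prompts:
                raise HumanPromptLimitExceeded(
                    f"agent asked for input more than {max_human_prompts} times"
                )
            send_input(ask_user())  # e.g. click.prompt() in the CLI
            prompts_used += 1
        elif status == "completed":  # assumed terminal status name
            return prompts_used
        # Any other status: keep polling.
```

The three callbacks keep the loop independent of transport, so the same logic could sit behind either the CLI prompt or the web UI modal.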
Cleanup Lifecycle
Terminal and session cleanup follows a predictable pattern:
- Happy path: POST /terminals/{id}/exit gracefully exits the terminal
- Error path: POST /terminals/{id}/exit + DELETE /terminals/{id} (exit, then force-delete)
- Workflow end: close() exits any remaining terminals, then DELETE /sessions/cao-{name} removes the session entirely and cleans _run_sessions
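The ordering of the cleanup requests can be made concrete by listing them as (method, path) pairs. This sketch only models the sequence described above; the function names are hypothetical and no HTTP requests are made.

```python
from typing import Iterable, List, Tuple

Call = Tuple[str, str]  # (HTTP method, path)


def terminal_cleanup_calls(terminal_id: str, failed: bool) -> List[Call]:
    """Exit the terminal; on the error path, force-delete it afterwards."""
    calls: List[Call] = [("POST", f"/terminals/{terminal_id}/exit")]
    if failed:
        calls.append(("DELETE", f"/terminals/{terminal_id}"))
    return calls


def workflow_end_calls(session_name: str, open_terminal_ids: Iterable[str]) -> List[Call]:
    """Exit any remaining terminals, then delete the whole session."""
    calls: List[Call] = []
    for terminal_id in open_terminal_ids:
        calls.extend(terminal_cleanup_calls(terminal_id, failed=False))
    calls.append(("DELETE", f"/sessions/cao-{session_name}"))
    return calls
```

For a run whose session is binex-42 with one terminal t1 still open, workflow_end_calls("binex-42", ["t1"]) lists the terminal exit first and the session delete last.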
Session Registry
Binex tracks active CAO terminal sessions in a cao_sessions SQLite table. This enables:
- Crash recovery: If Binex exits unexpectedly, orphaned sessions are detected on the next startup and shown in a dashboard banner with a "Clean up" action.
- Graceful shutdown: On Ctrl+C, all active CAO terminals are terminated automatically.
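To show how a registry table can support crash recovery, here is a minimal sqlite3 sketch. Only the table name cao_sessions comes from this document; the columns (run_id, session_name, status), the status values, and all function names are hypothetical and chosen purely for illustration.

```python
import sqlite3
from typing import List


def open_registry(path: str = ":memory:") -> sqlite3.Connection:
    """Open the registry and create the table if needed (assumed schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cao_sessions ("
        "run_id TEXT PRIMARY KEY, "
        "session_name TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'active')"
    )
    return conn


def register(conn: sqlite3.Connection, run_id: str) -> None:
    """Record a session as active when a run starts using CAO."""
    conn.execute(
        "INSERT OR REPLACE INTO cao_sessions (run_id, session_name, status) "
        "VALUES (?, ?, 'active')",
        (run_id, f"binex-{run_id}"),
    )
    conn.commit()


def mark_closed(conn: sqlite3.Connection, run_id: str) -> None:
    """Flag a session as cleaned up after normal workflow end."""
    conn.execute("UPDATE cao_sessions SET status = 'closed' WHERE run_id = ?", (run_id,))
    conn.commit()


def orphaned_sessions(conn: sqlite3.Connection) -> List[str]:
    """Sessions still marked active at startup were left behind by a crash."""
    rows = conn.execute(
        "SELECT session_name FROM cao_sessions "
        "WHERE status = 'active' ORDER BY session_name"
    ).fetchall()
    return [row[0] for row in rows]
```

The idea is that a clean shutdown marks every session closed, so anything still active on the next startup is a candidate for the "Clean up" action.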
Cost Tracking
CAO providers are subscription-based — per-token cost is not available. Cost views show $0.000 (subscription-based) with elapsed time as the primary resource indicator.
Known Limitations
- Output capped at 200 lines: CAO uses tmux with TMUX_HISTORY_LINES=200. Very long agent outputs may be truncated in cao_raw_output.
- Handoff only: CAO's assign and send_message are not exposed; use Binex DAG parallelism instead (see Parallel Workers below).
- No authentication: CAO server access is unauthenticated in v1.
- No nested CAO: A CAO node cannot invoke another CAO node.
- Local server only: Binex does not start or manage the cao-server process.
Parallel Workers (Recommended Pattern)
Instead of CAO's assign pattern, use Binex DAG parallelism for supervisor-worker workflows:
name: cao-parallel-workers
nodes:
  supervisor:
    agent: cao://code_supervisor
    system_prompt: "Break this into subtasks and output JSON with task_a and task_b fields"
    cao:
      output_format: json
    outputs: [tasks]
  worker_a:
    agent: cao://developer
    depends_on: [supervisor]
    inputs:
      task: ${supervisor.tasks}
    cao:
      provider: claude_code
      timeout_minutes: 30
    outputs: [result_a]
  worker_b:
    agent: cao://reviewer
    depends_on: [supervisor]
    inputs:
      task: ${supervisor.tasks}
    cao:
      provider: q_cli
      timeout_minutes: 30
    outputs: [result_b]
  collector:
    agent: llm://gpt-4o
    depends_on: [worker_a, worker_b]
    inputs:
      a: ${worker_a.result_a}
      b: ${worker_b.result_b}
    system_prompt: "Combine the results into a final summary"
    outputs: [summary]
worker_a and worker_b execute concurrently (both depend only on supervisor). The collector waits for both to complete. This gives you the same fan-out/fan-in behavior as CAO's assign — with full observability, error handling, cost tracking, and retry support.
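The fan-out/fan-in shape of this workflow follows directly from its depends_on edges, which can be checked with Python's standard graphlib module. This sketch illustrates the dependency structure only; it is not a description of how the Binex scheduler is implemented, and execution_waves is a hypothetical helper name.

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def execution_waves(depends_on: Dict[str, List[str]]) -> List[List[str]]:
    """Group nodes into waves; nodes in the same wave have no unmet dependencies."""
    sorter = TopologicalSorter(depends_on)  # maps node -> its predecessors
    sorter.prepare()
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished
    return waves


# depends_on edges copied from the cao-parallel-workers workflow above.
deps = {
    "supervisor": [],
    "worker_a": ["supervisor"],
    "worker_b": ["supervisor"],
    "collector": ["worker_a", "worker_b"],
}
waves = execution_waves(deps)
```

Here waves contains three groups: the supervisor alone, then worker_a and worker_b together, then the collector, matching the concurrency described above.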