The meta use case. An AI agent uses Agent Vision to control another AI agent's terminal. One Claude Code instance reads the screen of another, types commands, approves prompts, and reviews output. This isn't theoretical — we literally built this website using this exact setup. An outer Claude controlled an inner Claude through Agent Vision, watching it code, giving feedback, and iterating on the design.
AI agents are isolated. Each one runs in its own terminal, its own context window, its own world. If you want one agent to check another agent's work, you copy-paste output between them manually. Orchestrating multiple AI agents means you are the bottleneck — reading one agent's output, deciding what to tell the next one, shuttling context back and forth.
One AI agent screenshots the other's terminal, reads its output, and decides what to do next. It can type commands into the inner agent's session, approve or reject its suggestions, and verify the results visually. The outer agent becomes a supervisor with full visual access to the inner agent's work. No copy-paste, no manual context shuttling.
Start a session targeting another terminal window
Lock onto the terminal window where the other Claude Code instance is running.
Capture the inner agent's current output
Screenshot what the other agent is showing. The outer agent can now read and analyze it.
Read the inner agent's terminal text
Discover text elements, buttons, and prompts in the inner agent's interface.
Type a command into the inner agent
Send instructions to the inner Claude Code instance by typing into its prompt.
Approve a prompt or suggestion
Click approve/accept buttons on the inner agent's permission prompts.
Re-capture to verify the result
Check what the inner agent produced. Loop until the result meets your criteria.
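The six steps above form a loop the outer agent can run until its criteria are met. Here is a minimal Python sketch of that loop; every function name below (start_session, capture, read_text, type_text, click) is a hypothetical stand-in with a stub body — Agent Vision's real tool names and interface may differ.

```python
# Sketch of the outer agent's supervision loop. All functions are
# illustrative stubs standing in for Agent Vision's actual screen-control
# tools, which are not documented in this page.

def start_session(window_title: str) -> str:
    """Stub: lock onto the inner agent's terminal window, return a session id."""
    return f"session:{window_title}"

def capture(session: str) -> bytes:
    """Stub: screenshot the inner agent's terminal as image bytes."""
    return b"fake-png-bytes"

def read_text(session: str) -> str:
    """Stub: read the visible terminal text from the inner agent's window."""
    return "Done. 0 tests failing."

def type_text(session: str, text: str) -> None:
    """Stub: type a command into the inner agent's prompt."""

def click(session: str, label: str) -> None:
    """Stub: click a button (e.g. a permission prompt) by its label."""

def supervise(window_title: str, instruction: str,
              done_marker: str, max_rounds: int = 5) -> bool:
    """Drive the inner agent until done_marker appears on its screen."""
    session = start_session(window_title)   # step 1: target the window
    type_text(session, instruction)         # step 4: send the task
    for _ in range(max_rounds):
        capture(session)                    # steps 2/6: visual check
        screen = read_text(session)         # step 3: read the output
        if "Allow?" in screen:              # step 5: approve a prompt
            click(session, "Approve")
            continue
        if done_marker in screen:           # success criterion met
            return True
        type_text(session, "continue")      # nudge the inner agent onward
    return False

ok = supervise("inner-claude", "fix the failing tests",
               done_marker="0 tests failing")
```

The key design point is that every decision — approve, nudge, or stop — is made from what is visible on the inner agent's screen, so no output ever has to be copied between the two agents by hand.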
Requires macOS 13+ · No dependencies · ~4MB