Think of it as Zapier for your actual desktop, with an AI agent making the decisions. Agent Vision sees and controls every app on your Mac: Mail, Calendar, Slack, Numbers, Preview, anything. Chain them into workflows: read an email, create a calendar event, post a summary to Slack, log it in a spreadsheet. No app needs to support integrations because Agent Vision works at the screen level. Every window is just another interface to scan and interact with.
Desktop app automation on macOS usually means AppleScript, which is unreliable and limited: most apps have partial or no AppleScript support. Shortcuts.app helps with simple tasks but can't handle complex multi-app workflows with decision logic. You end up doing multi-step desktop workflows manually: read the email, copy the date, switch to Calendar, create an event, switch to Slack, type a message. That adds up to dozens of app switches per workflow.
Your AI agent manages sessions for each app window. It reads content from Mail.app, extracts dates and details, switches to Calendar to create an event, composes a Slack message with the summary, and logs everything in a Numbers spreadsheet. Each app is just a window to capture, scan for elements, and interact with. The AI agent handles the logic, context, and decision-making.
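The session-per-window idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ScreenDriver`, `RecordingDriver`, `Session`, and `SessionManager` are hypothetical names invented here, not Agent Vision's actual interface, and the stand-in driver records actions instead of touching the screen.

```python
from dataclasses import dataclass, field
from typing import Protocol


class ScreenDriver(Protocol):
    """Hypothetical screen-level backend (an assumption, not a real API)."""

    def capture(self, app: str) -> str: ...
    def scan(self, app: str) -> list[str]: ...
    def click(self, app: str, element: str) -> None: ...
    def type_text(self, app: str, text: str) -> None: ...


@dataclass
class RecordingDriver:
    """Stand-in driver: records each action instead of driving real windows."""

    elements: dict[str, list[str]]
    log: list[tuple[str, str, str]] = field(default_factory=list)

    def capture(self, app: str) -> str:
        self.log.append((app, "capture", ""))
        return f"{app}.png"  # pretend screenshot path

    def scan(self, app: str) -> list[str]:
        self.log.append((app, "scan", ""))
        return self.elements.get(app, [])

    def click(self, app: str, element: str) -> None:
        self.log.append((app, "click", element))

    def type_text(self, app: str, text: str) -> None:
        self.log.append((app, "type", text))


@dataclass
class Session:
    """One named session bound to one app window."""

    name: str
    app: str
    driver: ScreenDriver

    def click(self, element: str) -> None:
        # Capture, scan for elements, then interact: the loop described above.
        self.driver.capture(self.app)
        if element not in self.driver.scan(self.app):
            raise LookupError(f"{element!r} is not visible in {self.app}")
        self.driver.click(self.app, element)

    def type_text(self, text: str) -> None:
        self.driver.type_text(self.app, text)


class SessionManager:
    """Keeps one session per app so the agent can switch between them by name."""

    def __init__(self, driver: ScreenDriver) -> None:
        self._driver = driver
        self._sessions: dict[str, Session] = {}

    def start(self, name: str, app: str) -> Session:
        if name in self._sessions:
            raise ValueError(f"session {name!r} already exists")
        session = Session(name, app, self._driver)
        self._sessions[name] = session
        return session

    def get(self, name: str) -> Session:
        return self._sessions[name]
```

Keeping the driver behind a small protocol means the agent's logic (which session to use, what to click) can be exercised without a real screen, and a real backend can be swapped in later.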
Start sessions for each app
Create a named session for Mail.app. Repeat for Calendar, Slack, and Numbers.
Read email content
Screenshot the email. Your AI agent reads the content from the image.
Create a calendar event
Switch to Calendar and click "New Event" to start creating an entry.
Fill in event details
Type the event title extracted from the email.
Post to Slack
Switch to Slack and type a summary message in the channel input.
Log in spreadsheet
Switch to Numbers and log the event date in the tracking spreadsheet.
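The steps above can be sketched as a small planning function that turns the email text into an ordered list of per-app actions. This is a rough sketch, not how Agent Vision itself works: the `Step` type, the session names, and the `plan_workflow` and `extract_event` helpers are all hypothetical, and a simple regex stands in for the AI agent that would actually read the email from the screenshot. It assumes the email text contains a `Subject:` line and a date in `YYYY-MM-DD` form.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Step:
    """One action in one app session (hypothetical structure)."""

    session: str  # which app session to switch to
    action: str   # "click" or "type"
    value: str    # element label to click, or text to type


# Assumption: dates appear in ISO form, e.g. 2025-09-12.
DATE_PATTERN = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")


def extract_event(email_text: str) -> tuple[str, date]:
    """Pull the event title (Subject line) and the first ISO date from the text."""
    subject = ""
    for line in email_text.splitlines():
        if line.lower().startswith("subject:"):
            subject = line.split(":", 1)[1].strip()
            break
    match = DATE_PATTERN.search(email_text)
    if not subject or match is None:
        raise ValueError("email is missing a subject or an ISO date")
    return subject, datetime.strptime(match.group(1), "%Y-%m-%d").date()


def plan_workflow(email_text: str) -> list[Step]:
    """Map the extracted details onto the Calendar, Slack, and Numbers steps."""
    title, when = extract_event(email_text)
    summary = f"Scheduled '{title}' on {when.isoformat()}"
    return [
        Step("calendar", "click", "New Event"),
        Step("calendar", "type", title),
        Step("slack", "type", summary),
        Step("numbers", "type", when.isoformat()),
    ]
```

Separating the plan from its execution keeps the decision logic inspectable: the list of steps can be reviewed or logged before anything is clicked or typed.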
Requires macOS 13+ · No dependencies · ~4MB