Test iOS apps in the Simulator

Testing iOS apps usually means Appium, XCUITest, or 30 minutes of setup before you can run a single test. Agent Vision skips all of that. Select the Simulator window, start scanning, and interact. The drag command handles touch gestures like swipe and scroll. Screenshots verify visual state. Element discovery finds buttons and labels. It's the iOS testing tool that doesn't require you to be an iOS testing expert.

Without Agent Vision

Appium requires a Java environment, WebDriverAgent, and careful version matching between Xcode, Simulator, and the Appium server. XCUITest requires writing Swift test code and running it through Xcode. Both approaches have steep setup costs, flaky device connections, and slow test execution. Quick exploratory testing of a new build means firing up an entire test infrastructure.

With Agent Vision

Start a session pointing at the Simulator window. Agent Vision discovers every button, label, text field, and switch through macOS Accessibility. Tap by clicking, swipe by dragging, type by targeting input fields. Re-capture after every action to verify the result. Your AI agent can test an iOS app flow in minutes without installing any test framework.


How it works

Target the Simulator window

terminal
$ agent-vision start --region 0,0,430,932 --name simulator

Lock onto the iOS Simulator window. The region is given as x, y, width, height; 430×932 points matches a standard iPhone frame.
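The commands below pass a session ID via `$SID`. Assuming `start` prints the new session's ID on stdout (an assumption, not confirmed here), it can be captured in one step:

```shell
# Assumption: `agent-vision start` prints the session ID on stdout.
SID=$(agent-vision start --region 0,0,430,932 --name simulator)
echo "session: $SID"
```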

Discover on-screen elements

terminal
$ agent-vision elements --session $SID

Find all buttons, labels, text fields, switches, and navigation elements in the Simulator.
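On a busy screen the element list can be long. Assuming `elements` prints one element per line with its ID and role (the output format is an assumption), standard shell filtering narrows it down:

```shell
# Assumption: one element per line, including its role (e.g. "button").
agent-vision elements --session "$SID" | grep -i button
```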

Tap a button

terminal
$ agent-vision click --element el-btn-login --session $SID

Tap is just a click. Agent Vision translates the click into coordinates inside the Simulator window.

Swipe to scroll

terminal
$ agent-vision drag --from 215,700 --to 215,300 --session $SID

The drag command handles swipe gestures. Drag up to scroll down, drag left to go to the next page.
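A horizontal swipe is the same command with the motion along the x-axis. The coordinates below are illustrative for a 430-point-wide frame:

```shell
# Swipe right-to-left to advance to the next page.
agent-vision drag --from 380,466 --to 50,466 --session "$SID"
```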

Type into a text field

terminal
$ agent-vision type --element el-input-email --text "test@example.com" --session $SID

Enter text into input fields. Works with the Simulator's keyboard input.

Verify the visual state

terminal
$ agent-vision capture --session $SID --tag after-login

Screenshot the result and have your AI agent verify the expected screen appeared.

Real scenario

Example: Testing a login-to-dashboard flow

workflow
01
Agent targets the Simulator window running the iOS app
02
Agent discovers the email field, password field, and "Sign In" button on the login screen
03
Agent types test credentials into both fields
04
Agent taps "Sign In" and waits briefly for the transition
05
Agent re-captures and verifies the dashboard screen loaded (checks for welcome message, nav tabs)
06
Agent swipes up to scroll the content list and verifies it loads and scrolls correctly
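The workflow above can be sketched as one script using only the commands shown on this page. The element IDs, password value, and sleep duration are illustrative, and the script assumes `start` prints the session ID on stdout:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: `agent-vision start` prints the session ID on stdout.
SID=$(agent-vision start --region 0,0,430,932 --name simulator)

# Discover the login screen's elements (IDs below are illustrative).
agent-vision elements --session "$SID"

# Fill in test credentials and sign in.
agent-vision type --element el-input-email --text "test@example.com" --session "$SID"
agent-vision type --element el-input-password --text "testpass123" --session "$SID"
agent-vision click --element el-btn-login --session "$SID"
sleep 2  # illustrative wait for the login-to-dashboard transition

# Verify the dashboard loaded, then scroll the content list.
agent-vision capture --session "$SID" --tag after-login
agent-vision drag --from 215,700 --to 215,300 --session "$SID"
agent-vision capture --session "$SID" --tag after-scroll
```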

Try it now

$ brew tap rvanbaalen/agent-vision
$ brew install agent-vision

Requires macOS 13+ · No dependencies · ~4MB

← Back to agentvision.robinvanbaalen.nl