How it works
Screenshots are messy. They contain app chrome, usernames, nav bars, comments, timestamps, and partial text. AI screenshot analysis tries to separate the primary thing you saved from everything around it.
For productivity apps, the goal is not just OCR. The useful output is a resource record: a GitHub repo, article, post, event, invoice, place, contact, todo, product, or travel booking with the right metadata attached.
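One way to picture a resource record is as a typed object tagged with its resource kind. The TypeScript below is only an illustrative sketch: the field names (`kind`, `title`, `url`, `tags`, `metadata`) and the `describe` helper are assumptions made for this example, not SnapAction's actual schema.

```typescript
// Hypothetical resource kinds, taken from the list in the text above.
type ResourceKind =
  | "github_repo" | "article" | "post" | "event" | "invoice"
  | "place" | "contact" | "todo" | "product" | "travel_booking";

// Hypothetical record shape: one primary resource plus its metadata.
interface ResourceRecord {
  kind: ResourceKind;
  title: string;
  url?: string; // absent when no link was visible or recovered
  tags: string[];
  metadata: Record<string, string>;
}

// Builds a short one-line summary, e.g. for a searchable card list.
function describe(r: ResourceRecord): string {
  const link = r.url ? ` (${r.url})` : "";
  return `[${r.kind}] ${r.title}${link}`;
}

const example: ResourceRecord = {
  kind: "github_repo",
  title: "user/awesome-tool",
  url: "https://github.com/user/awesome-tool",
  tags: ["github", "tools"],
  metadata: { owner: "user", repo: "awesome-tool" },
};

console.log(describe(example));
```

Keeping `url` optional reflects the case where a screenshot shows a resource without a visible link.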
In SnapAction, scanned screenshots are sent to a Convex backend, where an AI agent analyzes them with models accessed through OpenRouter. When a URL is not visible in the screenshot, the backend can use Serper search and verification to recover the canonical link.
Example: from repo screenshot to card
Input: Screenshot of a GitHub page, tweet, or article mentioning user/awesome-tool
Analysis:
- Ignore app chrome and incidental text
- Identify the primary resource
- Classify it as a GitHub repository
- Search and verify the canonical GitHub URL if needed
- Return a structured resource with title, URL, tags, and metadata
Result: A searchable card with an Open action instead of a dead screenshot.
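The identify, classify, and verify steps in this example can be sketched in TypeScript. This is a simplified illustration under stated assumptions, not SnapAction's implementation: it starts from text already extracted from the screenshot, and `RepoResource`, `Verifier`, `REPO_PATTERN`, and `extractRepo` are hypothetical names. The verifier is passed in as a function so the sketch runs offline; a real backend might confirm the link with a search API instead.

```typescript
// Hypothetical structured result for a recognized repository.
interface RepoResource {
  kind: "github_repo";
  title: string;
  url: string;
  tags: string[];
  verified: boolean;
}

// Injected check that decides whether a candidate URL is trusted.
type Verifier = (url: string) => boolean;

// Matches either a full github.com link or a bare "owner/repo" slug.
const REPO_PATTERN =
  /(?:github\.com\/)?\b([A-Za-z0-9][A-Za-z0-9-]*)\/([A-Za-z0-9._-]+)/;

function extractRepo(text: string, verify: Verifier): RepoResource | null {
  const match = REPO_PATTERN.exec(text);
  if (!match) return null; // no repository-like reference found

  const owner = match[1] ?? "";
  // Drop sentence punctuation that the pattern may have swallowed.
  const repo = (match[2] ?? "").replace(/\.+$/, "");

  // Build the canonical URL even when the screenshot showed only a slug.
  const url = `https://github.com/${owner}/${repo}`;
  return {
    kind: "github_repo",
    title: `${owner}/${repo}`,
    url,
    tags: ["github"],
    verified: verify(url),
  };
}

const card = extractRepo(
  "Just found user/awesome-tool, really handy!",
  (url) => url.startsWith("https://github.com/"),
);
console.log(card);
```

A bare slug pattern like this also matches unrelated text such as "and/or", which is why the verification step matters before a card is saved.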
How SnapAction uses it
SnapAction scans screenshots selected from Photos and uses its backend to extract typed resources. The output is saved locally with SwiftData and can power actions like Open URL, Get Directions, Call, Email, Add to Calendar, or Copy Booking Reference.
- PhotoKit finds recent screenshots and their asset IDs
- The app sends selected screenshots to Convex for analysis
- OpenRouter routes requests to the models that handle visual reasoning and structured extraction
- Serper helps find canonical URLs for partial or missing links
- SwiftData stores resource cards on device for browsing and Rewind
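To show how a stored card could drive the actions named above, here is a small TypeScript sketch. The `Card` fields and the `availableActions` helper are hypothetical and chosen for illustration; the app itself may model this differently.

```typescript
// Action labels taken from the text above.
type Action =
  | "Open URL" | "Get Directions" | "Call" | "Email"
  | "Add to Calendar" | "Copy Booking Reference";

// Hypothetical card fields; each one that is present unlocks an action.
interface Card {
  url?: string;
  address?: string;
  phone?: string;
  email?: string;
  startsAt?: string; // ISO 8601 date-time
  bookingReference?: string;
}

// Returns the actions a card supports, in a fixed display order.
function availableActions(card: Card): Action[] {
  const actions: Action[] = [];
  if (card.url) actions.push("Open URL");
  if (card.address) actions.push("Get Directions");
  if (card.phone) actions.push("Call");
  if (card.email) actions.push("Email");
  if (card.startsAt) actions.push("Add to Calendar");
  if (card.bookingReference) actions.push("Copy Booking Reference");
  return actions;
}

const booking: Card = {
  url: "https://example.com/trip",
  bookingReference: "ABC123",
};
console.log(availableActions(booking));
```

Deriving actions from the fields that are present, rather than from the resource type alone, keeps a card usable even when extraction recovered only part of the metadata.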
Because scanned screenshots are sent to the backend for analysis, SnapAction is not fully on-device or offline-only.