Agent Vision
Agent Vision
Agent Vision lets any AI model search your video library, retrieve video details, and see matched frames directly — without the agent needing to fetch URLs or manage storage.
How it works
When an agent calls search_videos, results include thumbnail_base64 — a base64-encoded JPEG of the matched frame. Vision-capable models can see this image in the same turn, enabling multi-step workflows:
- Agent searches for a moment in video
- Pureframe returns the matched frame as base64
- Agent reads the visual content and answers follow-up questions
No URL fetching, no extra round trips.
Available tools
MCP integration
The simplest integration. Add the Pureframe MCP server to your AI client config:
For remote MCP (no local install):
See the MCP guide for client-specific setup paths.
HTTP function calling
For OpenAI, Gemini, or any LLM with function calling — fetch the OpenAI-compatible schema and wire it directly:
Pass schema directly to client.chat.completions.create(tools=schema) for OpenAI or the equivalent for other providers.
Example: search + vision with Claude
Tool response fields
search_videos returns a list of clips. Each clip: