Agent Vision

Agent Vision lets an AI agent search your video library, retrieve video details, and, when the underlying model is vision-capable, see matched frames directly. The agent does not need to fetch URLs or manage storage.

How it works

When an agent calls search_videos, results include thumbnail_base64 — a base64-encoded JPEG of the matched frame. Vision-capable models can see this image in the same turn, enabling multi-step workflows:

  1. Agent searches for a moment in video
  2. Pureframe returns the matched frame as base64
  3. Agent reads the visual content and answers follow-up questions

No URL fetching, no extra round trips.
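If a client wants to inspect or save the matched frame itself rather than hand it straight to a model, the thumbnail_base64 value can be decoded locally. The following is a minimal sketch, assuming the field holds standard base64 of a JPEG as described above; decode_frame and sample_clip are hypothetical names used only for illustration, and the sample payload is a stand-in, not a real frame.

```python
import base64

JPEG_SOI = b"\xff\xd8"  # JPEG files begin with the start-of-image marker


def decode_frame(clip: dict) -> bytes:
    """Decode a clip's thumbnail_base64 field into raw JPEG bytes."""
    raw = base64.b64decode(clip["thumbnail_base64"], validate=True)
    if not raw.startswith(JPEG_SOI):
        raise ValueError("thumbnail_base64 did not decode to a JPEG")
    return raw


# Stand-in clip for illustration; a real one would come from search_videos.
sample_clip = {
    "thumbnail_base64": base64.b64encode(JPEG_SOI + b"\xff\xe0fake").decode("ascii")
}
frame_bytes = decode_frame(sample_clip)
print(len(frame_bytes), "bytes decoded")
```

The decoded bytes can be written to a .jpg file for debugging, while the original base64 string is what gets passed to a vision model.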

Available tools

| Tool | Description |
| --- | --- |
| search_videos | Search for moments matching a text query. Returns clips with timestamps and embedded frames. |
| list_collections | List all collections in the library. |
| get_collection | Get collection details: video count, total duration, storage. |
| get_video | Get video metadata and a presigned playback URL. |
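A small helper can guard against typos in tool names before a request is sent. This is a sketch only: it assumes the {"tool": ..., "input": ...} request body that this page uses for the /v1/agent/call endpoint, and build_call is a hypothetical helper. The input parameter names for tools other than search_videos (for example video_id below) are assumptions, since this page does not list them; the schema endpoint is the authoritative source.

```python
# The four tool names listed in the table above.
KNOWN_TOOLS = {"search_videos", "list_collections", "get_collection", "get_video"}


def build_call(tool: str, **tool_input) -> dict:
    """Build a request body for a tool call, rejecting unknown tool names."""
    if tool not in KNOWN_TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    return {"tool": tool, "input": tool_input}


search_body = build_call("search_videos", query="presenter pointing at a chart", limit=5)
list_body = build_call("list_collections")
# "video_id" is an assumed parameter name, used here for illustration only.
video_body = build_call("get_video", video_id="vid_example")
print(search_body)
```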

MCP integration

MCP is the simplest integration: add the Pureframe MCP server to your AI client config.

```json
{
  "mcpServers": {
    "pureframe": {
      "command": "npx",
      "args": ["-y", "pureframe-mcp"],
      "env": { "PUREFRAME_API_KEY": "pf_key_..." }
    }
  }
}
```

For remote MCP (no local install):

```json
{
  "mcpServers": {
    "pureframe": {
      "url": "https://mcp.pureframe.ai",
      "headers": { "Authorization": "Bearer pf_key_..." }
    }
  }
}
```

See the MCP guide for client-specific setup paths.

HTTP function calling

For OpenAI, Gemini, or any LLM with function calling — fetch the OpenAI-compatible schema and wire it directly:

```python
import httpx

BASE = "https://api.pureframe.ai"
HEADERS = {"Authorization": "Bearer pf_key_..."}

# Get the tool schema (OpenAI format)
schema = httpx.get(f"{BASE}/v1/agent/schema.json", headers=HEADERS).json()

# Call a tool
result = httpx.post(f"{BASE}/v1/agent/call", headers=HEADERS, json={
    "tool": "search_videos",
    "input": {
        "query": "presenter pointing at a chart",
        "collection_id": "col_abc123",
        "limit": 5
    }
}).json()
```

For OpenAI, pass schema as the tools argument, as in client.chat.completions.create(..., tools=schema). Other providers take it through their equivalent parameter.
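When the model responds with a tool call, the call still has to be forwarded to Pureframe. The sketch below shows one way to bridge the two, assuming the tool names in the schema match the tool field of /v1/agent/call and relying on the fact that OpenAI tool calls carry their arguments as a JSON string. to_agent_call and run_tool_call are hypothetical helpers, and fake_post stands in for a real HTTP request so the example runs offline.

```python
import json
from typing import Callable


def to_agent_call(name: str, arguments: str) -> dict:
    """Convert a tool call (name + JSON-string arguments) into a request body."""
    return {"tool": name, "input": json.loads(arguments or "{}")}


def run_tool_call(name: str, arguments: str, post: Callable[[dict], dict]) -> str:
    """Forward one tool call through `post` and return the result as a JSON string."""
    result = post(to_agent_call(name, arguments))
    return json.dumps(result)


# Stub transport for illustration; a real one would POST the body to /v1/agent/call.
def fake_post(body: dict) -> dict:
    return {"data": [{"video_id": "vid_1", "echo": body["input"]}]}


tool_output = run_tool_call("search_videos", '{"query": "chart", "limit": 5}', fake_post)
print(tool_output)
```

With the OpenAI SDK, name and arguments would come from tool_call.function.name and tool_call.function.arguments, and the returned string would go back to the model as the content of a message with role "tool" and the matching tool_call_id.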

Example: search + vision with Claude

```python
import anthropic
import httpx

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
pf = {"Authorization": "Bearer pf_key_..."}

# Search for a moment
clips = httpx.post(
    "https://api.pureframe.ai/v1/agent/call",
    headers=pf,
    json={"tool": "search_videos", "input": {"query": "whiteboard diagram", "limit": 1}},
).json()["data"]

# Pass the matched frame directly to Claude
frame_b64 = clips[0]["thumbnail_base64"]

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {"type": "base64", "media_type": "image/jpeg", "data": frame_b64},
            },
            {"type": "text", "text": "What's written on the whiteboard?"},
        ],
    }],
)
print(response.content[0].text)
```

Tool response fields

search_videos returns a list of clips. Each clip:

| Field | Description |
| --- | --- |
| video_id | ID of the source video |
| filename | Original filename |
| start_secs / end_secs | Clip boundaries in seconds |
| relevance_score | Relevance from 0 to 1 |
| text_snippet | Transcribed speech in this clip, if available |
| clip_url | Presigned URL to stream the video (valid ~1 hour) |
| thumbnail_url | Presigned URL to the matched frame |
| thumbnail_base64 | Base64 JPEG for direct use with vision models |
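The fields above are enough to turn several clips into a single vision prompt. The following sketch filters clips by relevance_score, labels each frame with its filename and time range, and emits content blocks in the base64 image format used by the Claude example on this page. clips_to_content, fmt_time, the 0.5 threshold, and the sample clips are all illustrative assumptions rather than part of the Pureframe API.

```python
def fmt_time(secs: float) -> str:
    """Format a number of seconds as M:SS, dropping the fractional part."""
    minutes, seconds = divmod(int(secs), 60)
    return f"{minutes}:{seconds:02d}"


def clips_to_content(clips, question, min_score=0.5, max_frames=3):
    """Build message content: a label and image per clip, then the question."""
    relevant = [c for c in clips if c["relevance_score"] >= min_score]
    ranked = sorted(relevant, key=lambda c: c["relevance_score"], reverse=True)
    content = []
    for clip in ranked[:max_frames]:
        label = f"{clip['filename']} {fmt_time(clip['start_secs'])}-{fmt_time(clip['end_secs'])}"
        content.append({"type": "text", "text": label})
        content.append({
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": "image/jpeg",
                "data": clip["thumbnail_base64"],
            },
        })
    content.append({"type": "text", "text": question})
    return content


# Made-up clips for illustration; the base64 strings are placeholders, not real frames.
sample_clips = [
    {"filename": "a.mp4", "start_secs": 5.2, "end_secs": 12.9,
     "relevance_score": 0.9, "thumbnail_base64": "AAAA"},
    {"filename": "b.mp4", "start_secs": 0.0, "end_secs": 4.0,
     "relevance_score": 0.3, "thumbnail_base64": "BBBB"},
    {"filename": "c.mp4", "start_secs": 75.0, "end_secs": 90.0,
     "relevance_score": 0.7, "thumbnail_base64": "CCCC"},
]
content = clips_to_content(sample_clips, "Which frame shows a chart?")
print([block["type"] for block in content])
```

The resulting list can be used as the content of a single user message. Labeling each frame with its time range lets the model refer back to a specific clip in its answer.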