AgenticVision

API Reference

AgenticVision exposes its capabilities through MCP tools, resources, and prompts. Run agentic-vision-mcp info to verify tool discovery.

MCP Tools

vision_capture

Capture an image and store it in visual memory.

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| source | object | yes | | Image source: type is file, base64, screenshot, or clipboard. Include path for file, data+mime for base64, optional region for screenshot. |
| description | string | no | | Human-readable label for the capture. |
| labels | string[] | no | [] | Tags for filtering and organization. |
| extract_ocr | boolean | no | false | Run OCR on the captured image. |

Returns: { capture_id, timestamp, dimensions, quality_score }
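MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request, so a capture boils down to a `name` plus an `arguments` object. The sketch below assembles such a request for vision_capture. The helper name `build_capture_request` and the file path are hypothetical, and the exact keys of the source object (`type`, `path`, `data`, `mime`) are inferred from the parameter table rather than confirmed by it.

```python
import json

# Source types listed in the parameter table above.
SOURCE_TYPES = {"file", "base64", "screenshot", "clipboard"}


def build_capture_request(source, description=None, labels=None,
                          extract_ocr=False, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for vision_capture.

    Hypothetical helper: it only assembles and sanity-checks the payload.
    Sending it is left to whatever MCP client you use.
    """
    kind = source.get("type")
    if kind not in SOURCE_TYPES:
        raise ValueError(f"unknown source type: {kind!r}")
    if kind == "file" and "path" not in source:
        raise ValueError("file sources need a 'path'")
    if kind == "base64" and not {"data", "mime"} <= set(source):
        raise ValueError("base64 sources need 'data' and 'mime'")

    arguments = {"source": source, "labels": labels or [],
                 "extract_ocr": extract_ocr}
    if description is not None:
        arguments["description"] = description

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "vision_capture", "arguments": arguments},
    }


request = build_capture_request(
    {"type": "file", "path": "/tmp/login.png"},  # example path, not from the docs
    description="Login page before submit",
    labels=["login", "baseline"],
    extract_ocr=True,
)
print(json.dumps(request, indent=2))
```

Validating the source object before sending keeps malformed requests from reaching the server at all.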

vision_query

Search captures by filters.

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| description_contains | string | no | | Substring match on capture descriptions. |
| labels | string[] | no | | Filter by label tags. |
| min_quality | number | no | | Minimum quality score (0.0-1.0). |
| sort_by | string | no | "recent" | Sort order: recent or quality. |
| max_results | number | no | 20 | Maximum captures to return. |
| before | number | no | | Unix timestamp upper bound. |
| after | number | no | | Unix timestamp lower bound. |
| session_ids | number[] | no | | Filter by session. |

Returns: array of capture metadata objects.
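A common query is "everything from the last N hours", which means computing the `after` and `before` bounds yourself. The sketch below builds the arguments object for such a query. The helper `recent_query_args` is hypothetical, and it assumes the timestamps are in seconds, which the table does not state.

```python
import time


def recent_query_args(hours, labels=None, min_quality=None,
                      sort_by="recent", max_results=20, now=None):
    """Arguments for vision_query covering the last `hours` hours.

    Assumption: `before`/`after` are Unix timestamps in seconds.
    """
    if sort_by not in ("recent", "quality"):
        raise ValueError("sort_by must be 'recent' or 'quality'")
    if min_quality is not None and not 0.0 <= min_quality <= 1.0:
        raise ValueError("min_quality must be within 0.0-1.0")

    now = int(time.time()) if now is None else int(now)
    args = {
        "after": now - int(hours * 3600),
        "before": now,
        "sort_by": sort_by,
        "max_results": max_results,
    }
    # Optional filters are omitted entirely when unset.
    if labels:
        args["labels"] = list(labels)
    if min_quality is not None:
        args["min_quality"] = min_quality
    return args


# A fixed `now` keeps the example reproducible.
args = recent_query_args(24, labels=["login"], min_quality=0.6,
                         now=1_700_000_000)
print(args)
```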

vision_similar

Find visually similar captures using CLIP embedding distance.

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| capture_id | number | no | | Find captures similar to this one. |
| embedding | number[] | no | | Or provide a raw embedding vector. |
| top_k | number | no | 10 | Number of results. |
| min_similarity | number | no | 0.5 | Minimum cosine similarity threshold. |
| event_types | string[] | no | | Filter by event type. |

Returns: array of { capture_id, similarity_score, metadata }.
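To make `top_k` and `min_similarity` concrete, here is a minimal sketch of cosine-similarity ranking over stored embeddings. It illustrates the general technique only and is not the server's implementation. The in-memory `store` dictionary and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_similar(query, embeddings, top_k=10, min_similarity=0.5):
    """Return up to `top_k` matches at or above `min_similarity`, best first."""
    scored = [
        {"capture_id": cid, "similarity_score": cosine_similarity(query, vec)}
        for cid, vec in embeddings.items()
    ]
    kept = [s for s in scored if s["similarity_score"] >= min_similarity]
    kept.sort(key=lambda s: s["similarity_score"], reverse=True)
    return kept[:top_k]


# Toy 3-dimensional vectors; real CLIP embeddings have hundreds of dimensions.
store = {1: [1.0, 0.0, 0.0], 2: [0.9, 0.1, 0.0], 3: [0.0, 1.0, 0.0]}
results = rank_similar([1.0, 0.0, 0.0], store, top_k=2)
print([r["capture_id"] for r in results])
```

Capture 3 is orthogonal to the query, so its similarity of 0.0 falls below the default threshold and it is dropped before the `top_k` cut.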

vision_compare

Compare two captures for visual similarity.

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id_a | number | yes | | First capture ID. |
| id_b | number | yes | | Second capture ID. |
| detailed | boolean | no | false | Include detailed diff data. |

Returns: { similarity_score, dimensions_match, summary }.

vision_diff

Pixel-level diff between two captures.

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id_a | number | yes | | First capture ID. |
| id_b | number | yes | | Second capture ID. |

Returns: { changed_pixel_count, total_pixels, change_percentage, bounding_boxes }.
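The returned fields relate to one another in a simple way, which the sketch below shows on tiny grayscale grids. It is an illustration under assumptions, not the tool's algorithm: it expresses `change_percentage` on a 0-100 scale, emits a single enclosing box, and borrows the `{ x, y, w, h }` box shape from vision_track. None of these details are confirmed for vision_diff.

```python
def pixel_diff(image_a, image_b):
    """Diff two equally sized images given as rows of pixel values."""
    if len(image_a) != len(image_b) or any(
        len(ra) != len(rb) for ra, rb in zip(image_a, image_b)
    ):
        raise ValueError("images must have identical dimensions")

    # Coordinates of every pixel whose value differs.
    changed = [
        (x, y)
        for y, (ra, rb) in enumerate(zip(image_a, image_b))
        for x, (pa, pb) in enumerate(zip(ra, rb))
        if pa != pb
    ]
    total = sum(len(row) for row in image_a)

    boxes = []
    if changed:
        xs = [x for x, _ in changed]
        ys = [y for _, y in changed]
        # One box enclosing every changed pixel (a simplification).
        boxes.append({"x": min(xs), "y": min(ys),
                      "w": max(xs) - min(xs) + 1,
                      "h": max(ys) - min(ys) + 1})

    return {
        "changed_pixel_count": len(changed),
        "total_pixels": total,
        # Assumed scale: 0-100.
        "change_percentage": 100.0 * len(changed) / total if total else 0.0,
        "bounding_boxes": boxes,
    }


before = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
after = [[0, 0, 0, 0], [0, 9, 9, 0], [0, 0, 9, 0]]
diff = pixel_diff(before, after)
print(diff)
```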

vision_ocr

Extract text from a capture using OCR.

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| capture_id | number | yes | | Capture to extract text from. |
| language | string | no | "eng" | OCR language code. |

Returns: { text, confidence, regions }.

vision_track

Configure tracking for a UI region. Captures must be triggered externally.

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| region | object | yes | | { x, y, w, h } in pixels. |
| interval_ms | number | no | 1000 | Minimum interval between captures. |
| max_captures | number | no | 100 | Stop after this many captures. |
| on_change_threshold | number | no | 0.95 | Similarity threshold; below this counts as a change. |

Returns: { track_id, region, status }.
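Because captures are triggered externally, the caller typically decides when a new frame is worth recording. The sketch below shows one plausible way the three tuning parameters could combine. The function `should_record` is hypothetical, and the order in which the server applies these checks is an assumption.

```python
def should_record(similarity, elapsed_ms, captured_so_far,
                  interval_ms=1000, max_captures=100,
                  on_change_threshold=0.95):
    """Decide whether a newly triggered frame counts as a recorded change.

    similarity:       similarity of the new frame to the previous one (0.0-1.0)
    elapsed_ms:       time since the previous capture
    captured_so_far:  captures already recorded for this track
    """
    if captured_so_far >= max_captures:
        return False  # tracking has reached its capture limit
    if elapsed_ms < interval_ms:
        return False  # too soon after the previous capture
    return similarity < on_change_threshold  # below threshold means changed


print(should_record(0.90, 1500, 3))   # changed enough, interval satisfied
print(should_record(0.99, 1500, 3))   # nearly identical frame
print(should_record(0.90, 200, 3))    # inside the minimum interval
```

Note the direction of the comparison: a lower `on_change_threshold` makes tracking less sensitive, since frames must differ more before they count as changed.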

Link a visual capture to an AgenticMemory node.

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| capture_id | number | yes | | Capture to link. |
| memory_node_id | number | yes | | Target memory node ID. |
| relationship | string | no | "observed_during" | One of: observed_during, evidence_for, screenshot_of. |

Returns: { link_id, capture_id, memory_node_id, relationship }.

vision_health

Evaluate visual memory reliability.

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| low_quality_threshold | number | no | 0.45 | Below this is flagged as low quality. |
| stale_after_hours | number | no | 168 | Hours before a capture is considered stale. |
| max_examples | number | no | 20 | Max examples per category. |

Returns: { total_captures, low_quality, stale, unlinked, unlabeled }.
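The thresholds above can be read as simple per-capture checks. The sketch below classifies a list of capture records into the four reported categories. The record fields (`links`, `labels`, `timestamp` in seconds) and the `count`/`examples` shape of each category are assumptions made for illustration; the actual response layout is not specified here.

```python
def health_report(captures, now, low_quality_threshold=0.45,
                  stale_after_hours=168, max_examples=20):
    """Bucket captures into low_quality, stale, unlinked, and unlabeled."""
    stale_cutoff = now - stale_after_hours * 3600
    buckets = {"low_quality": [], "stale": [], "unlinked": [], "unlabeled": []}

    for cap in captures:
        cid = cap["capture_id"]
        if cap["quality_score"] < low_quality_threshold:
            buckets["low_quality"].append(cid)
        if cap["timestamp"] < stale_cutoff:
            buckets["stale"].append(cid)
        if not cap.get("links"):    # hypothetical field: linked memory nodes
            buckets["unlinked"].append(cid)
        if not cap.get("labels"):
            buckets["unlabeled"].append(cid)

    report = {"total_captures": len(captures)}
    for name, ids in buckets.items():
        # Full count, but only the first `max_examples` IDs as examples.
        report[name] = {"count": len(ids), "examples": ids[:max_examples]}
    return report


NOW = 1_700_000_000
captures = [
    {"capture_id": 1, "quality_score": 0.90, "timestamp": NOW - 3600,
     "labels": ["login"], "links": [7]},
    {"capture_id": 2, "quality_score": 0.30, "timestamp": NOW - 200 * 3600,
     "labels": [], "links": []},
]
report = health_report(captures, now=NOW)
print(report)
```

A single capture can land in several categories at once, as capture 2 does here.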

session_start

Start a new vision session.

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| session_id | number | no | auto | Explicit session ID. |

Returns: { session_id }.

session_end

End the current vision session and persist.

Returns: { session_id, capture_count }.

Quality score

Every capture receives a quality score (0.0-1.0) computed from:

  • Resolution: higher resolution scores higher.
  • Embedding confidence: CLIP embedding norm.
  • Metadata completeness: presence of description and labels.
  • OCR yield: text extraction success rate (if OCR was requested).

CLI subcommands

agentic-vision-mcp serve       # Start MCP server
agentic-vision-mcp validate    # Validate artifact path
agentic-vision-mcp info        # List available tools and capabilities
agentic-vision-mcp completions # Shell completions
agentic-vision-mcp repl        # Interactive REPL

See also