
Overview

Visual search is Jumper’s core capability for finding specific visual content in your footage. Unlike traditional keyword-based search, visual search uses machine learning models to understand the semantic meaning of what you’re looking for and match it to frames in your media.

When you perform a visual search, Jumper analyzes your query (whether text or an image) and compares it against the visual understanding it has already extracted from your analyzed footage. This lets you find content based on what things look like or what’s happening in the scene, not just metadata or filenames.
Visual search doesn’t use tags or metadata. Jumper analyzes the actual visual content of your footage frame by frame, so you can search for anything visible in your media without needing to tag or label it first.
Text-based visual search lets you describe what you’re looking for using natural language. Jumper’s visual search models understand the semantic meaning of your query and find frames that match that meaning, even when the exact words don’t appear in the footage.

How it works

When you search for something like “a person walking through a door,” the model understands:
  • What “a person” looks like visually
  • What “walking” means in terms of motion and pose
  • What “a door” is and how it appears in different contexts
  • How these elements relate to each other in a scene
The model doesn’t just match keywords. It understands concepts, relationships, and context. This means you can search using conversational, descriptive language and still get accurate results.
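The underlying idea can be sketched as embedding both the query and each frame into a shared vector space and ranking frames by similarity. Below is a toy illustration of that principle; the embeddings, frame names, and scores are all made up for the example, and Jumper’s actual models and matching are internal to the product:

```python
from math import sqrt

def cosine_similarity(a, b):
    # Cosine similarity: 1.0 means the vectors point the same way.
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical embeddings: a real model maps text queries and video frames
# into the same vector space; these are hand-made toy vectors.
query_embedding = [0.9, 0.1, 0.3]  # e.g. "a person walking through a door"
frame_embeddings = {
    "frame_0012": [0.85, 0.15, 0.35],  # visually close to the query
    "frame_0480": [-0.2, 0.9, 0.1],    # unrelated content
}

best = max(frame_embeddings,
           key=lambda f: cosine_similarity(query_embedding, frame_embeddings[f]))
print(best)  # frame_0012
```

Because matching happens in this shared semantic space rather than on keywords, a conversational description can land on the right frames even when none of the words appear anywhere in the footage or its filenames.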

Language support

Text-based search works best in English, but if you’ve analyzed your footage with a multilingual model, Jumper can understand search queries in approximately 80 languages. The model recognizes the same visual concepts regardless of the language you use to describe them. For example, searching for white cat, gato blanco, or 白い猫 will likely return the same results because the model understands these are all descriptions of the same visual concept.
Image-based search works differently from text search. Instead of describing what you’re looking for with words, you provide an image, and Jumper finds scenes in your footage that are visually similar to that image.

How it works

When you search by image, Jumper’s models analyze the visual characteristics of your search image (colors, composition, objects, people, settings, and overall visual style) and find frames in your footage that share similar visual properties. This is particularly powerful for finding:
  • More footage of the same person (even from different angles or lighting)
  • Similar locations or settings
  • Shots with matching visual style or composition
  • Objects or scenes that look similar but might be described differently
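One way to picture image-based matching: the search image is embedded with the same kind of model as the footage, and frames are ranked by how close their embeddings are, so a shot of the same person or location can score high even if it would be described with different words. A minimal sketch, with hand-made vectors and hypothetical frame names (not Jumper internals):

```python
from math import sqrt

def nearest_frames(query_vec, frame_vecs, top_k=2):
    # Rank frames by cosine similarity to the search image's embedding.
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))
    return sorted(frame_vecs,
                  key=lambda name: cos(query_vec, frame_vecs[name]),
                  reverse=True)[:top_k]

# Hypothetical embeddings for a reference photo and three frames.
reference_photo = [0.7, 0.7, 0.1]
frames = {
    "same_location_other_angle": [0.65, 0.75, 0.05],
    "similar_composition": [0.5, 0.6, 0.4],
    "unrelated_shot": [0.0, -0.9, 0.4],
}
print(nearest_frames(reference_photo, frames))
```

An iterative refinement like Find Similar amounts to running this ranking again with a result’s own embedding as the next query.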

Image sources

Jumper provides several ways to use images for search, each designed for different workflow scenarios:
  • Match Source/Event Monitor: Uses the frame currently visible in your editing software’s Source Monitor (or Event Viewer in Final Cut Pro). Useful for finding similar shots elsewhere in your footage while reviewing a clip.
  • Match Timeline: Uses the frame currently visible in your Timeline Monitor (or Timeline Viewer in Final Cut Pro). Helps you find alternative takes or similar shots to what’s already in your timeline.
  • Match Pasteboard: Uses any image you’ve copied to your system’s clipboard. Allows searching with screenshots, reference photos, images from the web, or frames captured from other applications.
  • Find Similar: Uses a thumbnail from your existing search results. Lets you refine your search, finding more scenes that match a specific result. This enables an iterative workflow where each result can become the starting point for further search.

Search results

When Jumper performs a visual search, it returns results as scenes: segments of media files with a defined start time and end time. Each scene represents a continuous portion of a file where the search query matches the visual content. Scenes are detected automatically by analyzing where the visual content matching your query begins and ends, so you get the complete segments where your search term is relevant, making it easy to add entire shots to your timeline rather than just single frames.
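The frames-to-scenes step can be sketched as grouping consecutive matching frames into start/end ranges. This is an illustrative reconstruction under stated assumptions (per-frame match flags and a known frame rate), not Jumper’s actual algorithm:

```python
def frames_to_scenes(match_flags, fps=25.0):
    """Group consecutive matching frames into (start_sec, end_sec) scenes.

    match_flags: per-frame booleans, True where the frame matches the query.
    Illustrative sketch only -- not Jumper's internal scene detection.
    """
    scenes = []
    start = None
    for i, matched in enumerate(match_flags):
        if matched and start is None:
            start = i                              # a new scene begins
        elif not matched and start is not None:
            scenes.append((start / fps, i / fps))  # the scene just ended
            start = None
    if start is not None:                          # scene runs to the last frame
        scenes.append((start / fps, len(match_flags) / fps))
    return scenes

# Frames 2-4 and frame 7 match the query, at 25 fps.
flags = [False, False, True, True, True, False, False, True]
print(frames_to_scenes(flags))  # [(0.08, 0.2), (0.28, 0.32)]
```

Returning ranges rather than individual frame hits is what lets a result be dropped into a timeline as a complete shot.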
Last modified on January 28, 2026