Choosing the Right Model

Depending on your workflow, use case, and hardware, you may want a model other than the default. Some suggestions are listed below, or you can explore the full list of available models.

Fastest

V2 Medium: Quick analysis with good accuracy at 256×256 resolution. Great for most workflows.

Fast

V2 XLarge: Larger model with higher accuracy at 256×256 resolution.

Accurate

V1 Multilingual high-res: Top-tier accuracy with 384×384 resolution. Benefits from more detailed and verbose searches, great multilingual support.

Most Accurate

V2 Multilingual x-high-res: Highest accuracy with 512×512 resolution. Benefits from more detailed and verbose searches, great multilingual support.
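
If you keep project notes or scripts alongside your footage, the four suggestions above can be written down as a small lookup. This is only an illustrative sketch: the dictionary and the `suggest_model` function are hypothetical names, not part of Jumper, and the values are simply transcribed from the suggestions above.

```python
# Quick picks transcribed from the suggestions above: (model name, input resolution).
# QUICK_PICKS and suggest_model are hypothetical helpers, not a Jumper API.
QUICK_PICKS = {
    "fastest": ("V2 Medium", 256),
    "fast": ("V2 XLarge", 256),
    "accurate": ("V1 Multilingual high-res", 384),
    "most accurate": ("V2 Multilingual x-high-res", 512),
}


def suggest_model(priority: str) -> tuple[str, int]:
    """Return the suggested model and its resolution for a given priority."""
    key = priority.strip().lower()
    if key not in QUICK_PICKS:
        raise ValueError(f"Unknown priority {priority!r}; expected one of {sorted(QUICK_PICKS)}")
    return QUICK_PICKS[key]


if __name__ == "__main__":
    name, resolution = suggest_model("Most Accurate")
    print(f"{name} ({resolution}x{resolution})")
```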

Visual models

This is a more comprehensive list of the models and their properties.
Model Name (Display) | Resolution | Size on Disk | Information
V2 Medium (Default) | 256×256 | ~1.5GB | Default, bundled in app installer. Balanced accuracy vs. performance, with improved semantic understanding from the V2 model improvements. Works well on most modern computers.
V2 Multilingual x-high-res | 512×512 | ~2GB | NEW, MOST ACCURATE MODEL 💪 Fantastic search result quality in both non-English and English languages. Benefits from more detailed and “verbose” searches, e.g. “a metal sign saying XYZ” instead of just “XYZ”.
V1 Multilingual high-res | 384×384 | ~4GB | Very accurate search result quality in both non-English and English languages. Benefits from more detailed and “verbose” searches, e.g. “a metal sign saying XYZ” instead of just “XYZ”.
V2 Medium high-res | 384×384 | ~1.5GB | Same parameter count as V2 Medium, but analyzes frames at 384×384 for more image detail. Good if you need slightly finer detail than 256×256, but be aware that it requires more RAM/VRAM.
V2 Medium x-high-res | 512×512 | ~1.5GB | Same parameter count as V2 Medium, but at an even higher resolution (512×512). Ideal for text detection/OCR or detailed reverse image searches. Uses significantly more memory and produces larger analysis files.
V2 Large | 256×256 | ~3.3GB | Larger model with higher accuracy in challenging scenarios.
V2 Large high-res | 384×384 | ~3.3GB | Same as V2 Large but with higher resolution (384×384). Ideal for text detection/OCR or detailed reverse image searches.
V2 Large x-high-res | 512×512 | ~3.3GB | Same parameter count as V2 Large, but at 512×512 input resolution. Ideal for text detection/OCR or detailed reverse image searches.
V2 XLarge | 256×256 | ~4.5GB | Even larger V2 model. Offers improved recognition of subtle elements. Ideal if you want top-tier accuracy and have the computer to run it.
V2 XLarge high-res | 384×384 | ~4.5GB | A higher-resolution variant of V2 XLarge.
V2 XLarge x-high-res | 512×512 | ~4.5GB | The heaviest V2 model in terms of resolution and parameter requirements. Even more accurate, even more resource demanding.
Medium | 256×256 | ~812MB | Legacy V1 model and the previous default, formerly bundled with the application. Reasonably accurate but outperformed by V2 Medium in most scenarios. If your hardware can handle V2 Medium, prefer that instead.
Medium x-high-res | 512×512 | ~812MB | Same as V1 Medium but at 512×512. Useful for text detection, reverse image searches, etc. Produces larger analysis files than some bigger V1 models, purely due to the higher frame resolution.
Large multilingual | 256×256 | ~1.48GB | V1 multilingual version that improves accuracy for non-English text. V2 equivalents are multilingual by default.
Large | 256×256 | ~2.61GB | Larger V1 model. Prefer V2 alternatives.
Large high-res | 384×384 | ~2.61GB | Same as V1 Large but at higher resolution. Demands more resources.
XLarge high-res | 384×384 | ~3.51GB | Largest V1 model. High accuracy, but overshadowed by the new V2 XLarge. Recommended only for legacy compatibility if you can’t run V2.
XLarge multilingual | 256×256 | ~4.51GB | Largest V1 multilingual model. Very high accuracy for non-English searches. V2 models are multilingual by default, but V1 multilingual models can be slightly better than V2 models on certain languages.
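
To compare options against your own constraints, the V2 rows of the list above can be treated as data and filtered by minimum resolution and available disk space. The sketch below is an assumption-laden illustration, not something Jumper provides: `VisualModel`, `candidates`, and `pixel_ratio` are hypothetical names, and the numbers are copied from the list. The `pixel_ratio` helper only computes how many more pixels per frame a higher resolution feeds the model; it is a rough indicator of the extra work involved, not a measured memory or file-size figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisualModel:
    name: str
    resolution: int   # edge length of the square input, in pixels
    size_gb: float    # approximate size on disk, from the list above


# V2 rows transcribed from the list above (hypothetical data structure).
V2_MODELS = [
    VisualModel("V2 Medium", 256, 1.5),
    VisualModel("V2 Medium high-res", 384, 1.5),
    VisualModel("V2 Medium x-high-res", 512, 1.5),
    VisualModel("V2 Multilingual x-high-res", 512, 2.0),
    VisualModel("V2 Large", 256, 3.3),
    VisualModel("V2 Large high-res", 384, 3.3),
    VisualModel("V2 Large x-high-res", 512, 3.3),
    VisualModel("V2 XLarge", 256, 4.5),
    VisualModel("V2 XLarge high-res", 384, 4.5),
    VisualModel("V2 XLarge x-high-res", 512, 4.5),
]


def candidates(min_resolution: int, max_size_gb: float) -> list[str]:
    """Names of V2 models meeting a resolution floor and a disk-size ceiling."""
    return [
        m.name
        for m in V2_MODELS
        if m.resolution >= min_resolution and m.size_gb <= max_size_gb
    ]


def pixel_ratio(resolution: int, baseline: int = 256) -> float:
    """Pixels per frame relative to the baseline resolution."""
    return (resolution / baseline) ** 2


if __name__ == "__main__":
    print(candidates(min_resolution=512, max_size_gb=2.0))
    print(pixel_ratio(384), pixel_ratio(512))
```

For example, a 512×512 model processes four times as many pixels per frame as a 256×256 model, and a 384×384 model processes 2.25 times as many, which is consistent with the higher memory use noted for the high-res variants.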

Speech models

For transcriptions, Jumper uses Whisper models developed by OpenAI. The exact model depends on your platform.
Platform | Model Variant | Size on Disk | Notes
Windows / Intel Mac | whisper-large-v3-turbo | ~1.62GB | Bundled with Windows and Intel Mac installers.
Apple M-series Macs | whisper-large-v3-turbo | ~467MB | Uses a quantized version converted to Apple’s MLX framework for hardware acceleration (faster analysis, smaller model size). Bundled with Apple M-series installer.
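
If you are unsure which row applies to your machine, Python's standard `platform` module can report the operating system and CPU architecture. The following is a minimal sketch under stated assumptions: `expected_whisper_variant` is a hypothetical helper, the mapping is copied from the platform list above, and it does not reflect how Jumper itself detects the platform.

```python
import platform
from typing import Optional, Tuple


def expected_whisper_variant(
    system: Optional[str] = None, machine: Optional[str] = None
) -> Optional[Tuple[str, str]]:
    """Return (variant description, approximate size) per the list above.

    Returns None for platforms the list does not cover.
    """
    system = system or platform.system()     # e.g. "Windows", "Darwin"
    machine = machine or platform.machine()  # e.g. "AMD64", "x86_64", "arm64"

    if system == "Darwin" and machine == "arm64":
        # Apple M-series Macs: quantized MLX conversion.
        return ("whisper-large-v3-turbo (quantized, MLX)", "~467MB")
    if system in ("Windows", "Darwin"):
        # Windows and Intel Macs.
        return ("whisper-large-v3-turbo", "~1.62GB")
    return None


if __name__ == "__main__":
    print(expected_whisper_variant())
```

Note that a Python interpreter running under Rosetta on an M-series Mac can report `x86_64`, so treat the result as a hint rather than a definitive answer.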