Models

Browse ready-made models for Argmax Pro SDK.

Please refer to OpenBench for open-source, reproducible benchmarks of latency and accuracy against competing systems.

openai/whisper-large-v3-turbo

The most recent iteration of Whisper.

Size: 626 MB
Capabilities: Prerecorded Transcription, Real-time Transcription, Language Detection
Languages: multilingual (en, zh, de, es, ru, ko, fr, ja, +92 more)

nvidia/parakeet-v2

A frontier model that surpasses OpenAI Whisper Large V3 Turbo on English speech-to-text accuracy while being ~9x faster.

Size: 476 MB
Capabilities: Real-time Transcription, Prerecorded Transcription
Languages: en

nvidia/parakeet-v3

The most recent iteration of Parakeet. Same size and speed as Parakeet V2, but supports 24 more languages.

Size: 494 MB
Capabilities: Real-time Transcription, Prerecorded Transcription
Languages: en, de, es, fr, nl, it, da, et, +17 more

pyannote/flagship

A frontier model for speaker diarization ("who spoke when") with state-of-the-art accuracy, measured by diarization error rate (DER), on 13 OpenBench datasets.

Size: 90 MB
Capabilities: Speaker Diarization
Languages: language-agnostic
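
The pyannote models in this list are compared by DER. As a reminder of what that metric measures, here is a minimal sketch of the standard definition: the durations of false alarm, missed detection, and speaker confusion, summed and divided by the total duration of reference speech. The function name and the example durations are illustrative assumptions and are not taken from OpenBench.

```python
def diarization_error_rate(false_alarm, missed_detection, confusion, total_reference):
    """Return DER as a fraction, given durations in seconds.

    false_alarm:      non-speech time that the system labeled as speech
    missed_detection: reference speech time that the system missed
    confusion:        speech time attributed to the wrong speaker
    total_reference:  total speech time in the reference annotation
    """
    if total_reference <= 0:
        raise ValueError("total_reference must be positive")
    return (false_alarm + missed_detection + confusion) / total_reference


# Hypothetical durations, for illustration only.
der = diarization_error_rate(
    false_alarm=2.0,
    missed_detection=3.0,
    confusion=5.0,
    total_reference=100.0,
)
print(f"DER = {der:.1%}")
```

Lower DER is better. Because false alarms count toward the numerator, DER can exceed 100%.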

pyannote/open-source

An open-source model for speaker diarization ("who spoke when") with the second-best accuracy, measured by diarization error rate (DER), on 13 OpenBench datasets.

Size: 15 MB
Capabilities: Speaker Diarization
Languages: language-agnostic
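
To make it easier to pick from the models listed above, the following sketch encodes the list as plain data and filters it by capability and download size. It is not the Argmax Pro SDK API: `ModelEntry`, `CATALOG`, and `find_models` are hypothetical names introduced only for illustration, and the sizes and capabilities are copied from the entries on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    # Hypothetical record type; not part of any SDK.
    name: str
    size_mb: int
    capabilities: frozenset


# Values transcribed from the model list on this page.
CATALOG = [
    ModelEntry(
        "openai/whisper-large-v3-turbo",
        626,
        frozenset({"Prerecorded Transcription", "Real-time Transcription", "Language Detection"}),
    ),
    ModelEntry(
        "nvidia/parakeet-v2",
        476,
        frozenset({"Real-time Transcription", "Prerecorded Transcription"}),
    ),
    ModelEntry(
        "nvidia/parakeet-v3",
        494,
        frozenset({"Real-time Transcription", "Prerecorded Transcription"}),
    ),
    ModelEntry("pyannote/flagship", 90, frozenset({"Speaker Diarization"})),
    ModelEntry("pyannote/open-source", 15, frozenset({"Speaker Diarization"})),
]


def find_models(catalog, capability, max_size_mb=None):
    """Return entries that offer `capability`, smallest first.

    If `max_size_mb` is given, entries larger than that are excluded.
    """
    matches = [
        entry
        for entry in catalog
        if capability in entry.capabilities
        and (max_size_mb is None or entry.size_mb <= max_size_mb)
    ]
    return sorted(matches, key=lambda entry: entry.size_mb)


# Example: real-time transcription models that fit within a 500 MB budget.
for entry in find_models(CATALOG, "Real-time Transcription", max_size_mb=500):
    print(entry.name, entry.size_mb, "MB")
```

Language coverage is left out of the sketch on purpose, since the page lists only the first few language codes for the multilingual models.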