Models
Browse ready-made models for Argmax Pro SDK.
Please refer to OpenBench for open-source, reproducible competitive benchmarks of latency and accuracy.
openai/whisper-large-v3-turbo
The most recent iteration of Whisper.
626 MB
Prerecorded Transcription
Real-time Transcription
Language Detection
multilingual
en
zh
de
es
ru
ko
fr
ja
+92 more
nvidia/parakeet-v2
A frontier model that surpasses OpenAI Whisper Large V3 Turbo on English speech-to-text accuracy while being ~9x faster.
476 MB
Real-time Transcription
Prerecorded Transcription
en
nvidia/parakeet-v3
The most recent iteration of Parakeet. Same size and speed as Parakeet V2, but supports 24 more languages.
494 MB
Real-time Transcription
Prerecorded Transcription
en
de
es
fr
nl
it
da
et
+17 more
pyannote/flagship
A frontier model for speaker diarization ("who spoke when") with state-of-the-art accuracy, measured by diarization error rate (DER), on 13 datasets on OpenBench.
90 MB
Speaker Diarization
language-agnostic
pyannote/open-source
An open-source model for speaker diarization ("who spoke when") with second-best accuracy, measured by diarization error rate (DER), on 13 datasets on OpenBench.
15 MB
Speaker Diarization
language-agnostic
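
The entries above can be treated as plain data when choosing a model for a given task and size budget. The following is a minimal Python sketch, not part of the Argmax Pro SDK: `ModelEntry`, `CATALOG`, `pick_models`, and the capability keys are hypothetical names introduced for illustration. Only the model names, sizes, and capabilities are taken from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """One catalog entry (hypothetical structure, for illustration only)."""
    name: str
    size_mb: int
    capabilities: frozenset


# Names, sizes, and capabilities transcribed from the catalog above.
# The capability keys ("realtime", "prerecorded", ...) are assumed labels.
CATALOG = [
    ModelEntry("openai/whisper-large-v3-turbo", 626,
               frozenset({"prerecorded", "realtime", "language-detection"})),
    ModelEntry("nvidia/parakeet-v2", 476,
               frozenset({"realtime", "prerecorded"})),
    ModelEntry("nvidia/parakeet-v3", 494,
               frozenset({"realtime", "prerecorded"})),
    ModelEntry("pyannote/flagship", 90,
               frozenset({"diarization"})),
    ModelEntry("pyannote/open-source", 15,
               frozenset({"diarization"})),
]


def pick_models(capability, max_size_mb):
    """Return entries that offer `capability` and fit the size budget,
    ordered from smallest to largest."""
    matches = [
        m for m in CATALOG
        if capability in m.capabilities and m.size_mb <= max_size_mb
    ]
    return sorted(matches, key=lambda m: m.size_mb)


if __name__ == "__main__":
    for model in pick_models("realtime", 500):
        print(f"{model.name}: {model.size_mb} MB")
```

For example, `pick_models("realtime", 500)` returns the two Parakeet entries, smallest first, because Whisper Large V3 Turbo (626 MB) exceeds the budget. Language coverage is left out of the sketch, since the catalog lists only part of each model's language set.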