Models
Browse ready-made models for Argmax Pro SDK.
Please refer to OpenBench for open-source, reproducible benchmarks comparing latency and accuracy.
openai/whisper-large-v3-turbo
The most recent iteration of Whisper.
nvidia/parakeet-v2
A frontier model that surpasses OpenAI Whisper Large V3 Turbo on English speech-to-text accuracy while being ~9x faster.
nvidia/sortformer-v2-1
A frontier real-time speaker diarization model that surpasses top cloud APIs on accuracy.
nvidia/parakeet-v3
The most recent iteration of Parakeet. Same size and speed as Parakeet V2, but supports 24 more languages.
pyannote/precision
A frontier model for speaker diarization ("who spoke when") with state-of-the-art accuracy (DER) on 13 datasets on OpenBench.
pyannote/community
An open-source model for speaker diarization ("who spoke when") with second-best accuracy (DER) on 13 datasets on OpenBench.
qwen/qwen3-tts-0.6b
A multilingual text-to-speech model with voice cloning and low-latency streaming support.
qwen/qwen3-tts-1.7b
A frontier multilingual text-to-speech model with voice cloning and voice design from text instructions.