Managing Model Files
Picking, downloading, and compiling models in your application
Picking the Model
Model Gallery lists all models supported by Argmax SDK. The following sections summarize considerations when picking the best speech-to-text model for your application.
Custom Models. If you need to bring your own custom model checkpoints for supported architectures, please see Bring Your Own Models, or reach out to Argmax on your Slack support channel or over email.
Nvidia Parakeet v2
This model is 9x faster than Whisper Large v3 Turbo on English speech-to-text and achieves slightly higher accuracy.
We recommend using this model for all applications that are English-only.
let config = WhisperKitProConfig(
    model: "parakeet-v2_476MB",
    modelRepo: .parakeetRepo
)
iOS must use compressed models. Please use parakeet-v2_476MB instead of parakeet-v2 for iOS apps. This compressed model is benchmarked and verified to achieve an accuracy within 0.5% of the original non-compressed model.
Nvidia Parakeet v3
This model achieves the same speed as Nvidia Parakeet v2 but supports 25 European languages with dynamic switching: en, de, es, fr, nl, it, da, et, fi, el, hu, lv, lt, mt, pl, pt, ro, sk, sl, sv, ru, uk, bg, hr, cs.
We recommend this model for any application that needs at least one of the 25 languages above, unless the application is English-only.
let config = WhisperKitProConfig(
    model: "parakeet-v3_494MB",
    modelRepo: .parakeetRepo
)
OpenAI Whisper Large v3 Turbo
The original models for WhisperKit (now called Argmax OSS) are hosted under the .openSourceRepo repository.
Argmax Pro SDK hosts a second set of Whisper models under the .proRepo repository that are further optimized for speed and energy-efficiency compared to their .openSourceRepo counterparts. During this upgrade, accuracy remains identical while speed and energy-efficiency improve significantly.
Usage:
let config = WhisperKitProConfig(
    model: "large-v3-v20240930_626MB",
    modelRepo: .proRepo // or .openSourceRepo
)
OS Compatibility. Note that .proRepo models support iOS 18/macOS 15 and newer. For users still on iOS 17/macOS 14, please fall back to .openSourceRepo counterparts.
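The recommendations above can be condensed into a small selection helper. This is only a sketch and not part of Argmax SDK: ModelChoice, recommendedModel, and the supportsProRepo flag are hypothetical names, and repositories are represented as plain strings so the snippet runs without the SDK. Falling back to Whisper for languages outside the Parakeet v3 list is an assumption of this sketch; only the model and repository names come from this page.

```swift
import Foundation

/// Hypothetical value type for this sketch; in the SDK these values go into WhisperKitProConfig.
struct ModelChoice: Equatable {
    let model: String
    let repo: String
}

/// The 25 languages listed for Nvidia Parakeet v3 on this page.
let parakeetV3Languages: Set<String> = [
    "en", "de", "es", "fr", "nl", "it", "da", "et", "fi", "el", "hu", "lv", "lt",
    "mt", "pl", "pt", "ro", "sk", "sl", "sv", "ru", "uk", "bg", "hr", "cs",
]

let englishOnly: Set<String> = ["en"]

/// Picks a model from the languages the app must transcribe.
/// `supportsProRepo` should be true on iOS 18/macOS 15 and newer.
func recommendedModel(languages: Set<String>, supportsProRepo: Bool) -> ModelChoice {
    if languages.isSubset(of: englishOnly) {
        // English-only: Parakeet v2 (compressed variant, as required on iOS).
        return ModelChoice(model: "parakeet-v2_476MB", repo: "parakeetRepo")
    }
    if languages.isSubset(of: parakeetV3Languages) {
        // Every requested language is covered by Parakeet v3.
        return ModelChoice(model: "parakeet-v3_494MB", repo: "parakeetRepo")
    }
    // Assumption of this sketch: anything else falls back to Whisper Large v3 Turbo.
    return ModelChoice(
        model: "large-v3-v20240930_626MB",
        repo: supportsProRepo ? "proRepo" : "openSourceRepo"
    )
}
```

An empty language set is treated as English-only here; adjust that default if your application detects languages at runtime.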
Downloading the Model
Initialize Argmax SDK
Argmax SDK requires initialization with an Argmax API key (starts with ax_***, not axst_***) to unlock Pro models and features.
We recommend fetching your API key securely from your backend in production. However, we provide a simple obfuscator to protect against casual inspection and static analysis tools.
Assuming you use ObfuscatedKeyProvider.generateCodeExample to obfuscate your API key, you may initialize Argmax SDK as follows:
var keyProvider = ObfuscatedKeyProvider(mask: 37) // placeholder values
keyProvider.apiKeyObfuscated = [4, 5, 6] // placeholder values
guard let apiKey = keyProvider.apiKey else {
    fatalError("Missing API key")
}
await ArgmaxSDK.with(ArgmaxConfig(apiKey: apiKey))
Note that ArgmaxSDK.with(ArgmaxConfig(apiKey: apiKey)) requires an internet connection during first use and at least once every 30 days to maintain an active license. See this documentation page to learn more.
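Since the SDK expects a key that starts with ax_ and not axst_, a quick prefix check before initialization can catch a pasted key of the wrong kind early. The helper below is a hypothetical sketch: looksLikeArgmaxAPIKey is not an SDK function, and it checks only the prefix rule stated above, not whether the key is actually valid.

```swift
import Foundation

/// Hypothetical helper: verifies only the documented prefix rule (ax_, not axst_).
/// It cannot tell whether Argmax will accept the key.
func looksLikeArgmaxAPIKey(_ key: String) -> Bool {
    let trimmed = key.trimmingCharacters(in: .whitespacesAndNewlines)
    // "axst_..." fails this check because its fourth character is not "_".
    return trimmed.hasPrefix("ax_") && trimmed.count > "ax_".count
}
```

A failed check is best reported as a configuration error during development rather than shown to end users.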
Initiate Download
ModelStore implements a robust model downloader. On iOS and iPadOS, this model downloader persists progress across foreground-to-background and background-to-foreground app transitions. It even persists the download progress after the app is killed.
This downloader is designed to enable your application to set up Argmax in the background without blocking the user on a download spinner.
Forward background URLSession events from your UIApplicationDelegate so the SDK can drain them before iOS re-suspends the app:
func application(
    _ application: UIApplication,
    handleEventsForBackgroundURLSession identifier: String,
    completionHandler: @escaping () -> Void
) {
    modelStore.backgroundDownloader.handleEventsForBackgroundSession(
        identifier: identifier,
        completionHandler: completionHandler
    )
}
For SwiftUI apps without an AppDelegate, wire one in with @UIApplicationDelegateAdaptor.
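As a rough illustration of that wiring, the sketch below attaches a delegate class to a SwiftUI app using only SwiftUI and UIKit APIs. ExampleApp and the placeholder view are hypothetical, and the forwarding call is left as a comment so the snippet compiles without the SDK; in a real app the completion handler must reach the SDK as in the UIApplicationDelegate snippet above.

```swift
#if os(iOS)
import SwiftUI
import UIKit

final class AppDelegate: NSObject, UIApplicationDelegate {
    func application(
        _ application: UIApplication,
        handleEventsForBackgroundURLSession identifier: String,
        completionHandler: @escaping () -> Void
    ) {
        // Replace this comment with the forwarding call from the snippet above:
        // modelStore.backgroundDownloader.handleEventsForBackgroundSession(
        //     identifier: identifier, completionHandler: completionHandler)
        // `modelStore` is assumed to be an app-wide instance that you own.
    }
}

@main
struct ExampleApp: App {
    // SwiftUI creates the delegate and routes UIApplicationDelegate callbacks to it.
    @UIApplicationDelegateAdaptor(AppDelegate.self) private var appDelegate

    var body: some Scene {
        WindowGroup {
            Text("Setting up speech-to-text")
        }
    }
}
#endif
```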
import Network // for NWInterface.InterfaceType
let modelStore = ModelStore(config: config) // `config` from "Picking the Model"
do {
    let result = try await modelStore.downloadModelInBackground(
        name: "large-v3-v20240930_626MB",
        repo: RepoType.proRepo,
        disabledNetworkTypes: [.cellular] // optional: Wi-Fi only
    )
    switch result {
    case .started(let downloadId):
        print("Download started: \(downloadId)")
    case .resumed(let downloadId):
        print("Resuming existing download: \(downloadId)")
    case .alreadyInProgress(let downloadId):
        print("Download already running: \(downloadId)")
    case .waitingForNetwork(let downloadId):
        print("Queued, will resume when Wi-Fi is available: \(downloadId)")
    case .alreadyComplete(let modelPath):
        print("Model already downloaded at: \(modelPath)")
    }
} catch {
    print("Failed to start download: \(error)")
}
Query State
modelStore.getBackgroundDownloadState returns the following state:
public struct BackgroundDownloadState {
    let downloadId: String
    let modelVariant: String
    let repoId: String
    var files: [BackgroundFileDownload]
    var status: BackgroundDownloadStatus // .pending, .downloading, .paused, .completed, .failed
    let startedAt: Date
    var completedAt: Date?
    var overallProgress: Double // 0.0 to 1.0
}
Here is a simple way to query download state:
// Get all active downloads
let downloads = modelStore.activeBackgroundDownloads
// Get state for first active download
if let download = downloads.first {
    print("Progress: \(download.overallProgress)")
    print("Files completed: \(download.completedFileCount)/\(download.totalFileCount)")
}
For reactive updates, AsyncStream is the preferred API for new code. Each subscriber receives the current snapshot immediately, then every subsequent change:
Task {
    for await downloads in modelStore.backgroundDownloader.activeDownloadsUpdates {
        for download in downloads {
            let pct = Int(download.overallProgress * 100)
            print("\(download.modelVariant): \(pct)% status=\(download.status.rawValue)")
        }
    }
}
A Combine projection (modelStore.backgroundDownloadsPublisher) is also available for back-compat:
import Combine
var cancellables = Set<AnyCancellable>()
modelStore.backgroundDownloadsPublisher
    .receive(on: DispatchQueue.main)
    .sink { downloads in
        for download in downloads {
            print("\(download.modelVariant): \(Int(download.overallProgress * 100))%")
            print("Status: \(download.status)")
            // Individual file progress
            for file in download.files {
                print(" \(file.destinationURL.lastPathComponent): \(file.status)")
            }
        }
    }
    .store(in: &cancellables)
Pause and Resume
modelStore.pauseBackgroundDownload(downloadId)
do {
    try await modelStore.resumeBackgroundDownload(downloadId)
} catch {
    print("Failed to resume: \(error)")
}
Cancel
modelStore.cancelBackgroundDownload(downloadId, deleteProgress: false)
Network Interface Restrictions
disabledNetworkTypes accepts a list of NWInterface.InterfaceType values that the download is not allowed to use. The SDK auto-pauses when the active path uses a disabled interface and auto-resumes when an allowed one returns. The restriction is persisted with the download record and survives app relaunches.
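The auto-pause rule described above boils down to a membership check on the active interface. The sketch below models only that decision: LinkType is a hypothetical stand-in for NWInterface.InterfaceType so the snippet runs without Network.framework, and TransferAction and action(forActiveLink:disabled:) are illustrative names, not SDK API.

```swift
import Foundation

/// Hypothetical stand-in for NWInterface.InterfaceType.
enum LinkType: Hashable {
    case wifi, cellular, wiredEthernet, other
}

enum TransferAction: Equatable {
    case run, pause
}

/// Pause while the active interface is disabled; run otherwise.
func action(forActiveLink active: LinkType, disabled: Set<LinkType>) -> TransferAction {
    disabled.contains(active) ? .pause : .run
}

// A device moving from Wi-Fi to cellular and back, with cellular disabled.
let timeline: [LinkType] = [.wifi, .cellular, .wifi]
let actions = timeline.map { action(forActiveLink: $0, disabled: [.cellular]) }
```

With cellular disabled, the timeline above maps to run, pause, run, which mirrors the auto-pause and auto-resume behavior described in the text.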
Change a restriction at any time without disturbing the in-flight transfer:
// Lift the restriction (e.g. user tapped "Allow on cellular"):
modelStore.setDisabledBackgroundDownloadNetworkTypes(nil, for: downloadId)
// Add a restriction mid-flight:
modelStore.setDisabledBackgroundDownloadNetworkTypes([.cellular], for: downloadId)
Crash Recovery
Downloads persisted as in-flight but with no live URLSession tasks on relaunch (the signature of an abnormal exit such as a crash, force-quit, or device reboot) auto-resume after ModelStore initializes. No manual user action is required.
Downloads explicitly paused by the user stay paused across relaunches; only crash-paused downloads auto-resume.
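The two rules above can be read as a single predicate over the persisted record. The sketch below is a hypothetical model of that decision, written to make the distinction between crash-paused and user-paused downloads explicit; PersistedStatus and shouldAutoResume are illustrative names, and the SDK's actual bookkeeping is not documented on this page.

```swift
import Foundation

/// Hypothetical view of what was persisted before the app exited.
enum PersistedStatus {
    case downloading   // in flight when the app exited
    case pausedByUser  // explicitly paused through the API
    case completed
}

/// Auto-resume only when a download was persisted as in-flight but no live
/// URLSession tasks exist for it after relaunch (the abnormal-exit signature).
func shouldAutoResume(persistedStatus: PersistedStatus, hasLiveSessionTasks: Bool) -> Bool {
    persistedStatus == .downloading && !hasLiveSessionTasks
}
```

A download that still has live session tasks needs no resume because the system kept transferring it, and a user-paused download is left alone.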
Loading the Model
After downloading, call loadModels() to load the MelSpectrogram, AudioEncoder, and TextDecoder models from the model folder into memory. The first load after a download may take 15–90 seconds because CoreML compiles the models on-device. Subsequent loads are near-instant thanks to the OS-level compiled model cache.
let whisperKitPro = try await WhisperKitPro(config) // same config used during "Downloading the Model"
// loadModels() is called automatically during WhisperKitPro initialization.
// You can also call it explicitly if you initialized with `load: false`:
try await whisperKitPro.loadModels()
loadModels() supports a prewarmMode parameter. When prewarmMode: true, models are loaded but not fully initialized, allowing you to defer the final initialization to a later point.
Once the model is compiled during first use, Apple caches the compiled model in an OS-level cache (not accessible by Argmax). This enables subsequent model loads to be near-instant due to compiled model cache hits. However, Apple evicts this cache after each OS update or after extended periods of non-use (~14 days). Consider eagerly initializing WhisperKitPro before the user engages to reduce the chances of a compiled model cache miss leading to the user being blocked during recompilation.
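Because the compiled model cache cannot be inspected, an application can only guess when a recompilation is likely. The sketch below encodes the two eviction triggers mentioned above (an OS update, or roughly 14 days without use) as a heuristic for deciding when to initialize eagerly. compiledCacheLikelyCold and its parameters are hypothetical, and the application is assumed to record the last load time and OS version itself, for example in UserDefaults.

```swift
import Foundation

/// Heuristic only: guesses whether the OS-level compiled model cache was evicted.
/// The 14-day default mirrors the approximate figure quoted above.
func compiledCacheLikelyCold(
    lastLoad: Date?,
    osVersionAtLastLoad: String?,
    currentOSVersion: String,
    now: Date = Date(),
    idleLimitDays: Double = 14
) -> Bool {
    // No record of a previous load: assume the model was never compiled.
    guard let lastLoad = lastLoad, let previousOS = osVersionAtLastLoad else {
        return true
    }
    // An OS update evicts the cache.
    if previousOS != currentOSVersion {
        return true
    }
    // Extended non-use may evict the cache as well.
    let idleSeconds = now.timeIntervalSince(lastLoad)
    return idleSeconds >= idleLimitDays * 24 * 60 * 60
}
```

The current OS version can be read from ProcessInfo.processInfo.operatingSystemVersionString. When the heuristic returns true, starting WhisperKitPro initialization in the background before the user needs transcription reduces the chance of a visible wait.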
Bring Your Own Model
argmax-sdk-swift-2.1.2 and newer support downloading model files directly from a custom HTTPS URL, such as S3 presigned URLs, GCP signed URLs, Azure SAS URLs, or your own CDN.
The model folder must be archived as an Apple Archive (.aar). The archive should preserve the directory structure of the corresponding ready-made Argmax model hosted on Hugging Face.
For example, the .aar file hosted at https://models.example.com/whisperkit-coreml/openai_whisper-tiny.aar should match this directory structure:
let archiveURL = URL(string:
    "https://models.example.com/whisperkit-coreml/openai_whisper-tiny.aar?X-Amz-Signature=..."
)!
// One-step: download + extract + register with the model cache.
let modelFolder = try await modelStore.downloadAndExtractInBackground(
    remoteURL: archiveURL,
    destinationRoot: URL.documentsDirectory.appendingPathComponent("Models"),
    disabledNetworkTypes: [.cellular] // optional: Wi-Fi only
)
let config = WhisperKitProConfig(modelFolder: modelFolder.path)
let whisperKit = try await WhisperKitPro(config)
- The download request does not carry an Authorization header. URLs that carry their signature in the query string work without further configuration.
- Repeated calls with a rotated pre-signed URL for the same object return .alreadyComplete instead of re-downloading. Override with downloadKey:/namespace: if you need full control.
- If download or extraction fails for the .aar file, partial output is cleaned up so the next attempt starts from a known-empty folder.
- The .aar file is removed after successful extraction.
- The final extracted local folder is registered with the download cache so that loading models does not attempt to redownload from the configured Hugging Face repo.
- For full control (e.g. pause/resume, mid-flight network-type changes, progress observation), switch to modelStore.backgroundDownloader.startDownload(remoteURL:...) and follow the patterns covered in Network Interface Restrictions.
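To see why a rotated pre-signed URL can still be recognized as the same object, consider a key derived from the URL with its query string removed. The sketch below shows one plausible derivation and is not the SDK's documented behavior: stableDownloadKey is a hypothetical helper, and the signature values in the usage lines are placeholders.

```swift
import Foundation

/// Hypothetical helper: derives a key that ignores the signature carried in
/// the query string. Returns nil for non-HTTPS URLs, since this page only
/// covers custom HTTPS URLs.
func stableDownloadKey(for url: URL) -> String? {
    guard var components = URLComponents(url: url, resolvingAgainstBaseURL: false),
          components.scheme?.lowercased() == "https" else {
        return nil
    }
    components.query = nil     // drop the rotating signature parameters
    components.fragment = nil
    return components.string
}

// Two pre-signed URLs for the same object, differing only in their signature.
let first = URL(string: "https://models.example.com/whisperkit-coreml/openai_whisper-tiny.aar?X-Amz-Signature=aaa")!
let second = URL(string: "https://models.example.com/whisperkit-coreml/openai_whisper-tiny.aar?X-Amz-Signature=bbb")!
let sameObject = stableDownloadKey(for: first) == stableDownloadKey(for: second)
```

If different objects ever share a path, for example when versions are distinguished only by a query parameter, a key like this would conflate them, which is the kind of case where an explicit downloadKey: or namespace: override is useful.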