Diarized Transcription

Assign a speaker to each word

The Argmax Pro SDK provides utility functions that merge speaker diarization results from SpeakerKitPro with transcription results from WhisperKitPro, or from any other transcription engine whose output conforms to the TranscriptionResultPro protocol.

Review Speaker Diarization and File Transcription first, then combine their results as follows:

// Word timestamps are required so that each word can be matched to a speaker
let decodingOptions = DecodingOptions(wordTimestamps: true, chunkingStrategy: .vad)
 
// Produce transcription with word timestamps
let transcribeResult = try await whisperKitPro.transcribe(audioArray: audioArray, decodeOptions: decodingOptions)
let mergedResult = WhisperKitProUtils.mergeTranscriptionResults(transcribeResult)
 
// Produce speaker diarization
let diarizationResult = try await speakerKitPro.diarize()
 
// Merge transcription and diarization to add speakers to words
let updatedSegmentsArray = diarizationResult.addSpeakerInfo(to: [mergedResult])
 
for segments in updatedSegmentsArray {
    for segment in segments {
        print(segment)
    }
}
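
To build intuition for what "adding speakers to words" involves, the sketch below assigns each word the speaker whose turn overlaps it the longest. This is only an illustration under stated assumptions: the types `Word`, `SpeakerTurn`, and `SpeakerWord` and the functions `overlap` and `assignSpeakers` are hypothetical stand-ins, not part of the Argmax Pro SDK, and the SDK's actual merging strategy may differ.

```swift
// Hypothetical stand-in types for illustration; NOT the Argmax Pro SDK types.
struct Word {
    let text: String
    let start: Double  // seconds
    let end: Double    // seconds
}

struct SpeakerTurn {
    let speakerId: Int
    let start: Double  // seconds
    let end: Double    // seconds
}

struct SpeakerWord {
    let word: Word
    let speakerId: Int?  // nil when no speaker turn overlaps the word
}

/// Length in seconds of the intersection of two intervals (0 if they are disjoint).
func overlap(_ aStart: Double, _ aEnd: Double, _ bStart: Double, _ bEnd: Double) -> Double {
    return max(0, min(aEnd, bEnd) - max(aStart, bStart))
}

/// Assigns each word the speaker whose turn overlaps it the longest.
/// Assumed strategy for this sketch only; a real implementation may use other rules.
func assignSpeakers(words: [Word], turns: [SpeakerTurn]) -> [SpeakerWord] {
    return words.map { word -> SpeakerWord in
        var bestSpeaker: Int? = nil
        var bestOverlap = 0.0
        for turn in turns {
            let shared = overlap(word.start, word.end, turn.start, turn.end)
            if shared > bestOverlap {
                bestOverlap = shared
                bestSpeaker = turn.speakerId
            }
        }
        return SpeakerWord(word: word, speakerId: bestSpeaker)
    }
}

// Made-up example data: three words and two speaker turns.
let words = [
    Word(text: "Hello", start: 0.0, end: 0.4),
    Word(text: "there", start: 0.5, end: 0.9),
    Word(text: "Hi", start: 1.1, end: 1.3),
]
let turns = [
    SpeakerTurn(speakerId: 0, start: 0.0, end: 1.0),
    SpeakerTurn(speakerId: 1, start: 1.0, end: 2.0),
]

let labeled = assignSpeakers(words: words, turns: turns)
for item in labeled {
    let speaker = item.speakerId.map { "Speaker \($0)" } ?? "Unknown"
    print("\(speaker): \(item.word.text)")
}
```

With this sample data, the first two words fall inside the first turn and the third inside the second. The sketch also shows why word-level timestamps matter: without a start and end time per word, there is nothing to compare against the speaker turns.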