Diarized Transcription
Assign a speaker to each word
Argmax Pro SDK offers utility functions that merge speaker diarization results from SpeakerKitPro with transcription results from WhisperKitPro, or from any other transcription engine whose output conforms to the TranscriptionResultPro protocol.
Please review Speaker Diarization and File Transcription first. Then, you may combine the results as follows:
// Word timestamps are required to assign a speaker to each word
let decodingOptions = DecodingOptions(wordTimestamps: true, chunkingStrategy: .vad)
// Produce transcription with word timestamps
let transcribeResult = try await whisperKitPro.transcribe(audioArray: audioArray, decodeOptions: decodingOptions)
let mergedResult = WhisperKitProUtils.mergeTranscriptionResults(transcribeResult)
// Produce speaker diarization
let diarizationResult = try await speakerKitPro.diarize()
// Merge transcription and diarization to add speakers to words
let updatedSegmentsArray = diarizationResult.addSpeakerInfo(to: [mergedResult])
for segments in updatedSegmentsArray {
    for segment in segments {
        print(segment)
    }
}
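
To build intuition for why word timestamps are needed, here is a minimal, self-contained sketch of one common way to assign speakers to words: pick the speaker turn with the largest time overlap for each word. This is an illustration only and is not the Argmax Pro SDK's implementation. The `Word` and `SpeakerTurn` types and the `speaker(for:in:)` function are hypothetical names introduced for this example.

```swift
import Foundation

// Hypothetical types for illustration; these are NOT Argmax Pro SDK types.
struct Word {
    let text: String
    let start: Double  // seconds
    let end: Double    // seconds
}

struct SpeakerTurn {
    let speaker: Int
    let start: Double  // seconds
    let end: Double    // seconds
}

/// Returns the speaker whose turn overlaps the word the most,
/// or nil when no turn overlaps the word at all.
func speaker(for word: Word, in turns: [SpeakerTurn]) -> Int? {
    var best: (speaker: Int, overlap: Double)? = nil
    for turn in turns {
        // Length of the intersection of the two time intervals.
        let overlap = min(word.end, turn.end) - max(word.start, turn.start)
        if overlap > 0, overlap > (best?.overlap ?? 0) {
            best = (turn.speaker, overlap)
        }
    }
    return best?.speaker
}

// Made-up sample data: two speaker turns and three timestamped words.
let turns = [
    SpeakerTurn(speaker: 0, start: 0.0, end: 2.0),
    SpeakerTurn(speaker: 1, start: 2.0, end: 4.0),
]
let words = [
    Word(text: "Hello", start: 0.2, end: 0.8),
    Word(text: "there", start: 1.8, end: 2.6),  // straddles the turn boundary
    Word(text: "hi", start: 3.0, end: 3.3),
]

for word in words {
    let label = speaker(for: word, in: turns).map { "speaker \($0)" } ?? "unknown"
    print(word.text, label)
}
```

A word that straddles a turn boundary, such as "there" above, goes to the speaker it overlaps longest. Without word-level timestamps, only segment-level boundaries would be available, and a segment that spans a speaker change could not be split between speakers.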