Diarization Configurations

An in-depth explanation of every field in the complete SDK can be found here. The DiarizationConfig sub-section details the options that can be set when sending a diarization request.

This page discusses the more common combinations of options sent to the server.

Fields

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| model_id | Yes | | ID of the diarization model to use on the server. It can be obtained by first listing the models available on the server via ListModels(). |
| num_speakers | Yes | | The number of speakers expected in the audio. Specifying the correct number of speakers improves the accuracy of the speaker labels. If the number of speakers is unknown, set this to 0. |
| audio_encoding | Yes | | Encoding of the audio data sent or streamed through the DiarizationAudio messages. For encodings with headers, such as WAV and FLAC, the header is expected once at the beginning of the stream, not in every DiarizationAudio message. |
| sample_rate | Yes | | Sampling rate of the audio to process. |
| cubic_model_id | Only if transcription is required | "" | Unique identifier of the Cubic model to use for speech recognition. If this value is specified, transcription results from the Cubic model with the given ID are returned alongside the speaker labels. If it is omitted or blank, the results do not include transcripts, even if the Cubic server was included in the deployed image. |
| enable_raw_transcript | No | False | If true, the raw (unformatted) transcript is included in the results. This only has an effect if Cubicsvr is also set up with Juzusvr. |
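
As a rough illustration of how the fields described above combine, the following Python sketch builds two common configurations: diarization only with an unknown number of speakers, and diarization with transcription. The class `DiarizationConfigSketch`, its `validate` method, the `wants_transcripts` property, and the helper `chunk_audio` are hypothetical stand-ins written for this page and are not part of the SDK. The model IDs, the encoding string "WAV", and the 16000 sampling rate are placeholder values; the real IDs would come from ListModels().

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class DiarizationConfigSketch:
    """Hypothetical mirror of the documented fields (NOT the SDK class)."""

    model_id: str
    num_speakers: int                    # 0 means the speaker count is unknown
    audio_encoding: str                  # placeholder string; real values are SDK-defined
    sample_rate: int                     # assumed here to be in Hz
    cubic_model_id: str = ""             # blank: results carry no transcripts
    enable_raw_transcript: bool = False  # only meaningful when transcription is set up

    @property
    def wants_transcripts(self) -> bool:
        # Transcripts are requested only when a Cubic model ID is given.
        return bool(self.cubic_model_id)

    def validate(self) -> None:
        # Check the four required fields before sending anything.
        if not self.model_id:
            raise ValueError("model_id is required")
        if self.num_speakers < 0:
            raise ValueError("num_speakers must be 0 (unknown) or a positive count")
        if not self.audio_encoding:
            raise ValueError("audio_encoding is required")
        if self.sample_rate <= 0:
            raise ValueError("sample_rate must be positive")


def chunk_audio(data: bytes, chunk_size: int) -> Iterator[bytes]:
    """Split encoded audio into consecutive chunks.

    Because the bytes are sliced in order, a WAV or FLAC header appears
    only in the first chunk, never repeated in later ones.
    """
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


# Diarization only, number of speakers unknown.
diarization_only = DiarizationConfigSketch(
    model_id="1", num_speakers=0, audio_encoding="WAV", sample_rate=16000
)

# Diarization plus transcription from a Cubic model, two expected speakers.
with_transcripts = DiarizationConfigSketch(
    model_id="1",
    num_speakers=2,
    audio_encoding="WAV",
    sample_rate=16000,
    cubic_model_id="1",
    enable_raw_transcript=True,
)

for config in (diarization_only, with_transcripts):
    config.validate()
```

In a real client, the validated values would be copied into the SDK's DiarizationConfig message, and each chunk produced by a helper like `chunk_audio` would be wrapped in a DiarizationAudio message; how those messages are constructed depends on the SDK language binding and is not shown here.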