Before we can run text-to-speech, the client must specify the configuration Luna should use to generate the audio. The options are the voice model, the audio encoding, and (for streaming) the buffer size in samples. Except where noted, these options apply to both batch and streaming synthesis.
Voice models are defined in the Luna server config file, and the client can get a list of available voice models using the ListVoices method.
// Get the list of voices from Luna
response, err := client.ListVoices(context.Background())
if err != nil {
	log.Fatalf("failed to list voices: %v", err)
}

// Display the list
fmt.Printf("Available voices:\n")
for _, voice := range response.Voices {
	fmt.Printf(" %+v\n", voice)
}
# Get the list of voices from Luna
response = client.ListVoices()

# Display the list
print("Available voices:")
for v in response.voices:
    print(" ID: {} Name: {} Sample Rate(Hz): {} Language: {}".format(
        v.id, v.name, v.sample_rate, v.language))
// Get the list of voices from Luna
std::vector<LunaVoice> voices = client.listVoices();

// Display the list
std::cout << "Available voices:" << std::endl;
for (const LunaVoice &v : voices)
{
    std::cout << " ID: " << v.id()
              << " Name: " << v.name()
              << " Sample Rate(Hz): " << v.sampleRate()
              << " Language: " << v.language() << "\n";
}
The API requires the voice ID to be specified as part of the TTS synthesis request. The voice ID for each model is defined in the server’s config file.
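Because the voice ID has to accompany every request, a client may prefer to look it up from the ListVoices result instead of hard-coding it. The following is a minimal sketch of that idea, not part of the Luna SDK: `Voice` is a stand-in dataclass with the same four fields printed in the Python example above, and `pick_voice_id` and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Voice:
    """Stand-in for a voice entry returned by ListVoices (assumed fields)."""
    id: str
    name: str
    sample_rate: int
    language: str


def pick_voice_id(voices: Iterable[Voice], language: str) -> Optional[str]:
    """Return the ID of the first voice matching `language`, or None."""
    for v in voices:
        if v.language.lower() == language.lower():
            return v.id
    return None


# Hypothetical data; a real client would pass response.voices instead.
voices = [
    Voice(id="1", name="Voice A", sample_rate=22050, language="en_US"),
    Voice(id="2", name="Voice B", sample_rate=16000, language="es_ES"),
]
voice_id = pick_voice_id(voices, "es_ES")
```

The returned ID is the value that would then be placed in the synthesis config's voice ID field.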
The API requires that the audio encoding also be specified as part of the TTS synthesis request. The supported encodings are listed in the Luna API Reference.
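Raw encodings carry no header, so the client normally has to supply the format details itself before the audio can be played or saved. The sketch below assumes that `RAW_LINEAR16` means headerless 16-bit signed little-endian mono PCM (an assumption made here; the Luna API Reference is the authority) and wraps such bytes in a WAV container using Python's standard `wave` module. The function name `pcm16_to_wav` and the 16000 Hz rate are illustrative only.

```python
import io
import struct
import wave


def pcm16_to_wav(pcm_bytes: bytes, sample_rate: int) -> bytes:
    """Wrap raw 16-bit mono PCM bytes in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)            # assumed mono
        w.setsampwidth(2)            # 2 bytes per sample = 16-bit
        w.setframerate(sample_rate)  # should match the selected voice
        w.writeframes(pcm_bytes)
    return buf.getvalue()


# Four samples of placeholder audio, packed as little-endian int16.
pcm = struct.pack("<4h", 0, 1000, -1000, 0)
wav_bytes = pcm16_to_wav(pcm, sample_rate=16000)
```

In a real client, the sample rate passed here would come from the chosen voice's entry in the ListVoices response.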
For streaming synthesis, a client may optionally specify a buffer size, given as a number of samples. If it is set to a non-zero value, the server waits until the buffer is full before sending audio data to the client. If the entire generated audio is smaller than the buffer, the samples are returned once synthesis is complete. If the buffer size is not set, audio data is returned as soon as it becomes available on the server.
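The buffering rule can be illustrated with a small simulation. This is not Luna's server code: `buffered_stream` is a hypothetical generator that models the behaviour described above, under the added assumption that the server sends exactly `n_samples` samples per message and flushes any remainder when synthesis finishes.

```python
from typing import Iterable, Iterator, List


def buffered_stream(chunks: Iterable[List[int]],
                    n_samples: int = 0) -> Iterator[List[int]]:
    """Yield audio messages according to the buffering rule.

    chunks: sample lists in the order synthesis produces them.
    n_samples: buffer size; 0 (unset) forwards each chunk immediately.
    """
    if n_samples <= 0:
        for chunk in chunks:
            if chunk:
                yield list(chunk)
        return

    pending: List[int] = []
    for chunk in chunks:
        pending.extend(chunk)
        # Emit every full buffer that has accumulated so far.
        while len(pending) >= n_samples:
            yield pending[:n_samples]
            pending = pending[n_samples:]
    # Synthesis is complete: flush whatever is left, even if short.
    if pending:
        yield pending


produced = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]  # 9 samples in 3 chunks
unbuffered = list(buffered_stream(produced))
buffered = list(buffered_stream(produced, n_samples=4))
short = list(buffered_stream([[1, 2]], n_samples=4))
```

With no buffer size, the three chunks pass through unchanged. With a buffer of 4, the same nine samples arrive as messages of 4, 4, and 1 samples, and audio shorter than the buffer arrives as a single message at the end.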
Below is an example showing how client code should set up the synthesis config.
// Set up the synthesis config with the desired voice and encoding
synthConfig := &lunapb.SynthesizerConfig{
	VoiceId:  "1", // As defined in the Luna server config file
	Encoding: lunapb.SynthesizerConfig_RAW_LINEAR16,

	// If setting the buffer size for streaming, include this param
	// with the application's desired buffer size.
	NSamples: 8096,
}
# Set up the synthesis config with the desired voice and encoding
synth_config = SynthesizerConfig(
    voice_id="1",  # As defined in the Luna server config file
    encoding=SynthesizerConfig.RAW_LINEAR16,
    # If setting the buffer size for streaming, include this arg with
    # the application's desired buffer size.
    n_samples=8096)
// Set up the synthesis config with the desired voice and encoding
cobaltspeech::luna::SynthesizerConfig synthConfig;
synthConfig.set_voice_id("1"); // As defined in the Luna server config file
synthConfig.set_encoding(cobaltspeech::luna::SynthesizerConfig::RAW_LINEAR16);
// If setting the buffer size for streaming, include this line with the
// application's desired buffer size.
synthConfig.set_n_samples(8096);
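When choosing a buffer size, it can help to translate samples into time: a full buffer of `n_samples` samples holds `n_samples / sample_rate` seconds of audio. The short sketch below does that arithmetic for the 8096-sample buffer used in the examples. The helper name and the sample rates are illustrative and not taken from the Luna documentation; the actual rate depends on the selected voice.

```python
def buffer_duration_ms(n_samples: int, sample_rate_hz: int) -> float:
    """Milliseconds of audio contained in one full buffer."""
    return 1000.0 * n_samples / sample_rate_hz


# Hypothetical sample rates; use the rate reported for the chosen voice.
for rate in (8000, 16000, 22050):
    print(f"{rate} Hz: {buffer_duration_ms(8096, rate):.0f} ms per buffer")
```

Larger buffers mean fewer, bigger messages, at the cost of waiting longer for each one to fill.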