The Diatheke API is defined using gRPC and protocol buffers. This section of the documentation is auto-generated from the protobuf file. It describes the data types and functions defined in the spec. The “messages” below correspond to the data structures to be used, and the “service” contains the methods that can be called.
Service that implements the Cobalt Diatheke Dialog Management API.
Method Name | Request Type | Response Type | Description |
---|---|---|---|
Version | Empty | VersionResponse | Returns version information from the server. |
ListModels | Empty | ListModelsResponse | Returns information about the Diatheke models the server can access. |
CreateSession | SessionStart | SessionOutput | Create a new Diatheke session. Also returns a list of actions to take next. |
DeleteSession | TokenData | Empty | Delete the session. Behavior is undefined if the given TokenData is used again after this function is called. |
UpdateSession | SessionInput | SessionOutput | Process input for a session and get an updated session with a list of actions to take next. This is the only method that modifies the Diatheke session state. |
StreamASR | ASRInput | ASRResult | Create an ASR stream. A result is returned when the stream is closed by the client (which forces the ASR to endpoint), or when a transcript becomes available on its own, in which case the stream is closed by the server. The ASR result may be used in the UpdateSession method. |
StreamTTS | ReplyAction | TTSAudio | Create a TTS stream to receive audio for the given reply. The stream will close when TTS is finished. The client may also close the stream early to cancel the speech synthesis. |
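The session methods above imply a simple lifecycle: create a session, pass each returned token into the next UpdateSession call, and delete the session at the end. The sketch below illustrates that flow with a hypothetical in-memory `FakeDiathekeClient` standing in for a real generated gRPC stub. The class names, the simplified method signatures, and the use of plain strings for actions are all assumptions made for illustration, not part of the Diatheke SDK.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TokenData:
    data: bytes = b""
    id: str = ""
    metadata: str = ""


@dataclass
class SessionOutput:
    token: TokenData
    # Simplified for the sketch: action names instead of ActionData messages.
    action_list: List[str] = field(default_factory=list)


class FakeDiathekeClient:
    """Hypothetical stand-in for a generated gRPC stub (not the real SDK)."""

    def __init__(self):
        self.calls = []

    def CreateSession(self, model_id: str) -> SessionOutput:
        self.calls.append("CreateSession")
        return SessionOutput(TokenData(data=b"0", id="s1"), ["reply", "input"])

    def UpdateSession(self, token: TokenData, text: str) -> SessionOutput:
        self.calls.append("UpdateSession")
        step = int(token.data.decode()) + 1
        # A fresh token is returned with every update.
        new_token = TokenData(data=str(step).encode(), id=token.id)
        return SessionOutput(new_token, ["reply", "input"])

    def DeleteSession(self, token: TokenData) -> None:
        self.calls.append("DeleteSession")


def run_session(client, model_id: str, user_inputs: List[str]) -> TokenData:
    """Create a session, feed it each input in order, then delete it."""
    out = client.CreateSession(model_id)
    token = out.token
    try:
        for text in user_inputs:
            out = client.UpdateSession(token, text)
            token = out.token  # always carry the newest token forward
    finally:
        client.DeleteSession(token)  # the token must not be used after this
    return token
```

With a real client, the same shape would apply, but each entry in `action_list` would be an ActionData message to act on before the next update.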
Data to send to the ASR stream. The first message on the stream must contain the session token; the messages that follow carry the audio data.
Field | Type | Label | Description |
---|---|---|---|
token | TokenData | | Session data, used to determine the correct Cubic model to use for ASR, along with other contextual information. |
audio | bytes | | Audio data to transcribe. |
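The ordering rule for the ASR stream (token first, audio afterwards) maps naturally onto a generator. The following is a minimal sketch, assuming a hypothetical `ASRInput` dataclass that mirrors the message above, with the token simplified to raw bytes; the chunk size is an arbitrary illustrative value.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class ASRInput:
    # Hypothetical mirror of the ASRInput message: each message carries
    # either the token or a chunk of audio.
    token: Optional[bytes] = None
    audio: Optional[bytes] = None


def asr_requests(token: bytes, pcm: bytes, chunk_size: int = 4096) -> Iterator[ASRInput]:
    """Yield the session token first, then the audio in fixed-size chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    yield ASRInput(token=token)
    for start in range(0, len(pcm), chunk_size):
        yield ASRInput(audio=pcm[start:start + chunk_size])
```

A gRPC client-streaming call typically accepts an iterator of request messages, so a generator of this shape could be passed to the stub directly once the real message types are substituted.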
The result from the ASR stream, sent after the ASR engine has endpointed or the stream was closed by the client.
Field | Type | Label | Description |
---|---|---|---|
text | string | | The transcription. |
confidence | double | | Confidence estimate between 0 and 1. A higher number represents a higher likelihood of the output being correct. |
timedOut | bool | | True if a timeout was defined for the session’s current input state in the Diatheke model, and the timeout expired before getting a transcription. This timeout refers to the amount of time a user has to verbally respond to Diatheke after the ASR stream has been created, and should not be confused with a network connection timeout. |
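A client may want to inspect these fields before deciding what to do with a result. The sketch below labels a result using a hypothetical `ASRResult` dataclass and an assumed confidence threshold of 0.5. Both the threshold and the labels are illustrative client-side policy, not behavior defined by Diatheke, and what the client does with each label is left open.

```python
from dataclasses import dataclass


@dataclass
class ASRResult:
    # Hypothetical mirror of the ASRResult message.
    text: str = ""
    confidence: float = 0.0
    timedOut: bool = False


def classify_asr_result(result: ASRResult, min_confidence: float = 0.5) -> str:
    """Label a result as 'timed_out', 'unusable', or 'ok'."""
    if result.timedOut:
        # The user did not respond within the model-defined input timeout.
        return "timed_out"
    if not result.text.strip() or result.confidence < min_confidence:
        # Empty transcript, or confidence below the assumed threshold.
        return "unusable"
    return "ok"
```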
Specifies an action that the client application should take.
Field | Type | Label | Description |
---|---|---|---|
input | WaitForUserAction | | The user must provide input to Diatheke. |
command | CommandAction | | The client app must execute the specified command. |
reply | ReplyAction | | The client app should provide the reply to the user. |
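Because each action carries one of three kinds of payload, client code usually branches on whichever is present. The following is a minimal sketch, assuming a hypothetical `ActionData` dataclass in which unset fields are `None` and payloads are plain dictionaries; a real protobuf message would be inspected differently.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ActionData:
    # Hypothetical mirror of the ActionData message; unset fields are None.
    input: Optional[dict] = None
    command: Optional[dict] = None
    reply: Optional[dict] = None


def dispatch(
    action: ActionData,
    on_input: Callable[[dict], Any],
    on_command: Callable[[dict], Any],
    on_reply: Callable[[dict], Any],
) -> Any:
    """Call the handler that matches the payload set on the action."""
    if action.input is not None:
        return on_input(action.input)
    if action.command is not None:
        return on_command(action.command)
    if action.reply is not None:
        return on_reply(action.reply)
    raise ValueError("ActionData has no action set")
```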
This action indicates that the client application should execute a command.
Field | Type | Label | Description |
---|---|---|---|
id | string | | The ID of the command to execute, as defined in the Diatheke model. |
input_parameters | CommandAction.InputParametersEntry | repeated | |
Map entry for the input_parameters field of CommandAction. Each entry is one key/value pair.
Field | Type | Label | Description |
---|---|---|---|
key | string | | |
value | string | | |
The result of executing a command.
Field | Type | Label | Description |
---|---|---|---|
id | string | | The command ID, as given by the CommandAction. |
out_parameters | CommandResult.OutParametersEntry | repeated | Output from the command expected by the Diatheke model. For example, this could be the result of a data query. |
error | string | | If there was an error during execution, indicate it here with a brief message that will be logged by Diatheke. |
Map entry for the out_parameters field of CommandResult. Each entry is one key/value pair.
Field | Type | Label | Description |
---|---|---|---|
key | string | | |
value | string | | |
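A CommandAction and its CommandResult form a request/response pair: the result echoes the command ID and carries either output parameters or an error message. The sketch below shows one way a client might run a command and build the result, assuming hypothetical `CommandAction` and `CommandResult` dataclasses with map fields represented as dictionaries, and a hypothetical `handlers` registry keyed by command ID.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class CommandAction:
    id: str
    input_parameters: Dict[str, str] = field(default_factory=dict)


@dataclass
class CommandResult:
    id: str
    out_parameters: Dict[str, str] = field(default_factory=dict)
    error: str = ""


Handler = Callable[[Dict[str, str]], Dict[str, str]]


def execute_command(action: CommandAction, handlers: Dict[str, Handler]) -> CommandResult:
    """Run the handler registered for action.id and wrap its outcome."""
    handler = handlers.get(action.id)
    if handler is None:
        return CommandResult(id=action.id, error=f"unknown command: {action.id}")
    try:
        out = handler(dict(action.input_parameters))
    except Exception as exc:
        # Report the failure as a brief message instead of raising.
        return CommandResult(id=action.id, error=str(exc))
    return CommandResult(id=action.id, out_parameters=out)
```

The returned result would then be sent back in the cmd field of a SessionInput.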
This message is empty and has no fields.
A list of models available on the Diatheke server.
Field | Type | Label | Description |
---|---|---|---|
models | ModelInfo | repeated | |
Information about a single Diatheke model.
Field | Type | Label | Description |
---|---|---|---|
id | string | | Diatheke model ID, which is used to create a new session. |
name | string | | Pretty model name, which may be used for display purposes. |
language | string | | Language code of the model. |
asr_sample_rate | uint32 | | The ASR audio sample rate, if ASR is enabled. |
tts_sample_rate | uint32 | | The TTS audio sample rate, if TTS is enabled. |
This action indicates that the client application should give the provided text to the user. This action may also be used to synthesize speech with the StreamTTS method.
Field | Type | Label | Description |
---|---|---|---|
text | string | | Text of the reply. |
luna_model | string | | TTS model to use with the StreamTTS method. |
Used by Diatheke to update the session state.
Field | Type | Label | Description |
---|---|---|---|
token | TokenData | | The session token. |
text | TextInput | | Process the user supplied text. |
asr | ASRResult | | Process an ASR result. |
cmd | CommandResult | | Process the result of a completed command. |
story | SetStory | | Change the current session state. |
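The four payload fields read as alternatives: each update processes text, an ASR result, a command result, or a story change. The sketch below enforces that reading with a hypothetical `SessionInput` dataclass and a `make_session_input` helper. The "exactly one payload" rule is an assumption drawn from the field descriptions, and the token and payloads are simplified to bytes and plain Python values.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SessionInput:
    # Hypothetical mirror of the SessionInput message.
    token: bytes
    text: Optional[str] = None
    asr: Optional[dict] = None
    cmd: Optional[dict] = None
    story: Optional[dict] = None


_PAYLOAD_FIELDS = ("text", "asr", "cmd", "story")


def make_session_input(token: bytes, **payload: Any) -> SessionInput:
    """Build a SessionInput carrying exactly one payload field."""
    if len(payload) != 1 or next(iter(payload)) not in _PAYLOAD_FIELDS:
        raise ValueError("provide exactly one of: " + ", ".join(_PAYLOAD_FIELDS))
    return SessionInput(token=token, **payload)
```

For example, `make_session_input(token, text="hello")` builds a text update, while passing both `text` and `story` raises a `ValueError`.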
The result of updating a session.
Field | Type | Label | Description |
---|---|---|---|
token | TokenData | | The updated session token. |
action_list | ActionData | repeated | The list of actions the client should take next, using the session token returned with this result. |
Used to create a new session.
Field | Type | Label | Description |
---|---|---|---|
model_id | string | | Specifies the Diatheke model ID to use for the session. |
Changes the current state of a Diatheke session to run at the specified story.
Field | Type | Label | Description |
---|---|---|---|
story_id | string | | The ID of the story to run, as defined in the Diatheke model. |
parameters | SetStory.ParametersEntry | repeated | A list of parameters to set before running the given story. This will replace any parameters currently defined in the session. |
Map entry for the parameters field of SetStory. Each entry is one key/value pair.
Field | Type | Label | Description |
---|---|---|---|
key | string | | |
value | string | | |
Contains synthesized speech audio. The specific encoding is defined in the server config file.
Field | Type | Label | Description |
---|---|---|---|
audio | bytes | | |
User supplied text to send to Diatheke for processing.
Field | Type | Label | Description |
---|---|---|---|
text | string | | |
A token that represents a single Diatheke session and its current state.
Field | Type | Label | Description |
---|---|---|---|
data | bytes | | |
id | string | | Session ID, useful for correlating logging between a client and the server. |
metadata | string | | Additional data supplied by the client app, which will be logged with other session info by the server. |
Lists the version of Diatheke and the engines it uses.
Field | Type | Label | Description |
---|---|---|---|
diatheke | string | | Dialog management engine |
chosun | string | | NLU engine |
cubic | string | | ASR engine |
luna | string | | TTS engine |
This action indicates that Diatheke is expecting user input.
Field | Type | Label | Description |
---|---|---|---|
requires_wake_word | bool | | True if the next user input must begin with a wake-word. |
immediate | bool | | True if the input is required immediately (i.e., in response to a question Diatheke asked the user). When false, the client should be allowed to wait indefinitely for the user to provide input. |
See the protocol buffer documentation for details on these types.
.proto Type | Notes |
---|---|
Duration | A signed, fixed-length span of time, expressed as a count of seconds and fractions of seconds at nanosecond resolution |
Empty | Used to indicate a method takes or returns nothing |