Diatheke is Cobalt’s dialogue management engine. A Diatheke model, loaded when the engine is initialized, determines how the engine manages dialogues.
A Diatheke model describes the desired flow of a conversation between a human user and a computer. It lists different ways a human may say something and specifies how the system should respond. In technical terms, it maps an utterance to an intent and entities and then specifies the actions the system performs in response to those inputs.
There are three participants in a Diatheke model:
Based on input from the user, Diatheke performs one or more pre-defined actions, which may include commands for the client to execute and replies to give to the user in the form of text or audio. For example,
It is important to understand the common natural language understanding (NLU) terms intent and entity before diving deeper into the Diatheke model details. The terms story and action are equally important. These terms are therefore defined below, with examples, before moving on to the details of building a Diatheke model.
There is usually an intent behind each utterance the user speaks. For example, if the user says “what’s the weather?”, the intent of the user is to learn the current weather. Extracting that intent from the text is a key component of an NLU system.
One of the challenges in intent detection is that the same intent may be phrased in many different ways. For example, a user who wants to find out the current weather might say “what’s the weather?”, “show me the weather report”, or “what’s it like outside?”
Even though the phrasing differs (these are different utterances), they all match the same intent and get the same response from the system.
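As a rough illustration of many utterances mapping to one intent, the Python sketch below uses a simple lookup table. It is not Diatheke's model format or matching algorithm; the intent name `get_weather` and the helpers `normalize` and `detect_intent` are hypothetical names chosen for this example, and a real NLU system would generalize beyond exact matches.

```python
import re

# Hypothetical sample utterances, keyed by an illustrative intent name.
SAMPLE_UTTERANCES = {
    "get_weather": [
        "what's the weather?",
        "show me the weather report",
        "what's it like outside?",
    ],
}


def normalize(text):
    """Lowercase, straighten curly apostrophes, and drop punctuation."""
    text = text.lower().replace("\u2019", "'")
    return re.sub(r"[^a-z0-9' ]", "", text).strip()


# Reverse index: normalized utterance -> intent name.
UTTERANCE_TO_INTENT = {
    normalize(utterance): intent
    for intent, utterances in SAMPLE_UTTERANCES.items()
    for utterance in utterances
}


def detect_intent(text):
    """Return the intent name for an utterance, or None if unrecognized."""
    return UTTERANCE_TO_INTENT.get(normalize(text))
```

With this table, `detect_intent("Show me the weather report")` and `detect_intent("What's it like outside?")` both resolve to the same intent, while an utterance outside the table returns `None`.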
An entity is a specific piece of information contained within an utterance. It usually comes from a set of acceptable or expected values.
For example, in “what is the weather in New York?”, “what’s the forecast for the weekend?”, “what will the weather be like in Boston on Saturday?”, the first utterance includes a location entity (New York), the second includes a date entity (weekend), and the third includes both (Boston, Saturday).
Sample utterances may contain placeholders called “slots” that mark where an entity fills in the blank, for example “what will the weather be like in ${location} on ${date}?”
Technically, a “slot” is part of an utterance and an “entity” is part of an intent, but since there’s a one-to-one mapping between slots and entities, the terms are often used interchangeably.
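To make the slot-to-entity mapping concrete, here is a minimal Python sketch that turns a sample utterance with `${...}` slots into a regular expression and reads the entity values from a matching utterance. The functions `template_to_regex` and `extract_entities` are hypothetical helpers for illustration only; how Diatheke itself fills slots is not described here.

```python
import re

# Matches a slot placeholder such as ${location} and captures its name.
SLOT = re.compile(r"\$\{(\w+)\}")


def template_to_regex(template):
    """Turn a slot template into a regex with one named group per slot."""
    parts = []
    pos = 0
    for m in SLOT.finditer(template):
        # Literal text before the slot is matched exactly.
        parts.append(re.escape(template[pos:m.start()]))
        # The slot itself becomes a lazy named capture group.
        parts.append(f"(?P<{m.group(1)}>.+?)")
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("^" + "".join(parts) + "$", re.IGNORECASE)


def extract_entities(template, utterance):
    """Return a dict of slot name -> entity value, or None if no match."""
    m = template_to_regex(template).match(utterance)
    return m.groupdict() if m else None


TEMPLATE = "what will the weather be like in ${location} on ${date}?"
entities = extract_entities(
    TEMPLATE, "What will the weather be like in Boston on Saturday?"
)
```

Each slot name in the template becomes the key for the corresponding entity value, which mirrors the one-to-one mapping between slots and entities described above.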
A story is the basic component of a dialogue flow. Each story is designed to accomplish a goal using interactions between Diatheke and the user.
A story defines the appropriate system responses for recognized intents using lists of actions.
This can be as simple as a single response to a question or can involve complex back-and-forth turns.
For example, a dialogue with a reservation system might include several turns, such as:
User: I’d like to make a reservation this Friday at 7 pm.
Diatheke: For how many people?
User: Two
Diatheke: I’m sorry, there are no tables available at 7 pm. Would you like 6:00 or 8:30?
User: No, those don’t work. What about Saturday?
Diatheke: There is an available table at 7:30 on Saturday, December 5. Would you like to book it?
User: Yes
Diatheke: Great. What name should we use?
User: Martinez
Diatheke: OK, we’ve booked a table for two for Martinez on Saturday, December 5 at 7:30 pm.
During the course of that exchange, the Diatheke system performed a number of steps:
Each of those steps would be defined as either an intent or an action within a “Make reservation” story.
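One way to picture a story is as a mapping from recognized intents to lists of actions. The Python sketch below is only an illustration under that assumption: the `Action` and `Story` classes, the action kinds `reply` and `command`, the intent names, and the `check_availability` command are all hypothetical and do not reflect the actual Diatheke model syntax.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str     # "reply" (text for the user) or "command" (for the client)
    payload: str  # reply text or command name


@dataclass
class Story:
    name: str
    # Maps an intent name to the list of actions performed in response.
    responses: dict = field(default_factory=dict)

    def on(self, intent, *actions):
        """Register the actions to perform when an intent is recognized."""
        self.responses[intent] = list(actions)
        return self

    def handle(self, intent):
        """Return the actions for an intent, or an empty list if unknown."""
        return self.responses.get(intent, [])


# Hypothetical fragment of a "Make reservation" story.
story = (
    Story("Make reservation")
    .on("request_reservation", Action("reply", "For how many people?"))
    .on(
        "give_party_size",
        Action("command", "check_availability"),
        Action("reply", "Would you like to book it?"),
    )
)
```

Here `story.handle("give_party_size")` yields a command for the client followed by a reply for the user, which matches the idea that one intent can trigger several actions in sequence.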