---
title: "Conversation API"
sidebar_label: "Conversation API"
---

Intents can be recognized from text and fired using the [conversation integration](https://www.home-assistant.io/integrations/conversation/).

An API endpoint is available which receives an input sentence and produces a [conversation response](#conversation-response). A "conversation" is tracked across multiple inputs and responses by passing a [conversation id](#conversation-id) generated by Home Assistant.

The API is available via the REST API and WebSocket API.

A sentence may be POST-ed to `/api/conversation/process` like:

```json
{
  "text": "turn on the lights in the living room",
  "language": "en"
}
```
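
As a rough illustration, the POST above could be prepared from Python with only the standard library. This is a minimal sketch, not the official client: the base URL `http://homeassistant.local:8123` and the token placeholder are assumptions, `build_process_request` is a hypothetical helper name, and the request is built but not sent so the example runs without a Home Assistant instance. It assumes the REST API's usual bearer-token authentication with a long-lived access token.

```python
import json
import urllib.request


def build_process_request(base_url, token, text, language=None, conversation_id=None):
    """Build (but do not send) a POST request for /api/conversation/process."""
    payload = {"text": text}
    # Optional fields are only included when set, so the server defaults apply.
    if language is not None:
        payload["language"] = language
    if conversation_id is not None:
        payload["conversation_id"] = conversation_id
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/conversation/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_process_request(
    "http://homeassistant.local:8123",  # assumed address of the instance
    "YOUR_LONG_LIVED_ACCESS_TOKEN",  # placeholder, not a real token
    "turn on the lights in the living room",
    language="en",
)

# To actually send the request against a running instance:
# with urllib.request.urlopen(request) as resp:
#     result = json.load(resp)
```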

A sentence can also be sent via the WebSocket API:

```json
{
  "type": "conversation/process",
  "text": "turn on the lights in the living room",
  "language": "en"
}
```
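
On a live WebSocket connection, the message above is normally sent after the authentication handshake and carries an integer `id` that increases with every command, following the general Home Assistant WebSocket API convention. The sketch below only serializes such messages; `CommandBuilder` is a hypothetical helper, and the connection itself (library, URL, handshake) is deliberately left out.

```python
import itertools
import json


class CommandBuilder:
    """Serializes conversation/process commands with increasing ids."""

    def __init__(self):
        # Assumption: ids start at 1 and this builder owns the whole connection.
        self._ids = itertools.count(1)

    def process(self, text, language=None, conversation_id=None):
        message = {
            "id": next(self._ids),
            "type": "conversation/process",
            "text": text,
        }
        # Optional fields are omitted when unset.
        if language is not None:
            message["language"] = language
        if conversation_id is not None:
            message["conversation_id"] = conversation_id
        return json.dumps(message)


builder = CommandBuilder()
first = builder.process("turn on the lights in the living room", language="en")
second = builder.process("turn on the lights in the kitchen")
```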

The following input fields are available:

| Name              | Type   | Description                                                                                 |
|-------------------|--------|---------------------------------------------------------------------------------------------|
| `text`            | string | Input sentence.                                                                             |
| `language`        | string | Optional. Language of the input sentence (defaults to configured language).                 |
| `conversation_id` | string | Optional. Unique id to [track conversation](#conversation-id). Generated by Home Assistant. |

## Conversation response

The JSON response from `/api/conversation/process` contains information about the effect of the fired intent, for example:

```json
{
  "response": {
    "response_type": "action_done",
    "language": "en",
    "data": {
      "targets": [
        {
          "type": "area",
          "name": "Living Room",
          "id": "living_room"
        },
        {
          "type": "domain",
          "name": "light",
          "id": "light"
        }
      ],
      "success": [
        {
          "type": "entity",
          "name": "My Light",
          "id": "light.my_light"
        }
      ],
      "failed": []
    },
    "speech": {
      "plain": {
        "speech": "Turned Living Room lights on"
      }
    }
  },
  "conversation_id": "<generated-id-from-ha>"
}
```

The following properties are available in the `"response"` object:

| Name            | Type       | Description                                                                               |
| --------------- | ---------- | ----------------------------------------------------------------------------------------- |
| `response_type` | string     | One of `action_done`, `query_answer`, or `error` (see [response types](#response-types)). |
| `data`          | dictionary | Relevant data for each [response type](#response-types).                                  |
| `language`      | string     | The language of the intent and response.                                                  |
| `speech`        | dictionary | Optional. Response text to speak to the user (see [speech](#speech)).                     |

The [conversation id](#conversation-id) is returned alongside the conversation response.
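
A client might flatten such a response into the few fields it cares about. The following is a minimal sketch under that assumption; `summarize_response` is a hypothetical helper, and the sample payload simply mirrors the example response in this section.

```python
def summarize_response(payload):
    """Pick out the main fields of a conversation response payload."""
    response = payload["response"]
    data = response.get("data", {})
    return {
        "conversation_id": payload.get("conversation_id"),
        "response_type": response["response_type"],
        "language": response.get("language"),
        # Names of the devices or entities the action succeeded or failed on.
        "succeeded": [target["name"] for target in data.get("success", [])],
        "failed": [target["name"] for target in data.get("failed", [])],
    }


sample = {
    "response": {
        "response_type": "action_done",
        "language": "en",
        "data": {
            "targets": [
                {"type": "area", "name": "Living Room", "id": "living_room"},
                {"type": "domain", "name": "light", "id": "light"},
            ],
            "success": [
                {"type": "entity", "name": "My Light", "id": "light.my_light"}
            ],
            "failed": [],
        },
        "speech": {"plain": {"speech": "Turned Living Room lights on"}},
    },
    "conversation_id": "<generated-id-from-ha>",
}

summary = summarize_response(sample)
```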


## Response types

### Action done

The intent produced an action in Home Assistant, such as turning on a light. The `data` property of the response contains a `targets` list, where each target looks like:

| Name       | Type    | Description                                                                            |
|------------|---------|----------------------------------------------------------------------------------------|
| `type`     | string  | Target type. One of `area`, `domain`, `device_class`, `device`, `entity`, or `custom`. |
| `name`     | string  | Name of the affected target.                                                           |
| `id`       | string  | Optional. Id of the target.                                                            |

Two additional target lists, `success` and `failed`, contain the devices or entities for which the action succeeded or failed:

```json
{
  "response": {
    "response_type": "action_done",
    "data": {
      "targets": [
        (area or domain)
      ],
      "success": [
        (entities/devices that succeeded)
      ],
      "failed": [
        (entities/devices that failed)
      ]
    }
  }
}
```

An intent can have multiple targets which are applied on top of each other. The targets must be ordered from general to specific:

* `area`
  * A [registered area](https://developers.home-assistant.io/docs/area_registry_index/)
* `domain`
  * Home Assistant integration domain, such as "light"
* `device_class`
  * Device class for a domain, such as "garage_door" for the "cover" domain
* `device`
  * A [registered device](https://developers.home-assistant.io/docs/device_registry_index)
* `entity`
  * A [Home Assistant entity](https://developers.home-assistant.io/docs/architecture/devices-and-services)
* `custom`
  * A custom target

Most intents end up with 0, 1, or 2 targets. Three targets currently occur only when device classes are involved. Examples of target combinations:

* "Turn off all lights"
  * 1 target: `domain:light`
* "Turn on the kitchen lights"
  * 2 targets: `area:kitchen`, `domain:light`
* "Open the kitchen blinds"
  * 3 targets: `area:kitchen`, `domain:cover`, `device_class:blind`
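
The general-to-specific ordering can be checked on the client side. The sketch below treats the order of the list above as a ranking, which is an assumption of this example (in particular, placing `custom` last); `is_general_to_specific` and `describe_targets` are hypothetical helpers, and the target names and ids are made up to match the "Open the kitchen blinds" example.

```python
# Ranking taken from the order of the list above (assumption of this sketch).
TARGET_ORDER = ["area", "domain", "device_class", "device", "entity", "custom"]


def is_general_to_specific(targets):
    """Return True if the targets are ordered from general to specific."""
    ranks = [TARGET_ORDER.index(target["type"]) for target in targets]
    return ranks == sorted(ranks)


def describe_targets(targets):
    """Render targets in the `type:id` shorthand, falling back to the name."""
    return [f'{target["type"]}:{target.get("id", target["name"])}' for target in targets]


kitchen_blinds = [
    {"type": "area", "name": "Kitchen", "id": "kitchen"},
    {"type": "domain", "name": "cover", "id": "cover"},
    {"type": "device_class", "name": "blind", "id": "blind"},
]

ordered = is_general_to_specific(kitchen_blinds)
shorthand = describe_targets(kitchen_blinds)
```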


### Query answer

The response is an answer to a question, such as "what is the temperature?". See the [speech](#speech) property for the answer text.

```json
{
  "response": {
    "response_type": "query_answer",
    "language": "en",
    "speech": {
      "plain": {
        "speech": "It is 65 degrees"
      }
    },
    "data": {
      "targets": [
        {
          "type": "domain",
          "name": "climate",
          "id": "climate"
        }
      ],
      "success": [
        {
          "type": "entity",
          "name": "Ecobee",
          "id": "climate.ecobee"
        }
      ],
      "failed": []
    }
  },
  "conversation_id": "<generated-id-from-ha>"
}
```


### Error

An error occurred either during intent recognition or handling. See `data.code` for the specific type of error, and the [speech](#speech) property for the error message.

```json
{
  "response": {
    "response_type": "error",
    "language": "en",
    "data": {
      "code": "no_intent_match"
    },
    "speech": {
      "plain": {
        "speech": "Sorry, I didn't understand that"
      }
    }
  }
}
```

`data.code` is a string that can be one of:

* `no_intent_match` - The input text did not match any intents.
* `no_valid_targets` - The targeted area, device, or entity does not exist.
* `failed_to_handle` - An unexpected error occurred while handling the intent.
* `unknown` - An error occurred outside the scope of intent processing.
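
A client could map these codes to its own handling. This is only a sketch: `explain_error` is a hypothetical helper, the descriptions are copied from the list above, and the generic fallback for codes not in that list is an assumption of this example rather than documented behavior.

```python
ERROR_CODES = {
    "no_intent_match": "The input text did not match any intents.",
    "no_valid_targets": "The targeted area, device, or entity does not exist.",
    "failed_to_handle": "An unexpected error occurred while handling the intent.",
    "unknown": "An error occurred outside the scope of intent processing.",
}


def explain_error(payload):
    """Return (code, description) for an error response, or None otherwise."""
    response = payload["response"]
    if response["response_type"] != "error":
        return None
    code = response["data"]["code"]
    # Assumption: codes missing from the table get a generic description.
    return code, ERROR_CODES.get(code, "Unrecognized error code.")


error_payload = {
    "response": {
        "response_type": "error",
        "language": "en",
        "data": {"code": "no_intent_match"},
        "speech": {"plain": {"speech": "Sorry, I didn't understand that"}},
    }
}

explanation = explain_error(error_payload)
```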


## Speech

The spoken response to the user is provided in the `speech` property of the response. It can either be plain text (the default), or [SSML](https://www.w3.org/TR/speech-synthesis11/).

For plain text speech, the response will look like:

```json
{
  "response": {
    "response_type": "...",
    "speech": {
      "plain": {
        "speech": "...",
        "extra_data": null
      }
    }
  },
  "conversation_id": "<generated-id-from-ha>"
}
```

If the speech is [SSML](https://www.w3.org/TR/speech-synthesis11/), it will instead be:

```json
{
  "response": {
    "response_type": "...",
    "speech": {
      "ssml": {
        "speech": "...",
        "extra_data": null
      }
    }
  },
  "conversation_id": "<generated-id-from-ha>"
}
```
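
Because `speech` is optional and may hold either a `plain` or an `ssml` entry, a client has to check which one is present. A minimal sketch, assuming the two shapes shown above; `extract_speech` is a hypothetical helper, and checking `plain` before `ssml` is a choice made for this example.

```python
def extract_speech(payload):
    """Return (kind, text) where kind is "plain" or "ssml", or None if absent."""
    speech = payload["response"].get("speech") or {}
    for kind in ("plain", "ssml"):
        if kind in speech:
            return kind, speech[kind]["speech"]
    return None


plain_result = extract_speech(
    {"response": {"response_type": "query_answer",
                  "speech": {"plain": {"speech": "It is 65 degrees", "extra_data": None}}}}
)
ssml_result = extract_speech(
    {"response": {"response_type": "query_answer",
                  "speech": {"ssml": {"speech": "<speak>It is 65 degrees</speak>", "extra_data": None}}}}
)
missing_result = extract_speech({"response": {"response_type": "action_done"}})
```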

## Conversation Id

Conversations can be tracked by a unique id generated from within Home Assistant if supported by the answering conversation agent. To continue a conversation, retrieve the `conversation_id` from the HTTP API response (alongside the [conversation response](#conversation-response)) and add it to the next input sentence:

Initial input sentence:

```json
{
  "text": "Initial input sentence."
}
```

JSON response contains conversation id:

```json
{
  "conversation_id": "<generated-id-from-ha>",
  "response": {
    (conversation response)
  }
}
```

POST with the next input sentence:

```json
{
  "text": "Related input sentence.",
  "conversation_id": "<generated-id-from-ha>"
}
```
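
The three steps above can be sketched as a small helper that remembers the id between turns. `ConversationSession` is a hypothetical class, the reply is simulated rather than fetched from Home Assistant, and `"abc123"` is a made-up id. The sketch assumes that an agent without conversation tracking returns no usable id, in which case no `conversation_id` is sent on the next turn.

```python
class ConversationSession:
    """Remembers the conversation id returned by Home Assistant between turns."""

    def __init__(self):
        self.conversation_id = None

    def build_input(self, text):
        payload = {"text": text}
        # Only attach the id once a previous response has provided one.
        if self.conversation_id is not None:
            payload["conversation_id"] = self.conversation_id
        return payload

    def record_response(self, response_payload):
        # A missing or null id leaves the session without a conversation id.
        self.conversation_id = response_payload.get("conversation_id")


session = ConversationSession()
first_input = session.build_input("Initial input sentence.")
session.record_response({"conversation_id": "abc123", "response": {}})  # simulated reply
next_input = session.build_input("Related input sentence.")
```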


## Pre-loading sentences

Sentences for a language can be pre-loaded using the WebSocket API:

```json
{
  "type": "conversation/prepare",
  "language": "en"
}
```
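
A client could serialize this command as follows. This is a sketch only: `build_prepare_message` is a hypothetical helper, and the integer `id` field is an assumption based on the general WebSocket API convention that each command carries one; it is not shown in the example above.

```python
import json


def build_prepare_message(message_id, language=None):
    """Serialize a conversation/prepare command."""
    message = {"id": message_id, "type": "conversation/prepare"}
    # Leaving out the language lets Home Assistant use the configured one.
    if language is not None:
        message["language"] = language
    return json.dumps(message)


prepare_en = build_prepare_message(1, language="en")
prepare_default = build_prepare_message(2)
```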

The following input fields are available:

| Name       | Type   | Description                                                                    |
|------------|--------|--------------------------------------------------------------------------------|
| `language` | string | Optional. Language of the sentences to load (defaults to configured language). |