---
title: "Conversation API"
sidebar_label: "Conversation API"
---
Intents can be recognized from text and fired using the [conversation integration](https://www.home-assistant.io/integrations/conversation/).

An API endpoint is available which receives an input sentence and produces a [conversation response](#conversation-response). A "conversation" is tracked across multiple inputs and responses by passing a [conversation id](#conversation-id) generated by Home Assistant.

The API is available via both the REST API and the WebSocket API.

A sentence may be POSTed to `/api/conversation/process` like:
```json
{
  "text": "turn on the lights in the living room",
  "language": "en"
}
```
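
For example, the endpoint can be called from Python with the `requests` library and a long-lived access token. This is a minimal sketch; the host and token below are placeholders for your own installation:

```python
# Minimal sketch: send a sentence to the conversation API over REST.
import requests

HOST = "http://homeassistant.local:8123"  # placeholder for your instance
TOKEN = "<long-lived-access-token>"       # placeholder

result = requests.post(
    f"{HOST}/api/conversation/process",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"text": "turn on the lights in the living room", "language": "en"},
    timeout=10,
).json()

# The spoken reply, if the agent provided one (see the response format below)
print(result["response"]["speech"]["plain"]["speech"])
```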
Or sent via the WebSocket API like:
```json
{
  "type": "conversation/process",
  "text": "turn on the lights in the living room",
  "language": "en"
}
```
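
On a live connection, the WebSocket command is sent after the usual authentication handshake and carries a unique integer `id` like every other WebSocket command. A minimal sketch using the Python `websockets` package (host and token are placeholders):

```python
import asyncio
import json

import websockets

URI = "ws://homeassistant.local:8123/api/websocket"  # placeholder for your instance
TOKEN = "<long-lived-access-token>"                  # placeholder

async def main() -> None:
    async with websockets.connect(URI) as ws:
        await ws.recv()  # server sends "auth_required"
        await ws.send(json.dumps({"type": "auth", "access_token": TOKEN}))
        await ws.recv()  # "auth_ok" on success

        await ws.send(json.dumps({
            "id": 1,  # every WebSocket command needs a unique, increasing id
            "type": "conversation/process",
            "text": "turn on the lights in the living room",
            "language": "en",
        }))
        # The reply is a result message wrapping the conversation response
        print(json.loads(await ws.recv()))

asyncio.run(main())
```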
The following input fields are available:

| Name | Type | Description |
|-------------------|--------|---------------------------------------------------------------------------------------------|
| `text` | string | Input sentence. |
| `language` | string | Optional. Language of the input sentence (defaults to configured language). |
| `conversation_id` | string | Optional. Unique id to [track conversation](#conversation-id). Generated by Home Assistant. |
## Conversation response
The JSON response from `/api/conversation/process` contains information about the effect of the fired intent, for example:
```json
{
  "response": {
    "response_type": "action_done",
    "language": "en",
    "data": {
      "targets": [
        {
          "type": "area",
          "name": "Living Room",
          "id": "living_room"
        },
        {
          "type": "domain",
          "name": "light",
          "id": "light"
        }
      ],
      "success": [
        {
          "type": "entity",
          "name": "My Light",
          "id": "light.my_light"
        }
      ],
      "failed": []
    },
    "speech": {
      "plain": {
        "speech": "Turned Living Room lights on"
      }
    }
  },
  "conversation_id": "<generated-id-from-ha>"
}
```
The following properties are available in the `"response"` object:

| Name | Type | Description |
| --------------- | ---------- | ----------------------------------------------------------------------------------------- |
| `response_type` | string | One of `action_done`, `query_answer`, or `error` (see [response types](#response-types)). |
| `data`          | dictionary | Relevant data for each [response type](#response-types).                                   |
| `language` | string | The language of the intent and response. |
| `speech` | dictionary | Optional. Response text to speak to the user (see [speech](#speech)). |
The [conversation id](#conversation-id) is returned alongside the conversation response.
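
As an illustration of how a client might consume these fields, here is a small sketch (not part of the API) that dispatches on `response_type`, assuming `result` is a parsed response dictionary:

```python
def handle(result: dict) -> None:
    """Dispatch on the response_type of a parsed conversation response (illustrative only)."""
    response = result["response"]

    if response["response_type"] == "action_done":
        print("Affected targets:", response["data"]["targets"])
    elif response["response_type"] == "query_answer":
        print("Answer:", response["speech"]["plain"]["speech"])
    else:  # "error"
        print("Error code:", response["data"]["code"])
```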
## Response types
### Action done
The intent produced an action in Home Assistant, such as turning on a light. The `data` property of the response contains a `targets` list, where each target looks like:

| Name | Type | Description |
|------------|---------|----------------------------------------------------------------------------------------|
| `type` | string | Target type. One of `area`, `domain`, `device_class`, `device`, `entity`, or `custom`. |
| `name` | string | Name of the affected target. |
| `id` | string | Optional. Id of the target. |
Two additional lists are included, containing the entities or devices that succeeded (`success`) or failed (`failed`):
```json
{
  "response": {
    "response_type": "action_done",
    "data": {
      "targets": [
        (area or domain)
      ],
      "success": [
        (entities/devices that succeeded)
      ],
      "failed": [
        (entities/devices that failed)
      ]
    }
  }
}
```
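
For example, a client could use the `success` and `failed` lists to tell the user which entities were actually controlled. A small sketch, assuming `result` is a parsed `action_done` response:

```python
def summarize_action(result: dict) -> str:
    """Summarize an action_done response from its targets and success/failed lists."""
    data = result["response"]["data"]
    targets = ", ".join(t["name"] for t in data["targets"])
    failed = [item["name"] for item in data["failed"]]
    if failed:
        return f"Could not control {', '.join(failed)} ({targets})"
    succeeded = [item["name"] for item in data["success"]]
    return f"Controlled {', '.join(succeeded)} ({targets})"
```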
An intent can have multiple targets which are applied on top of each other. The targets must be ordered from general to specific:
* `area`
* A [registered area](https://developers.home-assistant.io/docs/area_registry_index/)
* `domain`
* Home Assistant integration domain, such as "light"
* `device_class`
* Device class for a domain, such as "garage_door" for the "cover" domain
* `device`
* A [registered device](https://developers.home-assistant.io/docs/device_registry_index)
* `entity`
* A [Home Assistant entity](https://developers.home-assistant.io/docs/architecture/devices-and-services)
* `custom`
* A custom target

Most intents end up with 0, 1, or 2 targets; three targets currently only occur when device classes are involved. Examples of target combinations:
* "Turn off all lights"
* 1 target: `domain:light`
* "Turn on the kitchen lights"
* 2 targets: `area:kitchen`, `domain:light`
* "Open the kitchen blinds"
* 3 targets: `area:kitchen`, `domain:cover`, `device_class:blind`
### Query answer
The response is an answer to a question, such as "what is the temperature?". See the [speech](#speech) property for the answer text.
```json
{
  "response": {
    "response_type": "query_answer",
    "language": "en",
    "speech": {
      "plain": {
        "speech": "It is 65 degrees"
      }
    },
    "data": {
      "targets": [
        {
          "type": "domain",
          "name": "climate",
          "id": "climate"
        }
      ],
      "success": [
        {
          "type": "entity",
          "name": "Ecobee",
          "id": "climate.ecobee"
        }
      ],
      "failed": []
    }
  },
  "conversation_id": "<generated-id-from-ha>"
}
```
### Error
An error occurred either during intent recognition or handling. See `data.code` for the specific type of error, and the [speech](#speech) property for the error message.
```json
{
  "response": {
    "response_type": "error",
    "language": "en",
    "data": {
      "code": "no_intent_match"
    },
    "speech": {
      "plain": {
        "speech": "Sorry, I didn't understand that"
      }
    }
  }
}
```
`data.code` is a string that can be one of:
* `no_intent_match` - The input text did not match any intents.
* `no_valid_targets` - The targeted area, device, or entity does not exist.
* `failed_to_handle` - An unexpected error occurred while handling the intent.
* `unknown` - An error occurred outside the scope of intent processing.
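
How a client reacts to these codes is up to the client. For example, a hypothetical helper might map them to user-facing hints:

```python
def error_hint(result: dict) -> str:
    """Map data.code of an error response to a hint for the user (illustrative only)."""
    code = result["response"]["data"]["code"]
    return {
        "no_intent_match": "Try rephrasing the sentence.",
        "no_valid_targets": "Check the area, device, or entity name.",
        "failed_to_handle": "The intent handler raised an unexpected error.",
        "unknown": "Something went wrong outside of intent processing.",
    }.get(code, f"Unhandled error code: {code}")
```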
## Speech
The spoken response to the user is provided in the `speech` property of the response. It can either be plain text (the default), or [SSML](https://www.w3.org/TR/speech-synthesis11/).
For plain text speech, the response will look like:
```json
{
  "response": {
    "response_type": "...",
    "speech": {
      "plain": {
        "speech": "...",
        "extra_data": null
      }
    }
  },
  "conversation_id": "<generated-id-from-ha>"
}
```
If the speech is [SSML](https://www.w3.org/TR/speech-synthesis11/), it will instead be:
```json
{
  "response": {
    "response_type": "...",
    "speech": {
      "ssml": {
        "speech": "...",
        "extra_data": null
      }
    }
  },
  "conversation_id": "<generated-id-from-ha>"
}
```
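
Because the response carries either a `plain` or an `ssml` block, a client can check which key is present. A minimal sketch:

```python
def get_speech(result: dict) -> tuple[str, str]:
    """Return ("plain" or "ssml", spoken text) from a conversation response."""
    speech = result["response"].get("speech", {})
    if "ssml" in speech:
        return "ssml", speech["ssml"]["speech"]
    return "plain", speech.get("plain", {}).get("speech", "")
```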
## Conversation Id
If the answering conversation agent supports it, a conversation can be tracked by a unique id generated by Home Assistant. To continue a conversation, retrieve the `conversation_id` from the HTTP API response (alongside the [conversation response](#conversation-response)) and add it to the next input sentence:

Initial input sentence:
```json
{
  "text": "Initial input sentence."
}
```
JSON response contains conversation id:
```json
{
  "conversation_id": "<generated-id-from-ha>",
  "response": {
    (conversation response)
  }
}
```
POST with the next input sentence:
```json
{
  "text": "Related input sentence.",
  "conversation_id": "<generated-id-from-ha>"
}
```
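
Putting this together over the REST API, a follow-up turn simply echoes the id from the previous response. A minimal sketch, using the same placeholder host and token as in the earlier example:

```python
import requests

HOST = "http://homeassistant.local:8123"  # placeholder for your instance
TOKEN = "<long-lived-access-token>"       # placeholder
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

def converse(text: str, conversation_id: str | None = None) -> dict:
    """Send one sentence, carrying over the conversation id from the previous turn."""
    payload = {"text": text}
    if conversation_id is not None:
        payload["conversation_id"] = conversation_id
    return requests.post(
        f"{HOST}/api/conversation/process", headers=HEADERS, json=payload, timeout=10
    ).json()

first = converse("Initial input sentence.")
second = converse("Related input sentence.", first["conversation_id"])
```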
## Pre-loading sentences
Sentences for a language can be pre-loaded using the WebSocket API:
```json
{
  "type": "conversation/prepare",
  "language": "en"
}
```
The following input fields are available:

| Name | Type | Description |
|------------|--------|--------------------------------------------------------------------------------|
| `language` | string | Optional. Language of the sentences to load (defaults to configured language). |
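
On an authenticated WebSocket connection this is an ordinary command, so it is sent the same way as `conversation/process` in the earlier sketch. For example (host and token are again placeholders):

```python
import asyncio
import json

import websockets

URI = "ws://homeassistant.local:8123/api/websocket"  # placeholder for your instance
TOKEN = "<long-lived-access-token>"                  # placeholder

async def prepare(language: str = "en") -> None:
    """Authenticate, then ask the agent to pre-load sentences for a language."""
    async with websockets.connect(URI) as ws:
        await ws.recv()  # "auth_required"
        await ws.send(json.dumps({"type": "auth", "access_token": TOKEN}))
        await ws.recv()  # "auth_ok" on success
        await ws.send(json.dumps({"id": 1, "type": "conversation/prepare", "language": language}))
        print(json.loads(await ws.recv()))  # result message

asyncio.run(prepare())
```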