202 lines
8.5 KiB
Markdown
202 lines
8.5 KiB
Markdown
# MSC4139: Bot buttons & conversations
|
|
|
|
Nearly all bots and bridges in the Matrix ecosystem use a text-based interface to support their
|
|
operations. These interfaces are typically highly structured commands and require the user to know
|
|
the entire incantation for the action they want to invoke, making them feel like "power user"
|
|
features.
|
|
|
|
Further, interacting with bots today is extremely transactional: the user sends a command and the
|
|
bot performs the action as-is or spews errors back at the user due to a typo. If an error was
|
|
returned, the entire command needs to be re-run.
|
|
|
|
A more user-friendly approach is to have the user provide the bot with information as needed,
|
|
without having to guess at the bot's current state. This proposal calls such an approach a
|
|
"conversation" with the bot - the user does something to "start" the conversation, and the bot
|
|
provides a limited set of prompts to continue the conversation. This repeats until the conversation
|
|
ends (usually by the bot saying so explicitly). Users may hold multiple concurrent conversations
|
|
with bots. Conversation starters are deliberately left as a bot implementation detail in this
|
|
proposal to allow the ecosystem to explore this new interaction technique. Examples may include the
|
|
user opening a DM with the bot, sending a `!command` message, or, in future, sending a slash command
|
|
like `/start`.
|
|
|
|
This conversation approach is heavily inspired by platforms like Telegram.
|
|
|
|
## Proposal
|
|
|
|
A new `m.prompts` [mixin](https://github.com/matrix-org/matrix-spec-proposals/blob/main/proposals/1767-extensible-events.md#mixins-specifically-allowed)
|
|
is specified which describes actions another user in the room can take to further the conversation.
|
|
|
|
The `m.prompts` mixin contains some scoping parameters, rendering hints, and the actual prompts
|
|
themselves. For example, when applied to an `m.message` event, the `m.prompts` may look like the
|
|
following:
|
|
|
|
*Note*: The JSON comments are normative, and irrelevant fields are not shown.
|
|
|
|
```jsonc
|
|
{
|
|
"type": "m.message",
|
|
"sender": "@bot:example.org",
|
|
"content": {
|
|
"m.text": [
|
|
{"body": "Hello! Say <code>!roll [dice]</code> to roll some dice.", "mimetype": "text/html"},
|
|
{"body": "Hello! Say `!roll [dice]` to get started."}
|
|
],
|
|
"m.prompts": {
|
|
// Clients which recognize `m.prompts` would use `intro` to render the event instead. This
|
|
// allows the remainder of the event to be a fallback for unsupported clients.
|
|
"intro": {
|
|
"type": "m.message",
|
|
"content": {
|
|
"m.text": [
|
|
{"body": "Hello! What would you like to roll today?"}
|
|
]
|
|
}
|
|
},
|
|
// These are the users who should see the `prompts`. Other users may see something like "you
|
|
// do not have permission to reply to this message" instead of prompts. `scope` is optional:
|
|
// when not supplied, all users who can see the message can respond. When an empty array, no
|
|
// one can respond. Clients SHOULD NOT show prompts to users who are descoped.
|
|
"scope": [
|
|
"@alice:example.org",
|
|
"@bob:example.org",
|
|
],
|
|
// These are the options a user has. Note the 2 distinct types and 3 label approaches.
|
|
"prompts": [
|
|
{
|
|
// `type` is the prompt type: "preset" (show a button) or "input" (shown below)
|
|
"type": "preset",
|
|
// `id` is used by the bot to figure out what prompt the user picked. It is an opaque ID.
|
|
"id": "1d6",
|
|
// `label` is an extensible event with deliberately no `type`.
|
|
"label": {
|
|
"m.text": [{"body": "1 six sided die"}]
|
|
}
|
|
}, {
|
|
"type": "preset",
|
|
"id": "surprise",
|
|
"label": {
|
|
// This should render as an image event, hopefully
|
|
// Requires https://github.com/matrix-org/matrix-spec-proposals/pull/3552
|
|
"m.text": [{"body": "🎲❓"}], // fallback
|
|
"m.file": {
|
|
"url": "mxc://example.org/abc123"
|
|
},
|
|
"m.image_details": {
|
|
// Clients should impose maximums and minimums here.
|
|
"width": 16,
|
|
"height": 16
|
|
},
|
|
"m.alt_text": {
|
|
"m.text": [{"body": "An image of a 6 sided die with a red question mark over it"}]
|
|
}
|
|
}
|
|
}, {
|
|
"type": "input",
|
|
"id": "custom",
|
|
// Regex the client can use to test input locally. Optional - if not provided the client
|
|
// should accept *any* input, including an empty string.
|
|
"validator": "[0-9]+d[0-9]+", // `2d20`, etc
|
|
"label": {
|
|
"m.text": [{"body": "Other"}]
|
|
}
|
|
}
|
|
]
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
In this example, clients which don't support the mixin will see the old-style `!roll 2d6` help text,
|
|
allowing the user to continue interacting if needed. Over time, bots may wish to drop this fallback
|
|
style and instead use a message like `Hello! Your client doesn't support talking to me :(`.
|
|
|
|
Clients which do support `m.prompts` will instead render the `intro` object as the event. It's not
|
|
required that the `intro.type` matches the top level event `type`, though it is considered good
|
|
practice to do so. The `intro` block is primarily intended to allow senders to tailor their message
|
|
for supported clients, as the intent for this proposal is to discourage commands like `!roll` where
|
|
possible.
|
|
|
|
Prompts SHOULD be rendered in order of the array, and appear below the `intro` rendering. Buttons
|
|
SHOULD be used for `preset` prompts, using the provided `label`, and text inputs with `label` as a
|
|
prefix or placeholder, and validation per `validator`, SHOULD be used for `input` prompts. For
|
|
example:
|
|
|
|
![](./images/4139-01-dice-bot-welcome.png)
|
|
|
|
[Codepen](https://codepen.io/turt2live/pen/gOyVvaY) (note: doesn't do validation)
|
|
|
|
The user is then able to click on one of the buttons or submit text through the `input` option. That
|
|
reply looks as follows:
|
|
|
|
```jsonc
|
|
{
|
|
"type": "m.conversation.reply",
|
|
"sender": "@alice:example.org",
|
|
"content": {
|
|
"m.in_reply_to": { // TODO: Change to match Extensible Event replies
|
|
"event_id": "$previousMessage",
|
|
"rel_type": "m.thread" // yes, we use threads!
|
|
},
|
|
// Whichever option the user clicked is described here in a new content block.
|
|
"m.used_prompt": {
|
|
"id": "surprise"
|
|
},
|
|
// We then add all the fallback representations. For `preset` prompts, this is typically just
|
|
// the `label` verbatim. `input` prompts may require some creative editing, like "Other: 2d20".
|
|
"m.text": [{"body": "🎲❓"}], // fallback for the image
|
|
"m.file": {
|
|
"url": "mxc://example.org/abc123"
|
|
},
|
|
"m.image_details": {
|
|
"width": 16,
|
|
"height": 16
|
|
},
|
|
"m.alt_text": {
|
|
"m.text": [{"body": "An image of a 6 sided die with a red question mark over it"}]
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
The bot can then process this and continue the conversation as needed, using more `m.prompts` mixins
|
|
to get the information it needs from the user. If the bot considers the conversation/thread to be
|
|
complete, it sends an event with no `m.prompts` mixin to the thread. In our example of a dice bot,
|
|
this could be the result of the roll.
|
|
|
|
Once a user has picked (and sent) a prompt, the client SHOULD disable the user's ability to send
|
|
another. This could be done by hiding all options, or using the HTML `disabled` attribute.
|
|
|
|
The example dice bot would then start a new conversation by sending a new welcome message, likely
|
|
with different text to feel less mechanical. For example: "What are we rolling next? [1d6] [...]".
|
|
|
|
It is left as a bot implementation detail to handle multiple responses, responses from descoped
|
|
users, and invalid input. Typically this would be handled by the bot using a threaded reply to the
|
|
sender saying "sorry, you don't have permission to interact here" or "sorry, I didn't catch that.
|
|
[same prompts as original message]".
|
|
|
|
## Potential issues
|
|
|
|
TODO
|
|
|
|
## Alternatives
|
|
|
|
[MSC3006](https://github.com/matrix-org/matrix-spec-proposals/pull/3006) is very similar to this
|
|
proposal. Instead of starting per-message threads, it defines interactions via a state event. This
|
|
makes MSC3006 more akin to a "conversation starter" replacement, to use this MSC's terminology.
|
|
|
|
## Security considerations
|
|
|
|
TODO
|
|
|
|
## Unstable prefix
|
|
|
|
While this proposal is not considered stable, clients should use `org.matrix.msc4139.` in place of
|
|
`m.` in all identifiers.
|
|
|
|
TODO: Language to support usage in room versions without Extensible Events support, similar to
|
|
[MSC3381: Polls](https://github.com/matrix-org/matrix-spec-proposals/blob/main/proposals/3381-polls.md).
|
|
|
|
## Dependencies
|
|
|
|
This MSC has no direct dependencies.
|