249 lines
12 KiB
Markdown
249 lines
12 KiB
Markdown
# MSC3819: Allowing widgets to send/receive to-device messages
|
|
|
|
Widgets (embedded HTML applications in Matrix) currently have a relatively large surface area
|
|
they can use for interacting with their attached client, primarily in the context of a room. They
|
|
can [send/receive events with MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762),
|
|
[navigate to rooms with MSC2931](https://github.com/matrix-org/matrix-spec-proposals/pull/2931),
|
|
and even [open dialogs with MSC2790](https://github.com/matrix-org/matrix-spec-proposals/pull/2790),
|
|
but they can't act as a whole other Matrix client just yet.
|
|
|
|
This MSC forms part of a larger, ongoing, question about how to embed other Matrix clients into another
|
|
client or room for access. An increasingly more popular client development option is to build out an
|
|
entirely new Matrix client and want to embed that within another client (as a widget) to avoid the
|
|
user needing to switch apps. To support this, we need to consider both long term and short term impact
|
|
of the changes we propose. This MSC aims closer to the short term.
|
|
|
|
A longer term solution to the problem of clients wanting to be embedded in other clients might still
|
|
be widgets, though with a system like [MSC3008](https://github.com/matrix-org/matrix-spec-proposals/pull/3008)
|
|
to restrict access to the client-server API more effectively. For this MSC's purpose though, we're
|
|
aiming to cover a specific subset of the client-server API: to-device messages.
|
|
|
|
While we could expose the entire client-server API over `postMessage` (or similar) for embedded
|
|
clients to access, the permissions model gets hairy and difficult to secure on the client side. Instead,
|
|
we're exploring what it would look like to special case what is needed for specific applications, as
|
|
needed, starting with to-device messages.
|
|
|
|
To-device messaging is described [here](https://spec.matrix.org/v1.2/client-server-api/#send-to-device-messaging)
|
|
with practical applications for widget-ized clients being implementations of
|
|
[MSC3401 - Native group VoIP](https://github.com/matrix-org/matrix-spec-proposals/pull/3401) for now.
|
|
|
|
## Prerequisite background
|
|
|
|
*Author's note: This is copied from [MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762).*
|
|
|
|
Widgets are relatively new to Matrix and so the terminology and behaviour might not be known to all
|
|
readers. This section should clarify the components of widgets that are applicable to this MSC without
|
|
going on a deep dive into widgets in general.
|
|
|
|
Widgets are embedded HTML/JS/CSS applications in a client which use the `postMessage` API to talk
|
|
to the client. This communication allows widgets to provide enhanced functionality such as sticker
|
|
pickers (when applied to a user) or performance dashboards (in rooms).
|
|
|
|
One of the first things that happens over this communication channel is a "capabilities negotiation"
|
|
where the client asks the widget what permissions it wants, and the widget replies with its ideal
|
|
set. The client then either decides or asks the user if the permissions requested are okay.
|
|
|
|
All communication over the channel is done in a simple request/response flow, using actions to
|
|
describe the request. For the capabilities negotiation, this would be the client sending the widget
|
|
a request with an `action` of `capabilities`, and the widget would respond to that request with a
|
|
response object.
|
|
|
|
The channel in which communication occurs is called a "session", where the session is "established"
|
|
after the capabilities negotiation. Sessions can only be terminated by the client.
|
|
|
|
The Widget API is split into two parts: `toWidget` (client->widget) and `fromWidget` (widget->client).
|
|
They are differentiated by where the request originates.
|
|
|
|
## Proposal
|
|
|
|
Inspired heavily from [MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762), we
|
|
introduce new capabilities for to-device messages:
|
|
|
|
* `m.send.to_device:<event type>` (eg: `m.send.to_device:m.call.invite`) - Used for sending to-device
|
|
messages of a given type.
|
|
* `m.receive.to_device:<event type>` (eg: `m.receive.to_device:m.call.invite`) - Used for receiving
|
|
to-device messages of a given type.
|
|
|
|
These capabilities open up access to the following respective actions, when approved:
|
|
|
|
**`fromWidget` action of `send_to_device`**
|
|
|
|
```json5
|
|
{
|
|
// This is a standardized widget API request.
|
|
"api": "fromWidget",
|
|
"widgetId": "20200827_WidgetExample",
|
|
"requestid": "generated-id-1234",
|
|
"action": "send_to_device", // value defined by this proposal
|
|
"data": {
|
|
// Same structure as the `/sendToDevice` HTTP API request body
|
|
"@target:example.org": {
|
|
"DEVICEID": { // can also be a `*` to denote "all of the user's devices"
|
|
"example_content": "put your real message here"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
The client upon receipt of this will validate that the widget has an appropriate capability to send
|
|
the to-device message. If the widget is approved for such a capability, the client **MUST** encrypt
|
|
the message by default unless the event is already encrypted by the widget (this MSC doesn't provide
|
|
enough API surface for a widget to do this, but in future it might be possible for the widget to
|
|
gain some context of the encryption state for the client and use that to make/manage Olm sessions).
|
|
The encrypted message is then sent as requested to the users/devices using
|
|
[`/sendToDevice`](https://spec.matrix.org/v1.2/client-server-api/#put_matrixclientv3sendtodeviceeventtypetxnid).
|
|
|
|
If the widget doesn't have appropriate permission, or an error occurs anywhere along the send path,
|
|
a standardized widget error response is returned.
|
|
|
|
Under the widget API, a response to all actions is required and takes the shape of repeating the
|
|
request with an added top-level `response` field. This `response` field is empty for this action,
|
|
as shown:
|
|
|
|
```json5
|
|
{
|
|
"api": "fromWidget",
|
|
"widgetId": "20200827_WidgetExample",
|
|
"requestid": "generated-id-1234",
|
|
"action": "send_to_device",
|
|
"data": {
|
|
"@target:example.org": {
|
|
"DEVICEID": {
|
|
"example_content": "put your real message here"
|
|
}
|
|
}
|
|
},
|
|
"response": {}
|
|
}
|
|
```
|
|
|
|
The client *should not* send a response to the action until the server has returned 200 OK itself,
|
|
which might take longer than the default widget API timeout of 10 seconds. Widgets should raise their
|
|
maximum timeout to 60 seconds or more for this action.
|
|
|
|
**`toWidget` action of `send_to_device`**
|
|
|
|
*Note*: It is common practice to name the action in favour of the direction of travel rather than try
|
|
and determine an alternative name. This does mean that there are two `send_to_device` actions: one
|
|
for widget->client and one for client->widget. This section is talking about client->widget.
|
|
|
|
After the client has decrypted all to-device messages it receives, it determines if any widgets should
|
|
be made aware of the contents within. The decrypted event type for the message is used to determine
|
|
if the widget has appropriate capability to see the message.
|
|
|
|
The client should process all to-device messages it can before sending them off to the widget. Even if
|
|
the client does process a message though, it should still send it to the appropriate widgets for
|
|
potential re-processing. This is to avoid a scenario where the host client can no longer reliably
|
|
function, such as if Olm sessions get corrupted or similar.
|
|
|
|
The client should be aware that to-device messages might be seen which the client *could* handle, but
|
|
might not have context on, such as VoIP signaling. The client should not error out if it can't locate
|
|
a matching call, for example.
|
|
|
|
The client SHOULD only send events which were received by the client *after* the session has been
|
|
established with the widget (after the widget's capabilities are negotiated).
|
|
|
|
The request itself looks as follows:
|
|
|
|
```json5
|
|
{
|
|
// This is a standardized widget API request.
|
|
"api": "toWidget", // note that we're sending *to* the widget here
|
|
"widgetId": "20200827_WidgetExample",
|
|
"requestid": "generated-id-1234",
|
|
"action": "send_to_device", // value defined by this proposal
|
|
"data": {
|
|
"type": "m.call.invite",
|
|
"sender": "@source:example.org",
|
|
"encrypted": true,
|
|
"content": {
|
|
// ... as required for the event schema
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
Note that the action only supports a single to-device message at a time. This is for symmetry with
|
|
[MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762).
|
|
|
|
Under the widget API, a response is required from the widget. The widget simply acknowledges the request
|
|
with an empty response object:
|
|
|
|
```json5
|
|
{
|
|
"api": "toWidget",
|
|
"widgetId": "20200827_WidgetExample",
|
|
"requestid": "generated-id-1234",
|
|
"action": "send_to_device",
|
|
"data": {
|
|
"type": "m.call.invite",
|
|
"sender": "@source:example.org",
|
|
"content": {
|
|
// ... as required for the event schema
|
|
}
|
|
},
|
|
"response": {}
|
|
}
|
|
```
|
|
|
|
## Potential issues
|
|
|
|
Due to lack of documentation/spec, conventions for the widget API and its security principles could
|
|
be misunderstood or confusing. This MSC attempts to overly describe these cases where they are at
|
|
risk of being a potential misunderstanding, however readers of the proposal are still encouraged to
|
|
gather as much information as they can before reviewing this proposal.
|
|
|
|
This MSC further pushes forward an idea that the `postMessage` transport for the widget API is the
|
|
way to go, however MSCs like [MSC3009](https://github.com/matrix-org/matrix-spec-proposals/pull/3009)
|
|
explore what it could mean to have a different transport mechanism. This MSC is not tied directly
|
|
to `postMessage` and is instead describing the request/responses used over the widget API - whatever
|
|
transport that might be.
|
|
|
|
## Alternatives
|
|
|
|
As discussed in the introduction of this proposal and on other MSCs, we could expose the client-server
|
|
API more generically to the widget. This causes issues where the client is either forced to parse
|
|
requests like a webserver would to validate that the widget is allowed to make the request, or
|
|
require such a generic capability that widgets would excessively request full read/write access
|
|
from the user without consideration for the impact that might have. As such, we continue to describe
|
|
special-cased actions for the widget API on a case-by-case basis.
|
|
|
|
On other related proposals there's discussion about how a bot could achieve the same function as
|
|
the proposal. While also partially true here, the intent is not to have a game or similar publishing
|
|
events into a room but rather to have a second Matrix client (for all intents and purposes) embedded
|
|
either as a room widget or account widget. A bot precludes the second client from acting on behalf
|
|
of the user who has it open.
|
|
|
|
## Security considerations
|
|
|
|
Because the widget can implicitly decrypt events, it is absolutely imperative that clients
|
|
prompt for permission to use these capabilities even though the capabilities negotiation does not
|
|
require this to be done. Strictly speaking, clients which do not prompt for confirmation from the
|
|
user are frowned upon, however given the intended usecase of VoIP signaling it is reasonable to
|
|
auto-approve some capabilities if the client can verifiably trust the widget is running safe code.
|
|
In general, verifiable trust only comes from the client locking widgets down to specific domains
|
|
or rewriting the widget URL before rendering to something the client controls.
|
|
|
|
This MSC allows widgets to access sensitive parts of the client-server API, and the encryption
|
|
module specifically. If granted permission, a widget could feasibly harvest decryption keys *in clear
|
|
text*. It is strongly encouraged that clients do not auto-approve capabilities for key exchanges
|
|
or similar. In fact, it might even be reasonable for the client to auto-deny instead.
|
|
|
|
This MSC allows a room widget to act at the account level rather than the traditional room level.
|
|
Normally these events would be scoped to the currently active room, however to-device messages are
|
|
not tied to a room. Therefore, the events are exposed as-is to the widget and can be interacted with
|
|
as such.
|
|
|
|
## Unstable prefix
|
|
|
|
While this MSC is not present in the spec, clients and widgets should:
|
|
|
|
* Use `org.matrix.msc3819.` in place of `m.` in all new identifiers of this MSC.
|
|
* Only call/support the `action`s if a widget API version of `org.matrix.msc3819` is advertised.
|
|
|
|
## Dependencies
|
|
|
|
None applicable - this MSC's dependencies have either been approved or are used simply as reference
|
|
material. In practice, widgets should probably be formally in the spec before this MSC gets included.
|