831 lines
38 KiB
Markdown
831 lines
38 KiB
Markdown
# MSC4140: Cancellable delayed events
|
|
|
|
This MSC proposes a mechanism by which a Matrix client can schedule an event (including a state event) to be sent into
|
|
a room at a later time.
|
|
|
|
The client does not have to be running or in contact with the Homeserver at the time that the event is actually sent.
|
|
|
|
Once the event has been scheduled, the user's homeserver is responsible for actually sending the event at the appropriate
|
|
time and then distributing it as normal via federation.
|
|
|
|
<!-- TOC -->
|
|
- [Background and motivation](#background-and-motivation)
|
|
- [Proposal](#proposal)
|
|
- [Scheduling a delayed event](#scheduling-a-delayed-event)
|
|
- [Managing delayed events](#managing-delayed-events)
|
|
- [Getting delayed events](#getting-delayed-events)
|
|
- [On demand](#on-demand)
|
|
- [On push](#on-push)
|
|
- [Homeserver implementation details](#homeserver-implementation-details)
|
|
- [Power levels are evaluated at the point of sending](#power-levels-are-evaluated-at-the-point-of-sending)
|
|
- [Delayed state events are cancelled by a more recent state event](#delayed-state-events-are-cancelled-by-a-more-recent-state-event)
|
|
- [Rate-limiting at the point of sending](#rate-limiting-at-the-point-of-sending)
|
|
- [Use case specific considerations](#use-case-specific-considerations)
|
|
- [MatrixRTC](#matrixrtc)
|
|
- [Background](#background)
|
|
- [How this MSC would be used for MatrixRTC](#how-this-msc-would-be-used-for-matrixrtc)
|
|
- [Self-destructing messages](#self-destructing-messages)
|
|
- [Potential issues](#potential-issues)
|
|
- [Compatibility with Cryptographic Identities](#compatibility-with-cryptographic-identities)
|
|
- [Alternatives](#alternatives)
|
|
- [Delegating delayed events](#delegating-delayed-events)
|
|
- [Batch sending](#batch-sending)
|
|
- [Not reusing the `send`/`state` endpoint](#not-reusing-the-sendstate-endpoint)
|
|
- [Batch delayed events with custom endpoint](#batch-delayed-events-with-custom-endpoint)
|
|
- [Batch Response](#batch-response)
|
|
- [EventId template variable](#eventid-template-variable)
|
|
- [Allocating the event ID at the point of scheduling the send](#allocating-the-event-id-at-the-point-of-scheduling-the-send)
|
|
- [MSC4018 (use client sync loop)](#msc4018-use-client-sync-loop)
|
|
- [Federated delayed events](#federated-delayed-events)
|
|
- [MQTT style Last Will](#mqtt-style-last-will)
|
|
- [`M_INVALID_PARAM` instead of `M_MAX_DELAY_EXCEEDED`](#m_invalid_param-instead-of-m_max_delay_exceeded)
|
|
- [Naming](#naming)
|
|
- [Don't provide a `send` action](#dont-provide-a-send-action)
|
|
- [Use `DELETE` HTTP method for `cancel` action](#use-delete-http-method-for-cancel-action)
|
|
- [[Ab]use typing notifications](#abuse-typing-notifications)
|
|
- [Security considerations](#security-considerations)
|
|
- [Unstable prefix](#unstable-prefix)
|
|
- [Dependencies](#dependencies)
|
|
<!-- /TOC -->
|
|
|
|
## Background and motivation
|
|
|
|
This proposal originates from the needs of VoIP signalling in Matrix:
|
|
|
|
The Client-Server API currently has a [Voice over IP module](https://spec.matrix.org/v1.11/client-server-api/#voice-over-ip)
|
|
that uses room messages to communicate the call state. However, it only allows for calls with two participants.
|
|
|
|
[MSC3401: Native Group VoIP Signalling](https://github.com/matrix-org/matrix-spec-proposals/pull/3401) proposes a scheme
|
|
that allows for more than two participants by using room state events.
|
|
|
|
In this arrangement each device signals its participant in a call by sending a state event that represents the device's
|
|
"membership" of a call. Once the device is no longer in the call, it sends a new state event to update the call state and
|
|
communicate that the device is no longer a member.
|
|
|
|
This works well when the client is running and can send the state events as needed. However, if the client is not able to
|
|
communicate with the homeserver (e.g. the user closes the app or loses connection) the call state is not updated to say
|
|
that the participant has left.
|
|
|
|
The motivation for this MSC is to allow updating call member state events after the user disconnected by allowing to
|
|
schedule/delay/timeout/expire events in a generic way.
|
|
|
|
The ["reliability requirements for the room state"](https://github.com/matrix-org/matrix-spec-proposals/blob/toger5/matrixRTC/proposals/4143-matrix-rtc.md#reliability-requirements-for-the-room-state)
|
|
section of [MSC4143: MatrixRTC](https://github.com/matrix-org/matrix-spec-proposals/pull/4143) has more details on the
|
|
use case.
|
|
|
|
There are numerous possible solution to solve the call member event expiration. They are covered in detail
|
|
in the [Use case specific considerations/MatrixRTC](#use-case-specific-considerations) section, because they are not part
|
|
of this proposal.
|
|
|
|
This proposal enables a Matrix client to schedule a "hangup" state event to be sent after a specified time period.
|
|
The client can then periodically restart the timer whilst it is running. If the client is no longer running
|
|
or able to communicate, then the timer would expire and the homeserver would send the "hangup" event on behalf of the client.
|
|
|
|
Such an arrangement can also be described as a "heartbeat" mechanism. The client sends a "heartbeat" to the homeserver
|
|
in the form of a "restart" of the delayed event to keep the call "alive".
|
|
The homeserver will automatically send the "hangup" if it does not receive a "heartbeat".
|
|
|
|
## Proposal
|
|
|
|
The following operations are added to the client-server API:
|
|
|
|
- Schedule an event to be sent at a later time
|
|
- Get a list of delayed events
|
|
- Restart the timer of a delayed event
|
|
- Send the delayed event immediately
|
|
- Cancel a delayed event so that it is never sent
|
|
|
|
At the point of an event being scheduled the homeserver is [unable to allocate the event ID](#allocating-the-event-id-at-the-point-of-scheduling-the-send).
|
|
Instead, the homeserver allocates a `delay_id` to the scheduled event which is used during the above API operations.
|
|
|
|
### Scheduling a delayed event
|
|
|
|
An optional `delay` query parameter is added to the existing
|
|
[`PUT /_matrix/client/v3/rooms/{roomId}/state/{eventType}/{stateKey}`](https://spec.matrix.org/v1.11/client-server-api/#put_matrixclientv3roomsroomidsendeventtypetxnid)
|
|
and
|
|
[`PUT /_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}`](https://spec.matrix.org/v1.11/client-server-api/#put_matrixclientv3roomsroomidstateeventtypestatekey)
|
|
endpoints.
|
|
|
|
The new query parameter is used to configure the event scheduling:
|
|
|
|
- `delay` - Optional number of milliseconds the homeserver should wait before sending the event. If no `delay` is provided,
|
|
the event is sent immediately as normal.
|
|
|
|
The body of the request is the same as it is currently.
|
|
|
|
If a `delay` is provided, the homeserver schedules the event to be sent with the specified delay and responds with a
|
|
`delay_id` field (omitting the `event_id` as it is not available):
|
|
|
|
```http
|
|
200 OK
|
|
Content-Type: application/json
|
|
|
|
{
|
|
"delay_id": "1234567890"
|
|
}
|
|
```
|
|
|
|
The homeserver can optionally enforce a maximum delay duration. If the requested delay exceeds the maximum, the homeserver
|
|
can respond with a [`400`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/400) status code
|
|
and a body with a Matrix error code `M_MAX_DELAY_EXCEEDED` and the maximum allowed delay (`max_delay` in milliseconds).
|
|
|
|
For example, the following specifies a maximum delay of 24 hours:
|
|
|
|
```http
|
|
400 Bad Request
|
|
Content-Type: application/json
|
|
|
|
{
|
|
"errcode": "M_MAX_DELAY_EXCEEDED",
|
|
"error": "The requested delay exceeds the allowed maximum.",
|
|
"max_delay": 86400000
|
|
}
|
|
```
|
|
|
|
The homeserver **should** apply rate limiting to the scheduling of delayed events to provide mitigation against the
|
|
[High Volume of Messages](https://spec.matrix.org/v1.11/appendices/#threat-high-volume-of-messages) threat.
|
|
|
|
The homeserver **may** apply a limit on the maximum number of outstanding delayed events in which case the Matrix error code
|
|
`M_MAX_DELAYED_EVENTS_EXCEEDED` can be returned:
|
|
|
|
```http
|
|
400 Bad Request
|
|
Content-Type: application/json
|
|
|
|
{
|
|
"errcode": "M_MAX_DELAYED_EVENTS_EXCEEDED",
|
|
"error": "The maximum number of delayed events has been reached.",
|
|
}
|
|
```
|
|
|
|
### Managing delayed events
|
|
|
|
A new authenticated client-server API endpoint at `POST /_matrix/client/v1/delayed_events/{delay_id}` allows scheduled events
|
|
to be managed.
|
|
|
|
The body of the request is a JSON object containing the following fields:
|
|
|
|
- `action` - The action to take on the delayed event.\
|
|
Must be one of:
|
|
- `send` - Send the delayed event immediately.
|
|
- `cancel` - Cancel the delayed event so that it is never sent.
|
|
- `restart` - Restart the timeout of the delayed event.
|
|
|
|
For example, the following would send the delayed event with delay ID `1234567890` immediately:
|
|
|
|
```http
|
|
POST /_matrix/client/v1/delayed_events/1234567890
|
|
Content-Type: application/json
|
|
|
|
{
|
|
"action": "send"
|
|
}
|
|
```
|
|
|
|
Where the `action` is `send`, the homeserver **should** apply rate limiting to provide mitigation against the
|
|
[High Volume of Messages](https://spec.matrix.org/v1.11/appendices/#threat-high-volume-of-messages) threat.
|
|
|
|
### Getting delayed events
|
|
|
|
#### On demand
|
|
|
|
New authenticated client-server API endpoints `GET /_matrix/client/v1/delayed_events/scheduled` and
|
|
`GET /_matrix/client/v1/delayed_events/finalised` allows clients to get a list of
|
|
all the delayed events owned by the requesting user that have been scheduled to send, have been sent, or failed to be sent.
|
|
|
|
The endpoints accepts a query parameter `from` which is a token that can be used to paginate the list of delayed events as
|
|
per the [pagination convention](https://spec.matrix.org/v1.11/appendices/#pagination). The homeserver can choose a suitable
|
|
page size.
|
|
|
|
The response is a JSON object containing the following fields:
|
|
|
|
- For the `GET /_matrix/client/v1/delayed_events/scheduled` endpoint:
|
|
- `delayed_events` - Required. An array of delayed events that have been scheduled to be sent,
|
|
sorted by `running_since + delay` in increasing order (event that will timeout soonest first).
|
|
- `delay_id` - Required. The ID of the delayed event.
|
|
- `room_id` - Required. The room ID of the delayed event.
|
|
- `type` - Required. The event type of the delayed event.
|
|
- `state_key` - Optional. The state key of the delayed event if it is a state event.
|
|
- `delay` - Required. The delay in milliseconds before the event is to be sent.
|
|
- `running_since` - Required. The timestamp (as Unix time in milliseconds) when the delayed event was scheduled or
|
|
last restarted.
|
|
- `content` - Required. The content of the delayed event. This is the body of the original `PUT` request, not a preview
|
|
of the full event after sending.
|
|
- `next_batch` - Optional. A token that can be used to paginate the list of delayed events.
|
|
|
|
- For the `GET /_matrix/client/v1/delayed_events/finalised` endpoint:
|
|
- `finalised_events` - Required. An array of finalised delayed events, that have either been sent or resulted in an error,
|
|
sorted by `origin_server_ts` in decreasing order (latest finalised event first).
|
|
- `delayed_event` - Required. Describes the original delayed event in the same format as the `delayed_events` array.
|
|
- `outcome`: `"send"|"cancel"`
|
|
- `reason`: `"error"|"action"|"delay"`
|
|
- `error`: Optional Error. A matrix error (as defined by [Standard error response](https://spec.matrix.org/v1.11/client-server-api/#standard-error-response))
|
|
to explain why this event failed to be sent. The Error can either be the `M_CANCELLED_BY_STATE_UPDATE` or any of the
|
|
Errors from the client server send and state endpoints.
|
|
- `event_id` - Optional EventId. The `event_id` this event got in case it was sent.
|
|
- `origin_server_ts` - Optional Timestamp. The timestamp the event was sent.
|
|
- `next_batch` - Optional. A token that can be used to paginate the list of finalised events.
|
|
|
|
The batch size and the amount of terminated events that stay on the homeserver can be chosen, by the homeserver.
|
|
The recommended values are:
|
|
|
|
- `finalised_events` retention: 7 days
|
|
- `finalised_events` batch size: 10
|
|
- `finalised_events` max cached events: 1000
|
|
|
|
There is no guarantee for a client that all events will be available in the
|
|
finalised events list if they exceed the limits of their homeserver.
|
|
Additionally, a homeserver may discard finalised delayed events that have been returned by a
|
|
`GET /_matrix/client/v1/delayed_events/finalised` response.
|
|
|
|
An example for a response to the `GET /_matrix/client/v1/delayed_events/scheduled` endpoint:
|
|
|
|
```http
|
|
200 OK
|
|
Content-Type: application/json
|
|
|
|
{
|
|
"delayed_events": [
|
|
{
|
|
"delay_id": "1234567890",
|
|
"room_id": "!roomid:example.com",
|
|
"type": "m.room.message",
|
|
"delay": 15000,
|
|
"running_since": 1721732853284,
|
|
"content":{
|
|
"msgtype": "m.text",
|
|
"body": "I am now offline"
|
|
}
|
|
},
|
|
{
|
|
"delay_id": "abcdefgh",
|
|
"room_id": "!roomid:example.com",
|
|
"type": "m.call.member",
|
|
"state_key": "@user:example.com_DEVICEID",
|
|
"delay": 5000,
|
|
"running_since": 1721732853284,
|
|
"content":{
|
|
"memberships": []
|
|
}
|
|
}
|
|
],
|
|
"next_batch": "b12345"
|
|
}
|
|
```
|
|
|
|
Unless the delayed event is updated beforehand, the event will be sent after `running_since` + `delay`.
|
|
|
|
This can be used by clients to display events that have been scheduled to be sent in the future.
|
|
|
|
For use cases where the existence of a delayed event is also of interest for other room members
|
|
(e.g. self-destructing messages), it is recommended to include this information in the original/affected event itself.
|
|
|
|
#### On push
|
|
|
|
A new optional key, `finalised_events`, is added to the response body of `/sync`. The shape of its
|
|
value is equivalent to that of the response body of `GET /_matrix/client/v1/delayed_events/finalised`.
|
|
It is an array of the syncing user's delayed events that were sent or failed to be sent after the
|
|
`since` timestamp parameter of the associated `/sync` request, or all of them for full `/sync`s.
|
|
When no such delayed events exist, the `finalised_events` key is absent from the `/sync` response.
|
|
|
|
A new key, `finalised_events`, is defined for `POST /_matrix/client/v3/user/{userId}/filter`.
|
|
Its value is a boolean which, if set to `false`, causes an associated `/sync` response to exclude
|
|
any `finalised_events` key it may have otherwise included.
|
|
|
|
The only delayed events included in `finalised_events` are ones that have been retained by the homeserver,
|
|
as per the same retention policies as for the `GET /_matrix/client/v1/delayed_events/finalised` endpoint.
|
|
Additionally, a homeserver may discard finalised delayed events that have been returned by a `/sync` response.
|
|
|
|
The `finalised_events` key is added to the request bodies of the appservice API `/transactions` endpoint.
|
|
It has the same content as the key for `/sync`, and contains all of the target appservice's delayed events
|
|
that were sent or failed to be sent since the previous transaction.
|
|
|
|
### Homeserver implementation details
|
|
|
|
#### Power levels are evaluated at the point of sending
|
|
|
|
Power levels are evaluated for each event only once the delay has occurred and it will be distributed/inserted into the
|
|
DAG. This implies a delayed event can fail if it violates power levels at the time the delay passes.
|
|
|
|
Conversely, it's also possible to successfully schedule an event that the user has no permission to send at the time of sending.
|
|
If the power level situation has changed at the time the delay passes, the event can even reach the DAG.
|
|
|
|
#### Delayed state events are cancelled by a more recent state event
|
|
|
|
> [!NOTE]
|
|
> Special rule for delayed state events:
|
|
> A delayed event `D` gets cancelled if:
|
|
>
|
|
> - `D` is a state event with key `k` and type `t` from sender `s`.
|
|
> - A new state event `N` with type `t` and key `k` is sent into the room.
|
|
> - The sender of `D` is different to the sender `N`.
|
|
|
|
If a new state event is sent to the same room at the same entry (`event_type`, `state_key` pair) as a delayed event by a
|
|
**different matrix user**, any delayed event for this entry (`event_type`, `state_key` pair) is cancelled.
|
|
|
|
This only happens if its a state update from a different user. If it is from the same user, the delayed event will not get cancelled.
|
|
If the same user is updating the state which has associated delayed events, this user is in control of those delayed events.
|
|
They can just cancel and check the events manually using the `/delayed_events` and the `/delayed_events/scheduled` endpoint.
|
|
|
|
In the case where the delayed event gets cancelled due to a different user updating the same state, there
|
|
is no race condition here since a possible race between timeout and the _new state event_ will always converge to
|
|
the _new state event_:
|
|
|
|
- timeout for _delayed event_ followed by _new state event_: the room state will be updated twice: once by the content of
|
|
the delayed event but later with the content of _new state event_.
|
|
- _new state event_ followed by timeout for _delayed event_: the _new state event_ will cancel the outstanding _delayed event_.
|
|
|
|
The finalised delayed event as represented by the finalised list of the GET endpoint (See:[Getting delayed events](#getting-delayed-events))
|
|
will be stored with the following outcome:
|
|
|
|
```json
|
|
"outcome": "cancel",
|
|
"reason": "error",
|
|
"error": {
|
|
"errorcode": "M_CANCELLED_BY_STATE_UPDATE",
|
|
"error":"The delayed event did not get send because a different user updated the same state event.
|
|
So the scheduled event might change it in an undesired way."}
|
|
```
|
|
|
|
Note that this behaviour does not apply to regular (non-state) events as there is no concept of a (`event_type`, `state_key`)
|
|
pair that could be overwritten.
|
|
|
|
#### Rate-limiting at the point of sending
|
|
|
|
Further to the rate limiting of the API endpoints, the homeserver **should** apply rate limiting to the sending
|
|
of delayed messages at the point that they are inserted into the DAG.
|
|
|
|
This is to provide mitigation against the
|
|
[High Volume of Messages](https://spec.matrix.org/v1.11/appendices/#threat-high-volume-of-messages) threat where a malicious
|
|
actor could schedule a large volume of events ahead of time without exceeding a rate limit on the initial `PUT` request,
|
|
but has specified a `delay` that corresponds to a common point of time in the future.
|
|
|
|
A limit on the maximum number of delayed events that can be outstanding at one time could also provide some mitigation against
|
|
this attack.
|
|
|
|
## Use case specific considerations
|
|
|
|
Delayed events can be used for many different features: tea timers, reminders, or ephemeral events could be implemented
|
|
using delayed events, where clients send room events with
|
|
intentional mentions or a redaction as a delayed event.
|
|
It can even be used to send temporal power levels/mutes or bans.
|
|
|
|
### MatrixRTC
|
|
|
|
In this section, an overview is given how this MSC is used in [MSC4143: MatrixRTC](https://github.com/matrix-org/matrix-spec-proposals/pull/4143)
|
|
and alternative expiration systems are evaluated.
|
|
|
|
#### Background
|
|
|
|
MatrixRTC makes it necessary to have real time information about the current MatrixRTC session.
|
|
To properly display room tiles and header in the room list (or compute a list of ongoing calls), it's required to know:
|
|
|
|
- If there is a running session.
|
|
- What type that session has.
|
|
- Who and how many people are currently participating.
|
|
|
|
A particular delicate situation is that clients are not able to inform others if they lose connection.
|
|
There are numerous approaches to solve such a situation. They split into two categories:
|
|
|
|
- Polling based
|
|
- Ask the users if they are still connected.
|
|
- Ask an RTC backend (SFU) who is connected.
|
|
- Timeout based
|
|
- Update the room state every x seconds.
|
|
This allows clients to check how long an event has not been updated and ignore it if it's expired.
|
|
- Use delayed events with a 10s timeout to send the disconnected from call
|
|
in less then 10s after the user is not anymore pinging the `/delayed_events` endpoint
|
|
(or delegate the disconnect action to a service attached to the SFU).
|
|
- Use the client sync loop as a special case timeout for call member events
|
|
(see [Alternatives/MSC4018 (use client sync loop))](#msc4018-use-client-sync-loop)).
|
|
|
|
Polling based solutions have a large overhead in complexity and network requests on the clients.
|
|
For example:
|
|
|
|
> A room list with 100 rooms where there has been a call before in every room
|
|
> (or there is an ongoing call) would require the client to send a to-device message
|
|
> (or a request to the SFU) to every user that has an active state event to check if
|
|
> they are still online. All this is just to display the room tile properly.
|
|
|
|
For displaying the room list, timeout based approaches are much more reasonable because they allow computing MatrixRTC
|
|
metadata for a room to be synchronous.
|
|
|
|
The current solution updates the room state every X minutes.
|
|
This is not elegant since room state gets repeatedly sent with the same content.
|
|
In large calls, this could result in high traffic and increase the size of the room DAG.
|
|
|
|
A call with 100 call members implies 100 state events every X minutes. X cannot be a
|
|
long duration because
|
|
it is the duration after which the event can be considered expired. Improper
|
|
disconnects would result in the user being displayed as "still in the call" for
|
|
X minutes (which should be as short as possible).
|
|
|
|
Additionally, this approach requires perfect server client time synchronization to compute the expiration.
|
|
This is currently not possible over federation since `unsigned.age` is not available over federation.
|
|
|
|
#### How this MSC would be used for MatrixRTC
|
|
|
|
With this proposal, the client can use delayed events to implement a "heartbeat" mechanism.
|
|
|
|
On joining the call, the client sends a "join" state event as normal to indicate that it is participating:
|
|
|
|
```http
|
|
PUT /_matrix/client/v1/rooms/!wherever:example.com/state/m.call.member/@someone:example.com
|
|
Content-Type: application/json
|
|
|
|
{
|
|
"memberships": [
|
|
{
|
|
...membership data here...
|
|
}
|
|
]
|
|
}
|
|
```
|
|
|
|
Before sending the join event, it also schedules a delayed "hangup" state event with `delay` of around 5-20 seconds that
|
|
marks the end of its participation:
|
|
|
|
```http
|
|
PUT /_matrix/client/v1/rooms/!wherever:example.com/state/m.call.member/@someone:example.com?delay=10000
|
|
Content-Type: application/json
|
|
|
|
{
|
|
"memberships": []
|
|
}
|
|
```
|
|
|
|
Let's say the homeserver returns a `delay_id` of `1234567890`.
|
|
|
|
The client then periodically sends a "heartbeat" in the form of a "restart" of the delayed "hangup" state event to keep
|
|
the call membership "alive".
|
|
|
|
For example it could make the request every 5 seconds (or some other period less than the `delay`):
|
|
|
|
```http
|
|
POST /_matrix/client/v1/delayed_events/1234567890
|
|
Content-Type: application/json
|
|
|
|
{
|
|
"action": "restart"
|
|
}
|
|
```
|
|
|
|
This would have the effect that if the homeserver does not receive a "heartbeat" from the client for 10 seconds, then
|
|
it will automatically send the "hangup" state event for the client.
|
|
|
|
Since the delayed event is sent first, a client can guarantee (at the time they are sending
|
|
the join event) that it will eventually leave.
|
|
|
|
### Self-destructing messages
|
|
|
|
This MSC also allows an implementation of "self-destructing" messages using redaction:
|
|
|
|
First send (or generate the PDU when
|
|
[MSC4080: Cryptographic Identities](https://github.com/matrix-org/matrix-spec-proposals/pull/4080)
|
|
is available):
|
|
`PUT /_matrix/client/v1/rooms/{roomId}/send/m.room.message/{txnId}`
|
|
|
|
```jsonc
|
|
{
|
|
"msgtype": "m.text",
|
|
"body": "this message will self-redact in 10 minutes"
|
|
}
|
|
```
|
|
|
|
then send:
|
|
`PUT /_matrix/client/v1/rooms/{roomId}/send/m.room.redaction/{txnId}?delay=600000`
|
|
|
|
```jsonc
|
|
{
|
|
"redacts": "{event_id}"
|
|
}
|
|
```
|
|
|
|
This would redact the message with content: `"m.text": "my msg"` after 10 minutes.
|
|
|
|
## Potential issues
|
|
|
|
### Compatibility with Cryptographic Identities
|
|
|
|
Ideally, this proposal should be compatible with other proposals such as
|
|
[MSC4080: Cryptographic Identities](https://github.com/matrix-org/matrix-spec-proposals/pull/4080) which introduce mechanisms
|
|
to allow the recipient of an event to determine whether it was sent by a client as opposed to have been spoofed/injected
|
|
by a malicious homeserver.
|
|
|
|
In the context of this proposal, the delayed events should be signed with the same cryptographic identity as the client
|
|
that scheduled them.
|
|
|
|
This means that the content of the original scheduled event must be sent "as is" without modification by the homeserver.
|
|
The consequence is an implementation detail that client developers must be aware of: if the content of the delayed
|
|
event contains a timestamp, then it would be the timestamp of when the event was originally scheduled rather than
|
|
anything later.
|
|
|
|
However, the `origin_server_ts` of the delayed event should be the time that the event is actually sent by the homeserver.
|
|
|
|
This is a general problem that arises with the introduction
|
|
of [Cryptographic Identities](https://github.com/matrix-org/matrix-spec-proposals/pull/4080).
|
|
A user can intentionally, or caused by network conditions, delay the signing and sending of an event.
|
|
A possible solution would be the introduction of a `signing_ts` (in the signed section) and keep the `origin_server_ts`
|
|
in the unsigned section.
|
|
Both are reasonable data points that clients might want to use.
|
|
This would solve issues related to delayed events since
|
|
it would make it transparent to clients, when an event was scheduled and when it was distributed over federation.
|
|
|
|
## Alternatives
|
|
|
|
### Delegating delayed events
|
|
|
|
It is useful for external services to also interact with delayed events. If a client disconnects, an external service can
|
|
be the best source to send the delayed event/"last will".
|
|
|
|
This is not covered in this MSC but could be realized with scoped access tokens.
|
|
A scoped token that only allows to interact with the `delayed_events` endpoint and only with a subset of `delay_id`s
|
|
would be used.
|
|
|
|
With this, an SFU that tracks the current client connection state could be given the power to control the delayed event.
|
|
The client would share the scoped token and the required details, so that the SFU can call the
|
|
`refresh` endpoint while a user is connected
|
|
and can call the delayed event `send` request once the user disconnects
|
|
(using a `{"action": "restart"}` and a `{"action": "send"}` `/delayed_events` request.).
|
|
This way, the SFU can be used as the source of truth for the call member room state event without knowing anything about
|
|
the Matrix call.
|
|
|
|
Since the SFU has a much lower chance of running into a network issue,
|
|
`{"action": "restart"}` calls may be sent much more infrequently.
|
|
Instead of calling the `/delayed_events` endpoint every couple of seconds, a delayed event's
|
|
timeout can be set to be long (e.g. 6 hours), as the SFU can be expected to not forget sending the `{"action": "send"}` action
|
|
when it detects a disconnecting client.
|
|
|
|
### Batch sending
|
|
|
|
In some scenarios it is important to allow to send an event with an associated
|
|
delay at the same time.
|
|
|
|
- One example would be redacting an event. It only makes sense to redact the event if it exists.
|
|
It might be important to have the guarantee that the delayed redact is received
|
|
by the server at the time where the original message is sent.
|
|
- In the case of a state event, a user might want to set the state to `A` and after a
|
|
timeout change it back to `{}`. By using two separate requests, sending `A` could work,
|
|
but the event with content `{}` could fail. The state would not automatically
|
|
reset to `{}`.
|
|
|
|
For this use case, batch sending of multiple delayed events would be desired.
|
|
|
|
Batch sending is not included in the proposal of this MSC however since batch sending should
|
|
become a generic Matrix concept as proposed with `/send_pdus`. (see: [MSC4080: Cryptographic Identities](https://github.com/matrix-org/matrix-spec-proposals/pull/4080))
|
|
|
|
[MSC2716: Incrementally importing history into existing rooms](https://github.com/matrix-org/matrix-spec-proposals/pull/2716)
|
|
already proposes a `batch_send` endpoint. However, it is limited to application services and focuses on historic
|
|
data. Since the additional capability to use a template `event_id` parameter is also needed, this probably is not a good fit.
|
|
|
|
### Not reusing the `send`/`state` endpoint
|
|
|
|
Alternatively, new endpoints could be introduced to not overload the `send` and `state` endpoint.
|
|
Those endpoints could be called:
|
|
|
|
`PUT /_matrix/client/v1/rooms/{roomId}/send_delayed_event/{eventType}/{txnId}?delay={delay_ms}`
|
|
|
|
`PUT /_matrix/client/v1/rooms/{roomId}/state_delayed_event/{eventType}/{stateKey}?delay={delay_ms}`
|
|
|
|
This would allow the response for the `send` and `state` endpoints to remain as they are currently,
|
|
and to have a different return type for the new `send_delayed_event` and `state_delayed_event` endpoints.
|
|
|
|
### Allocating the event ID at the point of scheduling the send
|
|
|
|
This was considered, but when sending a delayed event the `event_id` is not yet available:
|
|
|
|
The Matrix spec says that the `event_id` must use the [reference hash](https://spec.matrix.org/v1.10/rooms/v11/#event-ids)
|
|
which is [calculated from the fields](https://spec.matrix.org/v1.10/server-server-api/#calculating-the-reference-hash-for-an-event)
|
|
of an event including the `origin_server_timestamp` as defined in [this list](https://spec.matrix.org/v1.10/rooms/v11/#client-considerations)
|
|
|
|
Since the `origin_server_timestamp` should be the timestamp the event has when entering the DAG (required for call
|
|
duration computation), the `event_id` cannot be computed when using the `send` endpoint before the delayed event has resolved.
|
|
|
|
### MSC4018 (use client sync loop)
|
|
|
|
[MSC4018: Reliable call membership](https://github.com/matrix-org/matrix-spec-proposals/pull/4018) also
|
|
proposes a way to make call memberships reliable. It uses the client sync loop as
|
|
an indicator to determine if the event is expired, instead of letting the SFU
|
|
inform about the call termination or using the call app ping/refresh loop as proposed earlier in this MSC.
|
|
|
|
The advantage is that this does not require introducing a new ping system
|
|
(as is proposed here by using the `delayed_events` restart action).
|
|
Though with cryptographic identities, the client needs to create the leave event.
|
|
|
|
The timeout for syncs are much slower than what would be desirable (30s vs 5s).
|
|
|
|
With a widget implementation for calls, it cannot be guaranteed that the widget is running during the sync loop.
|
|
So one either has to move the hangup logic to the hosting client or let the widget run all the time.
|
|
|
|
A dedicated ping (independent to the sync loop) is more flexible and allows for the widget to
|
|
execute the timer restart.
|
|
If the widget dies, the call membership will disconnect.
|
|
|
|
Additionally, the specification should not include specific
|
|
custom server rules if possible.
|
|
Sending an event on behalf of a user based on the client sync loop if there is an event with a specific type and specific
|
|
content is quite a server-specific behaviour, and also would not work well with encrypted state events and cryptographic
|
|
identities.
|
|
This proposal is a general behaviour valid for all event types.
|
|
|
|
### Federated delayed events
|
|
|
|
Delayed events could be sent over federation immediately and then have the receiving servers process (sent down to clients)
|
|
them at the appropriate time.
|
|
|
|
Downsides of this approach that have been considered are that:
|
|
|
|
- individual "heartbeats"/restarts would need to distributed via federation, meaning more traffic and processing
|
|
to be done.
|
|
- if any homeservers missed the federated "heartbeat"/restart message, then they might decide that the event is visible
|
|
to clients whereas
|
|
other homeservers might have received it and come to a different conclusion. If the event was later cancelled then
|
|
resolving the inconsistency feels more complex than if the event was never sent in the first place.
|
|
|
|
[MSC3277: Scheduled messages](https://github.com/matrix-org/matrix-spec-proposals/pull/3277) proposes a similar feature
|
|
and there is an extensive analysis of the pros and cons of this MSC vs MSC3277
|
|
[here](https://github.com/matrix-org/matrix-spec-proposals/pull/4140#discussion_r1653083566).
|
|
|
|
If it's not needed to allow modification of a delayed event after it has been scheduled, there is a benefit in
|
|
federating the scheduled event (adding it to the DAG immediately). It increases resilience: the sender's homeserver can
|
|
disconnect and the delayed message still will enter non-soft-failed state (will be sent).
|
|
|
|
However, for the MatrixRTC use case it's required to be able to modify the event after it has been scheduled. As such,
|
|
this approach has been discounted.
|
|
|
|
### MQTT style Last Will
|
|
|
|
[MQTT](https://mqtt.org/) has the concept of a Will Message that is published by the server when a client disconnects.
|
|
|
|
The client can set a Will Message when it connects to the server. If the client disconnects unexpectedly, the server will
|
|
publish the Will Message if the client is not back online within a specified time.
|
|
|
|
A similar concept could be applied to Matrix by having the client specify a set of "Last Will" events and have the
|
|
homeserver trigger them if the client (possibly identified by device ID) does not send an API request within a specified
|
|
time.
|
|
|
|
The main differentiator is that this type of approach might use the sync loop as the "heartbeat" equivalent similar to
|
|
[MSC4018](https://github.com/matrix-org/matrix-spec-proposals/pull/4018).
|
|
|
|
A benefit compared to this proposal is that theoretically there would be no additional network traffic overhead.
|
|
|
|
Some complications are:
|
|
|
|
- in order to avoid additional network traffic, the homeserver would need to proactively realise that a connection
|
|
has dropped. Depending on the network/load balancer stack this might be problematic.
|
|
- as an alternative, the client could reduce the long poll timeout (from a typical 30s down to, say, 5s) which would
|
|
result in a traffic increase.
|
|
- As syncing is a per-client concept, the MatrixRTC app has to either run in the same process as the client so that a
|
|
MatrixRTC app failure triggers the client Last Will or the client has to observe the MatrixRTC app and simulate the Last
|
|
Will if the MatrixRTC app fails.
|
|
|
|
### `M_INVALID_PARAM` instead of `M_MAX_DELAY_EXCEEDED`
|
|
|
|
The existing `M_INVALID_PARAM` error code could be used instead of introducing a new error code `M_MAX_DELAY_EXCEEDED`.
|
|
|
|
### Naming
|
|
|
|
The following alternative names for this concept are considered:
|
|
|
|
- Future
|
|
- DelayedEvents
|
|
- PostponedEvents
|
|
- LastWill
|
|
|
|
### Don't provide a `send` action
|
|
|
|
Instead of providing a `send` action for delayed events, the client could cancel the outstanding delayed event and send
|
|
a new non-delayed event instead.
|
|
|
|
This would simplify the API, but it's less efficient since the client would have to send two requests instead of one.
|
|
|
|
### Use `DELETE` HTTP method for `cancel` action
|
|
|
|
Instead of providing a `cancel` action for delayed events, the client could send a `DELETE` request to the same endpoint.
|
|
|
|
This feels more elegant, but it doesn't feel like a good suggestion for how the other actions are mapped.
|
|
|
|
### [Ab]use typing notifications
|
|
|
|
Some exploration of using typing notifications to indicate that a user is still connected to a call was done.
|
|
|
|
The idea of extending [MSC3038: Typed typing notifications](https://github.com/matrix-org/matrix-spec-proposals/pull/3038)
|
|
to allow for additional meta data (like device ID and call ID) was considered.
|
|
|
|
A perceived benefit was that if the delay events were federated, then the typing notification EDUs might provide an
|
|
efficient transport.
|
|
|
|
However, as the conclusion was to [not federate the delayed events](#federated-delayed-events), this approach was
|
|
discounted in favour of a dedicated endpoint.
|
|
|
|
### Alternative to `running_since` field
|
|
|
|
Some alternatives for the `running_since` field on the `GET` response are:
|
|
|
|
- `delaying_from`
|
|
- `delayed_since`
|
|
- `delaying_since`
|
|
- `last_restart` - but this feels less clear than `running_since` for a delayed event that hasn't been restarted
|
|
|
|
## Security considerations
|
|
|
|
All new endpoints are authenticated.
|
|
|
|
Servers **should** impose a maximum timeout value for delay timeouts of not more than a month.
|
|
|
|
As described [above](#power-levels-are-evaluated-at-the-point-of-sending), the homeserver **must** evaluate and enforce the
|
|
power levels at the time of the delayed event being sent (i.e. added to the DAG).
|
|
|
|
This has the risk that this feature could be used by a malicious actor to circumvent existing rate limiting measures which
|
|
corresponds to the [High Volume of Messages](https://spec.matrix.org/v1.11/appendices/#threat-high-volume-of-messages)
|
|
threat. The homeserver **should** apply rate-limiting to both the scheduling of delayed events and the later sending to
|
|
mitigate this risk.
|
|
|
|
## Unstable prefix
|
|
|
|
Whilst the MSC is in the proposal stage, the following should be used:
|
|
|
|
- `org.matrix.msc4140.delay` should be used instead of the `delay` query parameter.
|
|
- `POST /_matrix/client/unstable/org.matrix.msc4140/delayed_events/{delay_id}` should be used instead of
|
|
the `POST /_matrix/client/v1/delayed_events/{delay_id}` endpoint.
|
|
- `GET /_matrix/client/unstable/org.matrix.msc4140/delayed_events` should be used instead of
|
|
the `GET /_matrix/client/v1/delayed_events` endpoint.
|
|
- `org.matrix.msc4140.finalised_events` should be used as keys of `/sync`, `/transactions`, and
|
|
`/filter` instead of `finalised_events`.
|
|
- The `M_UNKNOWN` `errcode` should be used instead of `M_MAX_DELAY_EXCEEDED` as follows:
|
|
|
|
```json
|
|
{
|
|
"errcode": "M_UNKNOWN",
|
|
"error": "The requested delay exceeds the allowed maximum.",
|
|
"org.matrix.msc4140.errcode": "M_MAX_DELAY_EXCEEDED",
|
|
"org.matrix.msc4140.max_delay": 86400000
|
|
}
|
|
```
|
|
|
|
instead of:
|
|
|
|
```json
|
|
{
|
|
"errcode": "M_MAX_DELAY_EXCEEDED",
|
|
"error": "The requested delay exceeds the allowed maximum.",
|
|
"max_delay": 86400000
|
|
}
|
|
```
|
|
|
|
- The `M_UNKNOWN` `errcode` should be used instead of `M_MAX_DELAYED_EVENTS_EXCEEDED` as follows:
|
|
|
|
```json
|
|
{
|
|
"errcode": "M_UNKNOWN",
|
|
"error": "The maximum number of delayed events has been reached.",
|
|
"org.matrix.msc4140.errcode": "M_MAX_DELAYED_EVENTS_EXCEEDED"
|
|
}
|
|
```
|
|
|
|
instead of:
|
|
|
|
```json
|
|
{
|
|
"errcode": "M_MAX_DELAYED_EVENTS_EXCEEDED",
|
|
"error": "The maximum number of delayed events has been reached."
|
|
}
|
|
```
|
|
|
|
- The `M_UNKNOWN` `errcode` should be used instead of `M_CANCELLED_BY_STATE_UPDATE` as follows:
|
|
|
|
```json
|
|
{
|
|
"errcode": "M_UNKNOWN",
|
|
"org.matrix.msc4140.errcode": "M_CANCELLED_BY_STATE_UPDATE",
|
|
"error":"The delayed event did not get send because a different user updated the same state event.
|
|
So the scheduled event might change it in an undesired way."
|
|
}
|
|
```
|
|
|
|
instead of:
|
|
|
|
```json
|
|
{
|
|
"errcode": "M_CANCELLED_BY_STATE_UPDATE",
|
|
"error":"The delayed event did not get send because a different user updated the same state event.
|
|
So the scheduled event might change it in an undesired way."
|
|
}
|
|
```
|
|
|
|
Additionally, the feature is to be advertised as an unstable feature in the `GET /_matrix/client/versions` response, with
|
|
the key `org.matrix.msc4140` set to `true`. So, the response could then look as follows:
|
|
|
|
```json
|
|
{
|
|
"versions": ["..."],
|
|
"unstable_features": {
|
|
"org.matrix.msc4140": true
|
|
}
|
|
}
|
|
```
|
|
|
|
## Dependencies
|
|
|
|
None.
|