MSC4140: Cancellable delayed events

This MSC proposes a mechanism by which a Matrix client can schedule an event (including a state event) to be sent into a room at a later time.

The client does not have to be running or in contact with the homeserver at the time that the event is actually sent.

Once the event has been scheduled, the user's homeserver is responsible for actually sending the event at the appropriate time and then distributing it as normal via federation.

Background and motivation

This proposal originates from the needs of VoIP signalling in Matrix:

The Client-Server API currently has a Voice over IP module that uses room messages to communicate the call state. However, it only allows for calls with two participants.

MSC3401: Native Group VoIP Signalling proposes a scheme that allows for more than two participants by using room state events.

In this arrangement each device signals its participation in a call by sending a state event that represents the device's "membership" of a call. Once the device is no longer in the call, it sends a new state event to update the call state and communicate that the device is no longer a member.

This works well when the client is running and can send the state events as needed. However, if the client is not able to communicate with the homeserver (e.g. the user closes the app or loses connection) the call state is not updated to say that the participant has left.

The motivation for this MSC is to allow call member state events to be updated after the user has disconnected, by providing a generic way to schedule, delay, time out, or expire events.

The "reliability requirements for the room state" section of MSC4143: MatrixRTC has more details on the use case.

There are numerous possible solutions to the call member event expiration problem. They are covered in detail in the Use case specific considerations/MatrixRTC section, because they are not part of this proposal.

This proposal enables a Matrix client to schedule a "hangup" state event to be sent after a specified time period. The client can then periodically restart the timer whilst it is running. If the client is no longer running or able to communicate, then the timer would expire and the homeserver would send the "hangup" event on behalf of the client.

Such an arrangement can also be described as a "heartbeat" mechanism. The client sends a "heartbeat" to the homeserver in the form of a "restart" of the delayed event to keep the call "alive". The homeserver will automatically send the "hangup" if it does not receive a "heartbeat".

Proposal

The following operations are added to the client-server API:

  • Schedule an event to be sent at a later time
  • Get a list of delayed events
  • Restart the timer of a delayed event
  • Send the delayed event immediately
  • Cancel a delayed event so that it is never sent

At the point of an event being scheduled the homeserver is unable to allocate the event ID. Instead, the homeserver allocates a delay_id to the scheduled event which is used during the above API operations.

Scheduling a delayed event

An optional delay query parameter is added to the existing PUT /_matrix/client/v3/rooms/{roomId}/state/{eventType}/{stateKey} and PUT /_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId} endpoints.

The new query parameter is used to configure the event scheduling:

  • delay - Optional number of milliseconds the homeserver should wait before sending the event. If no delay is provided, the event is sent immediately as normal.

The body of the request is the same as it is currently.

If a delay is provided, the homeserver schedules the event to be sent with the specified delay and responds with a delay_id field (omitting the event_id as it is not available):

200 OK
Content-Type: application/json

{
  "delay_id": "1234567890"
}
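As a minimal sketch of how a client might build such a request, the helper below appends the delay query parameter to the existing state endpoint. The function name delayed_state_url and the base URL are hypothetical and only serve as illustration; the same approach would apply to the send endpoint.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def delayed_state_url(base_url: str, room_id: str, event_type: str,
                      state_key: str, delay_ms: Optional[int] = None) -> str:
    """Build the PUT URL for a state event, optionally scheduling it with a delay.

    Hypothetical helper: only the path and the `delay` parameter come from the proposal.
    """
    path = "/_matrix/client/v3/rooms/{}/state/{}/{}".format(
        quote(room_id, safe=""),
        quote(event_type, safe=""),
        quote(state_key, safe=""),
    )
    url = base_url.rstrip("/") + path
    if delay_ms is not None:
        # With a delay, the homeserver responds with a delay_id instead of an event_id.
        url += "?" + urlencode({"delay": delay_ms})
    return url


url = delayed_state_url(
    "https://matrix.example.com",
    "!roomid:example.com",
    "m.call.member",
    "@user:example.com_DEVICEID",
    delay_ms=10000,
)
```

Omitting delay_ms yields the unchanged URL, so the event would be sent immediately as normal.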

The homeserver can optionally enforce a maximum delay duration. If the requested delay exceeds the maximum, the homeserver can respond with a 400 status code and a body with a Matrix error code M_MAX_DELAY_EXCEEDED and the maximum allowed delay (max_delay in milliseconds).

For example, the following specifies a maximum delay of 24 hours:

400 Bad Request
Content-Type: application/json

{
  "errcode": "M_MAX_DELAY_EXCEEDED",
  "error": "The requested delay exceeds the allowed maximum.",
  "max_delay": 86400000
}
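One way a client could react to this error is to retry with the advertised maximum. The sketch below shows that idea only; the function name retry_delay and the retry policy itself are assumptions and not part of this proposal.

```python
from typing import Optional


def retry_delay(requested_ms: int, error_body: dict) -> Optional[int]:
    """Return a delay to retry with, or None if the error is not about the delay.

    Hypothetical client-side policy: clamp the request to the server's max_delay.
    """
    if error_body.get("errcode") != "M_MAX_DELAY_EXCEEDED":
        return None
    max_delay = error_body.get("max_delay")
    if not isinstance(max_delay, int):
        # The server did not advertise a usable maximum, so there is nothing to clamp to.
        return None
    return min(requested_ms, max_delay)


error_body = {
    "errcode": "M_MAX_DELAY_EXCEEDED",
    "error": "The requested delay exceeds the allowed maximum.",
    "max_delay": 86400000,
}
# The client asked for 48 hours; the server allows at most 24 hours.
clamped = retry_delay(172800000, error_body)
```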

The homeserver should apply rate limiting to the scheduling of delayed events to provide mitigation against the High Volume of Messages threat.

The homeserver may apply a limit on the maximum number of outstanding delayed events in which case the Matrix error code M_MAX_DELAYED_EVENTS_EXCEEDED can be returned:

400 Bad Request
Content-Type: application/json

{
  "errcode": "M_MAX_DELAYED_EVENTS_EXCEEDED",
  "error": "The maximum number of delayed events has been reached."
}

Managing delayed events

A new authenticated client-server API endpoint at POST /_matrix/client/v1/delayed_events/{delay_id} allows scheduled events to be managed.

The body of the request is a JSON object containing the following fields:

  • action - The action to take on the delayed event.
    Must be one of:
    • send - Send the delayed event immediately.
    • cancel - Cancel the delayed event so that it is never sent.
    • restart - Restart the timeout of the delayed event.

For example, the following would send the delayed event with delay ID 1234567890 immediately:

POST /_matrix/client/v1/delayed_events/1234567890
Content-Type: application/json

{
  "action": "send"
}

Where the action is send, the homeserver should apply rate limiting to provide mitigation against the High Volume of Messages threat.
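The management request is small enough to sketch in a few lines. The helper below validates the action against the three values listed above and returns the method, path, and JSON body; the name manage_request is hypothetical, and how the request is actually transmitted and authenticated is left out.

```python
import json
from urllib.parse import quote

VALID_ACTIONS = ("send", "cancel", "restart")


def manage_request(delay_id: str, action: str):
    """Return (method, path, body) for a delayed event management request."""
    if action not in VALID_ACTIONS:
        raise ValueError("action must be one of: " + ", ".join(VALID_ACTIONS))
    path = "/_matrix/client/v1/delayed_events/" + quote(delay_id, safe="")
    return "POST", path, json.dumps({"action": action})


method, path, body = manage_request("1234567890", "send")
```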

Getting delayed events

On demand

New authenticated client-server API endpoints GET /_matrix/client/v1/delayed_events/scheduled and GET /_matrix/client/v1/delayed_events/finalised allow clients to get a list of all the delayed events owned by the requesting user that have been scheduled to send, have been sent, or failed to be sent.

The endpoints accept a query parameter, from, which is a token that can be used to paginate the list of delayed events as per the pagination convention. The homeserver can choose a suitable page size.

The response is a JSON object containing the following fields:

  • For the GET /_matrix/client/v1/delayed_events/scheduled endpoint:

    • delayed_events - Required. An array of delayed events that have been scheduled to be sent, sorted by running_since + delay in increasing order (event that will timeout soonest first).
      • delay_id - Required. The ID of the delayed event.
      • room_id - Required. The room ID of the delayed event.
      • type - Required. The event type of the delayed event.
      • state_key - Optional. The state key of the delayed event if it is a state event.
      • delay - Required. The delay in milliseconds before the event is to be sent.
      • running_since - Required. The timestamp (as Unix time in milliseconds) when the delayed event was scheduled or last restarted.
      • content - Required. The content of the delayed event. This is the body of the original PUT request, not a preview of the full event after sending.
    • next_batch - Optional. A token that can be used to paginate the list of delayed events.
  • For the GET /_matrix/client/v1/delayed_events/finalised endpoint:

    • finalised_events - Required. An array of finalised delayed events, that have either been sent or resulted in an error, sorted by origin_server_ts in decreasing order (latest finalised event first).
      • delayed_event - Required. Describes the original delayed event in the same format as the delayed_events array.
      • outcome: "send"|"cancel"
      • reason: "error"|"action"|"delay"
      • error: Optional Error. A Matrix error (as defined by Standard error response) to explain why this event failed to be sent. The error can either be M_CANCELLED_BY_STATE_UPDATE or any of the errors from the client-server send and state endpoints.
      • event_id - Optional EventId. The event_id this event got in case it was sent.
      • origin_server_ts - Optional Timestamp. The timestamp the event was sent.
    • next_batch - Optional. A token that can be used to paginate the list of finalised events.

The batch size and the number of finalised events that are retained on the homeserver can be chosen by the homeserver. The recommended values are:

  • finalised_events retention: 7 days
  • finalised_events batch size: 10
  • finalised_events max cached events: 1000

There is no guarantee for a client that all events will be available in the finalised events list if they exceed the limits of their homeserver. Additionally, a homeserver may discard finalised delayed events that have been returned by a GET /_matrix/client/v1/delayed_events/finalised response.
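Pagination with from and next_batch could be consumed as in the sketch below. The fetch_page callable is a hypothetical stand-in for an authenticated GET to the scheduled endpoint, and the in-memory pages exist only to make the example self-contained.

```python
from typing import Callable, Iterator, Optional


def iter_scheduled_events(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every scheduled delayed event, following next_batch until it is absent.

    fetch_page(from_token) is assumed to return the parsed JSON response body.
    """
    token = None
    while True:
        page = fetch_page(token)
        yield from page["delayed_events"]
        token = page.get("next_batch")
        if token is None:
            return


# In-memory stand-in for the homeserver, for illustration only.
_PAGES = {
    None: {"delayed_events": [{"delay_id": "a"}, {"delay_id": "b"}], "next_batch": "b12345"},
    "b12345": {"delayed_events": [{"delay_id": "c"}]},
}


def fake_fetch(from_token: Optional[str]) -> dict:
    return _PAGES[from_token]


delay_ids = [event["delay_id"] for event in iter_scheduled_events(fake_fetch)]
```

The same loop shape would work for the finalised endpoint by reading finalised_events instead of delayed_events.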

An example for a response to the GET /_matrix/client/v1/delayed_events/scheduled endpoint:

200 OK
Content-Type: application/json

{
  "delayed_events": [
    {
      "delay_id": "abcdefgh",
      "room_id": "!roomid:example.com",
      "type": "m.call.member",
      "state_key": "@user:example.com_DEVICEID",
      "delay": 5000,
      "running_since": 1721732853284,
      "content": {
        "memberships": []
      }
    },
    {
      "delay_id": "1234567890",
      "room_id": "!roomid:example.com",
      "type": "m.room.message",
      "delay": 15000,
      "running_since": 1721732853284,
      "content": {
        "msgtype": "m.text",
        "body": "I am now offline"
      }
    }
  ],
  "next_batch": "b12345"
}

Unless the delayed event is updated beforehand, the event will be sent after running_since + delay.
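As a worked example of this arithmetic, the sketch below computes the expected send time and the remaining time for an entry of the delayed_events array. The helper names are hypothetical, and the "current time" is an arbitrary value chosen for illustration.

```python
def expected_send_ts(delayed_event: dict) -> int:
    """Unix time in milliseconds at which the event is sent unless it is updated first."""
    return delayed_event["running_since"] + delayed_event["delay"]


def remaining_ms(delayed_event: dict, now_ms: int) -> int:
    """Milliseconds left on the timer, never negative."""
    return max(0, expected_send_ts(delayed_event) - now_ms)


event = {"delay_id": "1234567890", "delay": 15000, "running_since": 1721732853284}

send_ts = expected_send_ts(event)            # 1721732853284 + 15000
left = remaining_ms(event, 1721732860000)    # illustrative "now"
```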

This can be used by clients to display events that have been scheduled to be sent in the future.

For use cases where the existence of a delayed event is also of interest for other room members (e.g. self-destructing messages), it is recommended to include this information in the original/affected event itself.

On push

A new optional key, finalised_events, is added to the response body of /sync. The shape of its value is equivalent to that of the response body of GET /_matrix/client/v1/delayed_events/finalised. It is an array of the syncing user's delayed events that were sent or failed to be sent after the point indicated by the since parameter of the associated /sync request, or all of them for full /syncs. When no such delayed events exist, the finalised_events key is absent from the /sync response.

A new key, finalised_events, is defined for POST /_matrix/client/v3/user/{userId}/filter. Its value is a boolean which, if set to false, causes an associated /sync response to exclude any finalised_events key it may have otherwise included.

The only delayed events included in finalised_events are ones that have been retained by the homeserver, as per the same retention policies as for the GET /_matrix/client/v1/delayed_events/finalised endpoint. Additionally, a homeserver may discard finalised delayed events that have been returned by a /sync response.

The finalised_events key is added to the request bodies of the appservice API /transactions endpoint. It has the same content as the key for /sync, and contains all of the target appservice's delayed events that were sent or failed to be sent since the previous transaction.

Homeserver implementation details

Power levels are evaluated at the point of sending

Power levels are evaluated for each event only once the delay has occurred and it will be distributed/inserted into the DAG. This implies a delayed event can fail if it violates power levels at the time the delay passes.

Conversely, it's also possible to successfully schedule an event that the user has no permission to send at the time of scheduling. If the power level situation has changed by the time the delay passes, the event can even reach the DAG.

Delayed state events are cancelled by a more recent state event

Note: Special rule for delayed state events: A delayed event D gets cancelled if:

  • D is a state event with key k and type t from sender s.
  • A new state event N with type t and key k is sent into the room.
  • The sender of D is different from the sender of N.

If a new state event is sent to the same room at the same entry (event_type, state_key pair) as a delayed event by a different matrix user, any delayed event for this entry (event_type, state_key pair) is cancelled.

This only happens if it's a state update from a different user. If it is from the same user, the delayed event will not get cancelled. If the same user is updating the state which has associated delayed events, this user is in control of those delayed events. They can check and cancel the events manually using the /delayed_events/scheduled and /delayed_events endpoints.
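The cancellation rule can be expressed as a small predicate. The sketch below is one possible reading of the rule; the Event type and the function name are hypothetical and only model the fields the rule depends on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    room_id: str
    type: str
    sender: str
    state_key: Optional[str] = None  # None means a regular (non-state) event


def is_cancelled_by(delayed: Event, new: Event) -> bool:
    """True if the new state event cancels the outstanding delayed state event."""
    if delayed.state_key is None or new.state_key is None:
        # The rule only applies when both events are state events.
        return False
    same_entry = (
        delayed.room_id == new.room_id
        and delayed.type == new.type
        and delayed.state_key == new.state_key
    )
    return same_entry and delayed.sender != new.sender


delayed = Event("!roomid:example.com", "m.call.member", "@alice:example.com", "@alice:example.com")
by_other = Event("!roomid:example.com", "m.call.member", "@admin:example.com", "@alice:example.com")
by_self = Event("!roomid:example.com", "m.call.member", "@alice:example.com", "@alice:example.com")
```

Here is_cancelled_by(delayed, by_other) holds, while an update by the same sender leaves the delayed event untouched.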

In the case where the delayed event gets cancelled due to a different user updating the same state, there is no race condition here since a possible race between timeout and the new state event will always converge to the new state event:

  • timeout for delayed event followed by new state event: the room state will be updated twice: once by the content of the delayed event but later with the content of new state event.
  • new state event followed by timeout for delayed event: the new state event will cancel the outstanding delayed event.

The finalised delayed event as represented by the finalised list of the GET endpoint (see Getting delayed events) will be stored with the following outcome:

"outcome": "cancel",
"reason": "error",
"error": {
  "errcode": "M_CANCELLED_BY_STATE_UPDATE",
  "error": "The delayed event did not get sent because a different user updated the same state event. So the scheduled event might change it in an undesired way."
}

Note that this behaviour does not apply to regular (non-state) events as there is no concept of a (event_type, state_key) pair that could be overwritten.

Rate-limiting at the point of sending

Further to the rate limiting of the API endpoints, the homeserver should apply rate limiting to the sending of delayed messages at the point that they are inserted into the DAG.

This is to provide mitigation against the High Volume of Messages threat where a malicious actor could schedule a large volume of events ahead of time without exceeding a rate limit on the initial PUT request, but has specified a delay that corresponds to a common point of time in the future.

A limit on the maximum number of delayed events that can be outstanding at one time could also provide some mitigation against this attack.

Use case specific considerations

Delayed events can be used for many different features: tea timers, reminders, or ephemeral events could be implemented using delayed events, where clients send room events with intentional mentions or a redaction as a delayed event. It can even be used to send temporary power levels/mutes or bans.

MatrixRTC

This section gives an overview of how this MSC is used in MSC4143: MatrixRTC and evaluates alternative expiration systems.

Background

MatrixRTC makes it necessary to have real time information about the current MatrixRTC session. To properly display room tiles and header in the room list (or compute a list of ongoing calls), it's required to know:

  • If there is a running session.
  • What type that session has.
  • Who and how many people are currently participating.

A particularly delicate situation is that clients are not able to inform others if they lose connection. There are numerous approaches to solve such a situation. They split into two categories:

  • Polling based
    • Ask the users if they are still connected.
    • Ask an RTC backend (SFU) who is connected.
  • Timeout based
    • Update the room state every x seconds. This allows clients to check how long an event has not been updated and ignore it if it's expired.
    • Use delayed events with a 10s timeout to send the "disconnected from call" event less than 10s after the user stops pinging the /delayed_events endpoint (or delegate the disconnect action to a service attached to the SFU).
    • Use the client sync loop as a special case timeout for call member events (see Alternatives/MSC4018 (use client sync loop)).

Polling based solutions have a large overhead in complexity and network requests on the clients. For example:

A room list with 100 rooms where there has been a call before in every room (or there is an ongoing call) would require the client to send a to-device message (or a request to the SFU) to every user that has an active state event to check if they are still online. All this is just to display the room tile properly.

For displaying the room list, timeout based approaches are much more reasonable because they allow computing MatrixRTC metadata for a room to be synchronous.

The current solution updates the room state every X minutes. This is not elegant since room state gets repeatedly sent with the same content. In large calls, this could result in high traffic and increase the size of the room DAG.

A call with 100 call members implies 100 state events every X minutes. X cannot be a long duration because it is the duration after which the event can be considered expired. Improper disconnects would result in the user being displayed as "still in the call" for X minutes (which should be as short as possible).

Additionally, this approach requires perfect server client time synchronization to compute the expiration. This is currently not possible over federation since unsigned.age is not available over federation.

How this MSC would be used for MatrixRTC

With this proposal, the client can use delayed events to implement a "heartbeat" mechanism.

On joining the call, the client sends a "join" state event as normal to indicate that it is participating:

PUT /_matrix/client/v3/rooms/!wherever:example.com/state/m.call.member/@someone:example.com
Content-Type: application/json

{
  "memberships": [
    {
      ...membership data here...
    }
  ]
}

Before sending the join event, it also schedules a delayed "hangup" state event with a delay of around 5-20 seconds that marks the end of its participation:

PUT /_matrix/client/v3/rooms/!wherever:example.com/state/m.call.member/@someone:example.com?delay=10000
Content-Type: application/json

{
  "memberships": []
}

Let's say the homeserver returns a delay_id of 1234567890.

The client then periodically sends a "heartbeat" in the form of a "restart" of the delayed "hangup" state event to keep the call membership "alive".

For example, it could make the request every 5 seconds (or some other period less than the delay):

POST /_matrix/client/v1/delayed_events/1234567890
Content-Type: application/json

{
  "action": "restart"
}

This would have the effect that if the homeserver does not receive a "heartbeat" from the client for 10 seconds, then it will automatically send the "hangup" state event for the client.

Since the delayed event is sent first, a client can guarantee (at the time they are sending the join event) that it will eventually leave.
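The timing of this heartbeat can be illustrated with a toy model of the homeserver-side timer. The class below is a sketch under simplifying assumptions (a single event, explicit clock values instead of real time) and is not meant to describe how any homeserver implements it.

```python
class DelayedEventTimer:
    """Toy model of one delayed event's timer, driven by explicit timestamps in ms."""

    def __init__(self, delay_ms: int, now_ms: int) -> None:
        self.delay_ms = delay_ms
        self.running_since = now_ms

    def restart(self, now_ms: int) -> None:
        # Corresponds to {"action": "restart"}: the full delay starts counting again.
        self.running_since = now_ms

    def is_due(self, now_ms: int) -> bool:
        return now_ms >= self.running_since + self.delay_ms


# The "hangup" is scheduled at t=0 with a 10 second delay.
timer = DelayedEventTimer(delay_ms=10000, now_ms=0)

# The client sends a heartbeat every 5 seconds, then stops after t=15000.
fired_during_heartbeats = False
for t in (5000, 10000, 15000):
    fired_during_heartbeats = fired_during_heartbeats or timer.is_due(t)
    timer.restart(t)

# With no further heartbeats, the hangup becomes due 10 seconds after the last one.
due_just_before = timer.is_due(24999)
due_at_deadline = timer.is_due(25000)
```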

Self-destructing messages

This MSC also allows an implementation of "self-destructing" messages using redaction:

First send (or generate the PDU when MSC4080: Cryptographic Identities is available): PUT /_matrix/client/v3/rooms/{roomId}/send/m.room.message/{txnId}

{
  "msgtype": "m.text",
  "body": "this message will self-redact in 10 minutes"
}

then send: PUT /_matrix/client/v3/rooms/{roomId}/send/m.room.redaction/{txnId}?delay=600000

{
  "redacts": "{event_id}"
}

This would redact the message after 10 minutes. Here {event_id} stands for the event ID returned by the first request.

Potential issues

Compatibility with Cryptographic Identities

Ideally, this proposal should be compatible with other proposals such as MSC4080: Cryptographic Identities which introduce mechanisms to allow the recipient of an event to determine whether it was sent by a client as opposed to have been spoofed/injected by a malicious homeserver.

In the context of this proposal, the delayed events should be signed with the same cryptographic identity as the client that scheduled them.

This means that the content of the original scheduled event must be sent "as is" without modification by the homeserver. The consequence is an implementation detail that client developers must be aware of: if the content of the delayed event contains a timestamp, then it would be the timestamp of when the event was originally scheduled rather than anything later.

However, the origin_server_ts of the delayed event should be the time that the event is actually sent by the homeserver.

This is a general problem that arises with the introduction of Cryptographic Identities. A user can delay the signing and sending of an event, either intentionally or because of network conditions. A possible solution would be the introduction of a signing_ts (in the signed section) and keeping the origin_server_ts in the unsigned section. Both are reasonable data points that clients might want to use. This would solve issues related to delayed events since it would make it transparent to clients when an event was scheduled and when it was distributed over federation.

Alternatives

Delegating delayed events

It is useful for external services to also interact with delayed events. If a client disconnects, an external service can be the best source to send the delayed event/"last will".

This is not covered in this MSC but could be realized with scoped access tokens. A scoped token that only allows interaction with the delayed_events endpoint, and only with a subset of delay_ids, would be used.

With this, an SFU that tracks the current client connection state could be given the power to control the delayed event. The client would share the scoped token and the required details, so that the SFU can restart the delayed event while a user is connected and send it once the user disconnects (using {"action": "restart"} and {"action": "send"} requests to the /delayed_events endpoint). This way, the SFU can be used as the source of truth for the call member room state event without knowing anything about the Matrix call.

Since the SFU has a much lower chance of running into a network issue, {"action": "restart"} calls may be sent much more infrequently. Instead of calling the /delayed_events endpoint every couple of seconds, a delayed event's timeout can be set to be long (e.g. 6 hours), as the SFU can be expected to not forget sending the {"action": "send"} action when it detects a disconnecting client.

Batch sending

In some scenarios it is important to be able to send an event and an associated delayed event at the same time.

  • One example would be redacting an event. It only makes sense to redact the event if it exists. It might be important to have the guarantee that the delayed redact is received by the server at the time where the original message is sent.
  • In the case of a state event, a user might want to set the state to A and after a timeout change it back to {}. By using two separate requests, sending A could work, but the event with content {} could fail. The state would not automatically reset to {}.

For this use case, batch sending of multiple delayed events would be desired.

Batch sending is not included in this MSC, however, since batch sending should become a generic Matrix concept as proposed with /send_pdus (see MSC4080: Cryptographic Identities).

MSC2716: Incrementally importing history into existing rooms already proposes a batch_send endpoint. However, it is limited to application services and focuses on historic data. Since the additional capability to use a template event_id parameter is also needed, this probably is not a good fit.

Not reusing the send/state endpoint

Alternatively, new endpoints could be introduced to not overload the send and state endpoint. Those endpoints could be called:

PUT /_matrix/client/v1/rooms/{roomId}/send_delayed_event/{eventType}/{txnId}?delay={delay_ms}

PUT /_matrix/client/v1/rooms/{roomId}/state_delayed_event/{eventType}/{stateKey}?delay={delay_ms}

This would allow the response for the send and state endpoints to remain as they are currently, and to have a different return type for the new send_delayed_event and state_delayed_event endpoints.

Allocating the event ID at the point of scheduling the send

This was considered, but when sending a delayed event the event_id is not yet available:

The Matrix spec says that the event_id must use the reference hash, which is calculated from the fields of an event including origin_server_ts as defined in this list

Since origin_server_ts should be the timestamp the event has when entering the DAG (required for call duration computation), the event_id cannot be computed when using the send endpoint before the delayed event has resolved.
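The dependency of the event ID on the timestamp can be demonstrated with a simplified hash. The sketch below is not the spec's reference hash: it skips the redaction step and only approximates canonical JSON with sorted keys and compact separators. It is enough to show that changing origin_server_ts changes the resulting ID.

```python
import base64
import hashlib
import json


def simplified_reference_hash(event: dict) -> str:
    """Simplified stand-in for the reference hash (assumption: no redaction step)."""
    stripped = {k: v for k, v in event.items() if k not in ("signatures", "unsigned")}
    canonical = json.dumps(
        stripped, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    digest = hashlib.sha256(canonical).digest()
    # URL-safe unpadded base64 with a "$" sigil, as used by event IDs in newer room versions.
    return "$" + base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


scheduled = {
    "type": "m.call.member",
    "room_id": "!roomid:example.com",
    "sender": "@user:example.com",
    "state_key": "@user:example.com_DEVICEID",
    "content": {"memberships": []},
    "origin_server_ts": 1721732853284,  # time of scheduling
}
# The same event, stamped with the time it actually enters the DAG.
sent = dict(scheduled, origin_server_ts=1721732868284)

id_at_schedule = simplified_reference_hash(scheduled)
id_at_send = simplified_reference_hash(sent)
```

The two IDs differ, so an ID allocated at scheduling time could only remain valid if origin_server_ts were fixed at that moment.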

MSC4018 (use client sync loop)

MSC4018: Reliable call membership also proposes a way to make call memberships reliable. It uses the client sync loop as an indicator to determine if the event is expired, instead of letting the SFU inform about the call termination or using the call app ping/refresh loop as proposed earlier in this MSC.

The advantage is that this does not require introducing a new ping system (as is proposed here by using the delayed_events restart action). Though with cryptographic identities, the client needs to create the leave event.

The sync timeout is much longer than what would be desirable (30s vs 5s).

With a widget implementation for calls, it cannot be guaranteed that the widget is running during the sync loop. So one either has to move the hangup logic to the hosting client or let the widget run all the time.

A dedicated ping (independent to the sync loop) is more flexible and allows for the widget to execute the timer restart. If the widget dies, the call membership will disconnect.

Additionally, the specification should not include specific custom server rules if possible. Sending an event on behalf of a user based on the client sync loop if there is an event with a specific type and specific content is quite a server-specific behaviour, and also would not work well with encrypted state events and cryptographic identities. This proposal is a general behaviour valid for all event types.

Federated delayed events

Delayed events could be sent over federation immediately and then have the receiving servers process them (send them down to clients) at the appropriate time.

Downsides of this approach that have been considered are that:

  • individual "heartbeats"/restarts would need to be distributed via federation, meaning more traffic and processing to be done.
  • if any homeservers missed the federated "heartbeat"/restart message, then they might decide that the event is visible to clients whereas other homeservers might have received it and come to a different conclusion. If the event was later cancelled then resolving the inconsistency feels more complex than if the event was never sent in the first place.

MSC3277: Scheduled messages proposes a similar feature and there is an extensive analysis of the pros and cons of this MSC vs MSC3277 here.

If there is no need to modify a delayed event after it has been scheduled, there is a benefit in federating the scheduled event (adding it to the DAG immediately). It increases resilience: the sender's homeserver can disconnect and the delayed message will still enter non-soft-failed state (will be sent).

However, for the MatrixRTC use case it's required to be able to modify the event after it has been scheduled. As such, this approach has been discounted.

MQTT style Last Will

MQTT has the concept of a Will Message that is published by the server when a client disconnects.

The client can set a Will Message when it connects to the server. If the client disconnects unexpectedly, the server will publish the Will Message if the client is not back online within a specified time.

A similar concept could be applied to Matrix by having the client specify a set of "Last Will" events and have the homeserver trigger them if the client (possibly identified by device ID) does not send an API request within a specified time.

The main differentiator is that this type of approach might use the sync loop as the "heartbeat" equivalent similar to MSC4018.

A benefit compared to this proposal is that theoretically there would be no additional network traffic overhead.

Some complications are:

  • in order to avoid additional network traffic, the homeserver would need to proactively realise that a connection has dropped. Depending on the network/load balancer stack this might be problematic.
  • as an alternative, the client could reduce the long poll timeout (from a typical 30s down to, say, 5s) which would result in a traffic increase.
  • As syncing is a per-client concept, the MatrixRTC app has to either run in the same process as the client so that a MatrixRTC app failure triggers the client Last Will or the client has to observe the MatrixRTC app and simulate the Last Will if the MatrixRTC app fails.

M_INVALID_PARAM instead of M_MAX_DELAY_EXCEEDED

The existing M_INVALID_PARAM error code could be used instead of introducing a new error code M_MAX_DELAY_EXCEEDED.

Naming

The following alternative names for this concept are considered:

  • Future
  • DelayedEvents
  • PostponedEvents
  • LastWill

Don't provide a send action

Instead of providing a send action for delayed events, the client could cancel the outstanding delayed event and send a new non-delayed event instead.

This would simplify the API, but it's less efficient since the client would have to send two requests instead of one.

Use DELETE HTTP method for cancel action

Instead of providing a cancel action for delayed events, the client could send a DELETE request to the same endpoint.

This feels more elegant, but it offers no obvious equivalent mapping for the other actions.

[Ab]use typing notifications

Some exploration of using typing notifications to indicate that a user is still connected to a call was done.

The idea of extending MSC3038: Typed typing notifications to allow for additional meta data (like device ID and call ID) was considered.

A perceived benefit was that if the delayed events were federated, then the typing notification EDUs might provide an efficient transport.

However, as the conclusion was to not federate the delayed events, this approach was discounted in favour of a dedicated endpoint.

Alternative to running_since field

Some alternatives for the running_since field on the GET response are:

  • delaying_from
  • delayed_since
  • delaying_since
  • last_restart - but this feels less clear than running_since for a delayed event that hasn't been restarted

Security considerations

All new endpoints are authenticated.

Servers should impose a maximum delay of not more than a month.

As described above, the homeserver must evaluate and enforce the power levels at the time of the delayed event being sent (i.e. added to the DAG).

This has the risk that this feature could be used by a malicious actor to circumvent existing rate limiting measures which corresponds to the High Volume of Messages threat. The homeserver should apply rate-limiting to both the scheduling of delayed events and the later sending to mitigate this risk.

Unstable prefix

Whilst the MSC is in the proposal stage, the following should be used:

  • org.matrix.msc4140.delay should be used instead of the delay query parameter.
  • POST /_matrix/client/unstable/org.matrix.msc4140/delayed_events/{delay_id} should be used instead of the POST /_matrix/client/v1/delayed_events/{delay_id} endpoint.
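  • GET /_matrix/client/unstable/org.matrix.msc4140/delayed_events/scheduled and GET /_matrix/client/unstable/org.matrix.msc4140/delayed_events/finalised should be used instead of the GET /_matrix/client/v1/delayed_events/scheduled and GET /_matrix/client/v1/delayed_events/finalised endpoints.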
  • org.matrix.msc4140.finalised_events should be used as keys of /sync, /transactions, and /filter instead of finalised_events.
  • The M_UNKNOWN errcode should be used instead of M_MAX_DELAY_EXCEEDED as follows:
{
  "errcode": "M_UNKNOWN",
  "error": "The requested delay exceeds the allowed maximum.",
  "org.matrix.msc4140.errcode": "M_MAX_DELAY_EXCEEDED",
  "org.matrix.msc4140.max_delay": 86400000
}

instead of:

{
  "errcode": "M_MAX_DELAY_EXCEEDED",
  "error": "The requested delay exceeds the allowed maximum.",
  "max_delay": 86400000
}
  • The M_UNKNOWN errcode should be used instead of M_MAX_DELAYED_EVENTS_EXCEEDED as follows:
{
  "errcode": "M_UNKNOWN",
  "error": "The maximum number of delayed events has been reached.",
  "org.matrix.msc4140.errcode": "M_MAX_DELAYED_EVENTS_EXCEEDED"
}

instead of:

{
  "errcode": "M_MAX_DELAYED_EVENTS_EXCEEDED",
  "error": "The maximum number of delayed events has been reached."
}
  • The M_UNKNOWN errcode should be used instead of M_CANCELLED_BY_STATE_UPDATE as follows:
{
  "errcode": "M_UNKNOWN",
  "org.matrix.msc4140.errcode": "M_CANCELLED_BY_STATE_UPDATE",
  "error": "The delayed event did not get sent because a different user updated the same state event. So the scheduled event might change it in an undesired way."
}

instead of:

{
  "errcode": "M_CANCELLED_BY_STATE_UPDATE",
  "error": "The delayed event did not get sent because a different user updated the same state event. So the scheduled event might change it in an undesired way."
}

Additionally, the feature is to be advertised as an unstable feature in the GET /_matrix/client/versions response, with the key org.matrix.msc4140 set to true. So, the response could then look as follows:

{
    "versions": ["..."],
    "unstable_features": {
        "org.matrix.msc4140": true
    }
}
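A client that wants to work during the proposal stage could check for the unstable feature and read error codes in both their unstable and stable forms. The sketch below illustrates this; both helper names are hypothetical.

```python
from typing import Optional

UNSTABLE_PREFIX = "org.matrix.msc4140"


def supports_delayed_events(versions_response: dict) -> bool:
    """True if the /versions response advertises the unstable feature."""
    features = versions_response.get("unstable_features", {})
    return features.get(UNSTABLE_PREFIX) is True


def effective_errcode(error_body: dict) -> Optional[str]:
    """Prefer the prefixed errcode used while unstable, else fall back to errcode."""
    return error_body.get(UNSTABLE_PREFIX + ".errcode", error_body.get("errcode"))


versions = {"versions": ["..."], "unstable_features": {"org.matrix.msc4140": True}}
unstable_error = {
    "errcode": "M_UNKNOWN",
    "error": "The requested delay exceeds the allowed maximum.",
    "org.matrix.msc4140.errcode": "M_MAX_DELAY_EXCEEDED",
    "org.matrix.msc4140.max_delay": 86400000,
}
stable_error = {
    "errcode": "M_MAX_DELAY_EXCEEDED",
    "error": "The requested delay exceeds the allowed maximum.",
    "max_delay": 86400000,
}
```

Both error bodies resolve to M_MAX_DELAY_EXCEEDED, so the rest of the client does not need to distinguish between the two forms.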

Dependencies

None.