# MSC2716: Incrementally importing history into existing rooms
## Problem
Matrix has historically been unable to easily import existing history into a
room that already exists. This is a major problem when bridging existing
conversations into Matrix, particularly if the scrollback is being
incrementally or lazily imported.
For instance, an NNTP bridge might work by letting a user join a room that
maps to a given newsgroup, first showing an empty room, and then importing the
most recent 1000 newsgroup posts for that room to flesh out some history. The
bridge might then choose to slowly import additional posts for that newsgroup
in the background, until however many decades of backfill were complete.
Finally, as more archives surface, they might also need to be manually
gradually added into the history of the room - slowly building up the complete
history of the conversations over time.
This is currently not supported because:
* There is no way to create messages in the context of historical room state in
a room via CS or AS API - you can only create events relative to current room
state.
* It is possible to override the timestamp with the `?ts` query parameter
([timestamp
massaging](https://spec.matrix.org/v1.3/application-service-api/#timestamp-massaging))
using the AS API but the event will still be appended to the tip of the DAG.
It's not possible to change the DAG ordering with this.
## Expectation
Historical messages that we import should appear in the timeline just like they
would if they were sent back at that time. In the example below, Maria's
messages 1-6 were sent originally in the room and the "Historical" messages in
the middle were imported after the fact.
Here is what scrollback is expected to look like in Element:
![Two historical batches in between some existing messages](./images/2716-message-scrollback-example.png)
To accomplish what's shown in the image, this is the basic flow:
1. `maria` sends messages 1-6. These represent messages in the normal "live" timeline before any history is imported.
1. Create historical batch 0 via `POST /_matrix/client/v1/rooms/<roomID>/batch_send?prev_event_id=<message3-eventID>` with the "Historical [xyz]" message `events` from Eric and the necessary `state_events_at_start` to auth them.
- This will return a response that contains the `next_batch_id` that we will use for the next batch.
- This also returns `base_insertion_event_id`, which we will use for the `m.room.marker` event later.
- `/batch_send` inserts `m.room.insertion` and `m.room.batch` events as necessary to connect the batches into a historical chain of history.
1. Create historical batch 1 via `POST /_matrix/client/v1/rooms/<roomID>/batch_send?prev_event_id=<message3-eventID>&batch_id=<batchID-that-we-got-from-the-previous-batch>` with the "Historical [foo|bar|baz]" message `events` from Eric and the necessary `state_events_at_start` to auth them.
1. Send a `m.room.marker` event so the history is discoverable across all federated homeservers: `PUT /_matrix/client/v3/rooms/{roomId}/send/m.room.marker/{txnId}` with `insertion_event_reference` set as the `base_insertion_event_id` from before.
The DAG for these messages ends up looking like:
```mermaid
flowchart BT
1 --- annotation1>"Note: older events are at the top"]
subgraph live timeline
marker1>m.room.marker] ----> 6[Message 6] --> 5[Message 5] --> 4[Message 4] -----------------> 3[Message 3] --> 2[Message 2] --> 1[Message 1]
end
subgraph batch0
batch0-batch[[m.room.batch]] --> batch0-2((z)) --> batch0-1((y)) --> batch0-0((x)) --> batch0-insertion[/m.room.insertion\]
end
subgraph batch1
batch1-batch[[m.room.batch]] --> batch1-2((baz)) --> batch1-1((bar)) --> batch1-0((foo)) --> batch1-insertion[/m.room.insertion\]
end
batch0-insertion -.-> memberBob0(["m.room.member (Eric)"])
batch1-insertion -.-> memberBob1(["m.room.member (Eric)"])
marker1 -.-> batch0-insertionBase
batch0-insertionBase[/m.room.insertion\] ---------------> 1
batch0-batch -.-> batch0-insertionBase
batch1-batch -.-> batch0-insertion
%% make the annotation links invisible
linkStyle 0 stroke-width:2px,fill:none,stroke:none;
```
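
As a rough sketch of steps 2 and 3 of the flow above, the two `/batch_send` requests differ only in the `?batch_id` query parameter. The helper name `batch_send_url` and the example room and event IDs are hypothetical; only the path and the query parameter names come from this proposal.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def batch_send_url(room_id: str, prev_event_id: str, batch_id: Optional[str] = None) -> str:
    """Build the path and query string for a /batch_send request."""
    query = {"prev_event_id": prev_event_id}
    # ?batch_id is omitted for the first batch and set for every later one.
    if batch_id is not None:
        query["batch_id"] = batch_id
    room = quote(room_id, safe="")
    return f"/_matrix/client/v1/rooms/{room}/batch_send?{urlencode(query)}"


# Batch 0: no batch_id yet (hypothetical IDs).
first = batch_send_url("!room:example.org", "$message3")
# Batch 1: same prev_event_id, plus the next_batch_id returned for batch 0.
second = batch_send_url("!room:example.org", "$message3", batch_id="w25ljc1kb4")
```

Both requests keep the same `prev_event_id`, because every batch hangs off the same point in the live timeline.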
## Proposal
### `historical` `content` property on any event
A new `historical` property is defined which can be included in the content of
any event to indicate it was retrospectively imported. It is a hint to clients
that the history didn't originally happen in the room, so they can add the
right semantics to the historical messages, for example a little "Historical"
flag in the corner of these messages to show that they are a little less
trusted in terms of attribution.
key | type | value | description | Required
--- | --- | --- | --- | ---
`historical` | bool | `true` | Used on any event to hint that it was historically imported after the fact. This field should just be omitted if `false`. | no
### New event types
#### `m.room.insertion`
Events that mark points in time where you can insert historical messages.
**`m.room.insertion` event `content` field definitions:**
key | type | value | description | required
--- | --- | --- | --- | ---
`next_batch_id` | string | randomly generated string | This is a random unique string that the next `m.room.batch` event should specify in order to connect to it. | yes
An example of the `m.room.insertion` event:
```json5
{
"type": "m.room.insertion",
"sender": "@appservice:example.org",
"content": {
"next_batch_id": "w25ljc1kb4",
"historical": true
},
"event_id": "$insertionabcd:example.org",
"room_id": "!jEsUZKDJdhlrceRyVU:example.org",
// Doesn't affect much but good to use the same time as the closest event
"origin_server_ts": 1626914158639
}
```
#### `m.room.batch`
This is what connects one historical batch to the other. In the DAG, we navigate
from an insertion event to the batch event that points at it, up the historical
messages to the next insertion event, then repeat the process.
**`m.room.batch` event `content` field definitions:**
key | type | value | description | required
--- | --- | --- | --- | ---
`batch_id` | string | A batch ID from an insertion event | Used to indicate which `m.room.insertion` event it connects to by its `next_batch_id` field. | yes
An example of the `m.room.batch` event:
```json5
{
"type": "m.room.batch",
"sender": "@appservice:example.org",
"content": {
"batch_id": "w25ljc1kb4",
"historical": true
},
"event_id": "$batchabcd:example.org",
"room_id": "!jEsUZKDJdhlrceRyVU:example.org",
// Doesn't affect much but good to use the same time as the closest event
"origin_server_ts": 1626914158639
}
```
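
The navigation described for `m.room.batch` can be sketched as a lookup from `next_batch_id` to the batch whose `batch_id` matches. This is only an illustration under assumptions: the in-memory layout (a dict per batch with `insertion`, `events`, and `batch` keys) and the name `walk_history` are hypothetical, and a real homeserver would walk the DAG in its database instead.

```python
from typing import Any, Dict, Iterator, List

Event = Dict[str, Any]


def walk_history(base_insertion: Event, batches: List[Dict[str, Any]]) -> Iterator[Any]:
    """Yield historical messages newest-first by following batch IDs."""
    # Index each batch by the batch_id carried on its m.room.batch event.
    by_batch_id = {b["batch"]["content"]["batch_id"]: b for b in batches}
    insertion = base_insertion
    seen = set()
    while True:
        next_id = insertion["content"]["next_batch_id"]
        # Stop at the end of the chain, or if the chain loops back on itself.
        if next_id in seen or next_id not in by_batch_id:
            return
        seen.add(next_id)
        batch = by_batch_id[next_id]
        # Walk "up" the batch: newest message first, oldest last.
        yield from reversed(batch["events"])
        # The insertion event at the start of this batch leads to the next one.
        insertion = batch["insertion"]
```

The `seen` set is a defensive choice so that a malformed chain of batch IDs cannot make the walk loop forever.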
#### `m.room.marker`
State event used to hint to homeservers that there is new
history back in time that you should go fetch next time someone scrolls back
around the specified insertion event. Also used on clients to cache bust the
timeline.
**`m.room.marker` event `content` field definitions:**
key | type | value | description | required
--- | --- | --- | --- | ---
`insertion_event_reference` | string | Another `event_id` | Used to point at an `m.room.insertion` event by its `event_id`. | yes
An example of the `m.room.marker` event:
```json5
{
"type": "m.room.marker",
"state_key": "<some-unique-state-key>",
"sender": "@appservice:example.org",
"content": {
"insertion_event_reference": "$insertionabcd:example.org"
},
"event_id": "$markerabcd:example.org",
"room_id": "!jEsUZKDJdhlrceRyVU:example.org",
"origin_server_ts": 1626914158639,
}
```
### New historical batch send endpoint
Add a new endpoint, `POST
/_matrix/client/v1/rooms/<roomID>/batch_send?prev_event_id=<eventID>&batch_id=<batchID>`,
which can insert a batch of events historically back in time next to the given
`?prev_event_id` (required). This endpoint can only be used by application
services. `?batch_id` is not required for the first batch send and is only
necessary to connect the current batch to the previous.
This endpoint handles the complexity of creating `m.room.insertion` and `m.room.batch` events.
All the application service has to do is use `?batch_id` which comes from
`next_batch_id` in the response of the batch send endpoint to connect batches
together. `next_batch_id` is derived from the insertion events added to each
batch.
Request body:
```json
{
"state_events_at_start": [{
"type": "m.room.member",
"sender": "@someone:matrix.org",
"origin_server_ts": 1628277690333,
"content": {
"membership": "join"
},
"state_key": "@someone:matrix.org"
}],
"events": [
{
"type": "m.room.message",
"sender": "@someone:matrix.org",
"origin_server_ts": 1628277690333,
"content": {
"msgtype": "m.text",
"body": "Historical message1"
}
},
{
"type": "m.room.message",
"sender": "@someone:matrix.org",
"origin_server_ts": 1628277690334,
"content": {
"msgtype": "m.text",
"body": "Historical message2"
}
}
]
}
```
Response:
```json5
{
// List of state event IDs we inserted
"state_event_ids": [
// member state event ID
],
// List of historical event IDs we inserted
"event_ids": [
// historical message1 event ID
// historical message2 event ID
],
"next_batch_id": "random-unique-string",
"insertion_event_id": "$X9RSsCPKu5gTVIJCoDe6HeCmsrp6kD31zXjMRfBCADE",
"batch_event_id": "$kHspK8a5kQN2xkTJMDWL-BbmeYVYAloQAA9QSLOsOZ4",
// When `?batch_id` isn't provided, the homeserver automatically creates an
// insertion event as a starting place to hang the history off of. This automatic
// insertion event ID is returned in this field.
//
// When `?batch_id` is provided, this field is not present because we can hang
// the history off the insertion event specified and associated by the batch ID.
"base_insertion_event_id": "$pmmaTamxhcyLrrOKSrJf3c1zNmfvsE5SGpFpgE_UvN0"
}
```
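
An application service might chain several requests by feeding each response's `next_batch_id` into the next request's `?batch_id`. The sketch below assumes a hypothetical `send_batch` callable that performs the HTTP request and returns the decoded response body; `import_batches` is likewise an illustrative name, not part of any library.

```python
from typing import Any, Callable, Dict, List, Optional

# (prev_event_id, batch_id, request body) -> decoded response body
SendBatch = Callable[[str, Optional[str], Dict[str, Any]], Dict[str, Any]]


def import_batches(
    send_batch: SendBatch, prev_event_id: str, bodies: List[Dict[str, Any]]
) -> Optional[str]:
    """Send batches in order and return the base insertion event ID, if any."""
    batch_id: Optional[str] = None
    base_insertion_event_id: Optional[str] = None
    for body in bodies:
        response = send_batch(prev_event_id, batch_id, body)
        if batch_id is None:
            # Only the first response (sent without ?batch_id) carries this field.
            base_insertion_event_id = response["base_insertion_event_id"]
        # Connect the next batch to the insertion event of this one.
        batch_id = response["next_batch_id"]
    return base_insertion_event_id
```

The returned ID is what a later `m.room.marker` event would reference in `insertion_event_reference`.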
`state_events_at_start` is unioned with the state at the `prev_event_id` and is
used to define the historical state events needed to auth the `events` like
invite and join events. These events can float outside of the normal DAG. In
Synapse, these are called `outlier`s and won't be visible in the chat history
which also allows us to insert multiple batches without having a bunch of `@mxid
joined the room` noise between each batch. **The state will not be resolved into
the current state of the room.**
`events` is a chronological list of events you want to insert. It's possible to
also include `state_events` here which will be used to auth further events in
the batch. For Synapse, there is a reverse-chronological constraint on batches
so once you insert one batch of messages, you can only insert an older batch
after that. For more information on this Synapse constraint, see the ["Depth
discussion"](#depth-discussion) below. **tldr; Insert from your most recent
batch of history -> oldest history.**
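
One way to respect both constraints (chronological order inside a batch, newest batch sent first) is to sort once and then reverse the list of chunks. This is a minimal sketch; `plan_batches` and the `batch_size` parameter are hypothetical names chosen for illustration.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]


def plan_batches(events: List[Event], batch_size: int) -> List[List[Event]]:
    """Return batches in sending order: newest batch first, oldest last."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Each batch must be chronological, so sort oldest-first by timestamp.
    ordered = sorted(events, key=lambda event: event["origin_server_ts"])
    chunks = [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]
    # Send the most recent chunk first, then work back towards the oldest.
    return list(reversed(chunks))
```

Each inner list can then be used as the `events` array of one `/batch_send` request.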
One aspect that isn't solved yet is how to handle relations/annotations (such as
reactions, replies, and threaded conversations) that reference each other within
the same `events` batch because the events don't have `event_ids` to reference
before being persisted. A solution for this can be proposed in another MSC.
#### What does the batch send endpoint do behind the scenes?
This section explains the homeserver magic that happens when someone uses the
`/batch_send` endpoint. If you're just trying to understand how the `m.room.insertion`,
`m.room.batch`, `m.room.marker` events work, you might want to just skip down to the room DAG
breakdown which incrementally explains how everything fits together.
1. A `m.room.insertion` event for the batch is added to the start of the batch.
This will be the starting point of the next batch and holds the `next_batch_id`
that we return in the batch send response. The application service passes
this as `?batch_id` next time to continue the chain of historical messages.
1. A `m.room.batch` event is added to the end of the batch. This is the event
that connects to an `m.room.insertion` event by specifying a `batch_id` that
matches the `next_batch_id` on the `m.room.insertion` event.
1. If `?batch_id` is not specified (usually only for the first batch), create a
base `m.room.insertion` event as a jumping off point from `?prev_event_id` which can
be added to the end of the `events` list in the response.
1. All of the events in the historical batch get a content field,
`"historical": true`, to indicate that they are historical at the point of
being added to a room.
1. The `state_events_at_start`/`events` payload is in **chronological** order
(`[0, 1, 2]`) and is processed in that order so each event's `prev_events` point to
its older-in-time previous message which gives us a nice straight line in
the DAG.
- <a name="depth-discussion"></a>**Depth discussion:** For Synapse, when persisting,
we **reverse the list (to make it reverse-chronological)** so we can still get the
correct `(topological_ordering, stream_ordering)` so it sorts between A and B as
we expect. Why? `depth` (or the `topological_ordering`) is not re-calculated when
historical messages are inserted into the DAG. This means we have to take care to
insert in the right order. Events are sorted by `(topological_ordering,
stream_ordering)` where `topological_ordering` is just `depth`. Normally,
`stream_ordering` is an auto incrementing integer but for `backfilled=true`
events, it decrements. Since historical messages are inserted all at the same
`depth`, the only way we can control the ordering in between is the
`stream_ordering`. Historical messages are marked as backfilled so the
`stream_ordering` decrements and each event is sorted behind the next. (from
https://github.com/matrix-org/synapse/pull/9247#discussion_r588479201). Because
ordering between events is mostly controlled by `stream_ordering`, we will run
into ordering issues over federation if it backfills in the wrong order (see the
["Message ordering issues over
federation"](#message-ordering-issues-over-federation) section below)
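
The wrapping steps above can be sketched as a pure function over plain dicts. This is not how any homeserver implements it: it ignores `state_events_at_start`, `prev_events`, auth, and signing, and the names `build_batch`, `_historical`, and `_insertion` are hypothetical.

```python
import secrets
from typing import Any, Dict, List, Optional

Event = Dict[str, Any]


def _historical(event: Event) -> Event:
    """Copy an event and flag its content as historical."""
    return {**event, "content": {**event.get("content", {}), "historical": True}}


def _insertion() -> Event:
    """Create an insertion event with a fresh random next_batch_id."""
    content = {"next_batch_id": secrets.token_hex(8)}
    return _historical({"type": "m.room.insertion", "content": content})


def build_batch(events: List[Event], batch_id: Optional[str] = None) -> Dict[str, Any]:
    base = None
    if batch_id is None:
        # First batch: create a base insertion event to hang the history off.
        base = _insertion()
        batch_id = base["content"]["next_batch_id"]
    insertion = _insertion()  # start of this batch
    batch = _historical({"type": "m.room.batch", "content": {"batch_id": batch_id}})
    chain = [insertion, *map(_historical, events), batch]  # batch event goes last
    return {
        "base_insertion": base,
        "chain": chain,
        "next_batch_id": insertion["content"]["next_batch_id"],
    }
```

Passing the returned `next_batch_id` back in as `batch_id` on the following call reproduces the chaining that the endpoint exposes through `?batch_id`.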
### Power levels
Since events being silently sent in the past is hard to moderate, it will
probably be good to limit who can add historical messages to the timeline. The
batch send endpoint is already limited to application services but we also need
to limit who can send `m.room.insertion`, `m.room.batch`, and `m.room.marker` events since someone
can attempt to send them via the normal `/send` API (we don't want any nasty
weird knots to reconcile either).
- `historical`: A new top-level field in the `content` dictionary of the room's
power levels, controlling who can send `m.room.insertion`, `m.room.batch`,
and `m.room.marker` events in the room.
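
A check against the new field could look roughly like the following. The `users` and `users_default` keys are existing `m.room.power_levels` fields; the fallback of `100` when `historical` is absent is purely an assumption made for this sketch, since this proposal does not state a default, and `can_send_historical` is a hypothetical name.

```python
from typing import Any, Dict


def can_send_historical(
    power_levels: Dict[str, Any], sender: str, default_required: int = 100
) -> bool:
    """Return True if `sender` meets the room's `historical` power level."""
    # ASSUMPTION: the default of 100 is illustrative, not defined by the MSC.
    required = power_levels.get("historical", default_required)
    users = power_levels.get("users", {})
    user_level = users.get(sender, power_levels.get("users_default", 0))
    return user_level >= required
```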
### Room version
The new `historical` power level necessitates a new room version (changes the structure of `m.room.power_levels`).
The redaction algorithm change is also a hard requirement for a new room
version because we need to make sure when redacting, we only strip out fields
without affecting anything at the protocol level. This means that we need to
keep all of the structural fields that allow us to navigate the batches of
history in the DAG. We also only want to auth events against fields that
wouldn't be removed during redaction. In practice, this means:
- When redacting `m.room.insertion` events, keep the `next_batch_id` content field around
- When redacting `m.room.batch` events, keep the `batch_id` content field around
- When redacting `m.room.marker` events, keep the `insertion_event_reference` content field around
- When redacting `m.room.power_levels` events, keep the `historical` content field around
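
The four rules above amount to a per-type allowlist of `content` keys. The sketch below covers only the fields this proposal adds; a real redaction algorithm also preserves other keys (for example the existing protected fields of `m.room.power_levels`), which are deliberately left out here. `KEPT_CONTENT_FIELDS` and `redact_content` are hypothetical names.

```python
from typing import Any, Dict

# Content keys from this proposal that must survive redaction, per event type.
KEPT_CONTENT_FIELDS = {
    "m.room.insertion": {"next_batch_id"},
    "m.room.batch": {"batch_id"},
    "m.room.marker": {"insertion_event_reference"},
    "m.room.power_levels": {"historical"},
}


def redact_content(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return the event's content with everything but the kept fields removed."""
    keep = KEPT_CONTENT_FIELDS.get(event["type"], set())
    return {key: value for key, value in event.get("content", {}).items() if key in keep}
```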
#### Backwards compatibility with existing room versions
However, this MSC is mostly backwards compatible and can be used with the
current room version, with the caveat that redactions aren't supported for
`m.room.insertion`, `m.room.batch`, `m.room.marker` events. We can protect
people from this limitation by throwing an error when they try to use [`PUT
/_matrix/client/v3/rooms/{roomId}/redact/{eventId}/{txnId}`](https://spec.matrix.org/v1.3/client-server-api/#put_matrixclientv3roomsroomidredacteventidtxnid)
to redact one of those events. We would have to accept the redaction if
it came over federation to avoid split-brained rooms.
Because we also can't use the `historical` power level for controlling who can
send these events in the existing room version, we always persist them but
only process and give meaning to the `m.room.insertion`, `m.room.batch`, and
`m.room.marker` events when the room `creator` sends them. This caveat/rule only
applies to existing room versions.
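
For existing room versions, the creator rule could be expressed as a simple predicate applied after the event has been persisted. This is a sketch only; `gives_historical_meaning` is a hypothetical name and the room creator is assumed to be known to the caller.

```python
from typing import Any, Dict

HISTORICAL_EVENT_TYPES = frozenset({"m.room.insertion", "m.room.batch", "m.room.marker"})


def gives_historical_meaning(event: Dict[str, Any], room_creator: str) -> bool:
    """True if this event should be processed as MSC2716 structure."""
    # The event is persisted either way; only its special meaning is gated.
    return event["type"] in HISTORICAL_EVENT_TYPES and event["sender"] == room_creator
```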
### Room DAG breakdown
#### `m.room.insertion` and `m.room.batch` events
We use `m.room.insertion` and `m.room.batch` events to describe how each historical batch
should connect to each other and how the homeserver can navigate the DAG.
- With `m.room.insertion` events, we just add them to the start of each chronological
batch (where the oldest message in the batch is). The next older-in-time
batch can connect to that `m.room.insertion` event from the previous batch.
- The initial base `m.room.insertion` event could be from the main DAG or we can
create it ad-hoc in the first batch. In the latter case, a `m.room.marker` event
(detailed below) inserted into the main DAG can be used to point to the new
`m.room.insertion` event.
- `m.room.batch` events have a `batch_id` field which is used to indicate the
`m.room.insertion` event that the batch connects to (by matching its `next_batch_id`).
Here is what the historical batch concept looks like in the DAG:
- `A <--- B` is any point in the DAG that we want to import between.
- `A` is the oldest-in-time message
- `B` is the newest-in-time message
- `batch0` is the first batch we try to import
- Each batch of messages is older-in-time than the last (`batch1` is
older-in-time than `batch0`, etc)
```mermaid
flowchart BT
A --- annotation1>"Note: older events are at the top"]
subgraph live [live timeline]
B --------------------> A
end
subgraph batch0
batch0-batch[[m.room.batch]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/m.room.insertion\]
end
subgraph batch1
batch1-batch[[m.room.batch]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/m.room.insertion\]
end
subgraph batch2
batch2-batch[[m.room.batch]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/m.room.insertion\]
end
batch0-insertionBase[/m.room.insertion\] ---------------> A
batch0-batch -.-> batch0-insertionBase
batch1-batch -.-> batch0-insertion
batch2-batch -.-> batch1-insertion
%% alignment links
batch2-insertion --- alignment1
%% make the alignment links/nodes invisible
style alignment1 visibility: hidden,color:transparent;
linkStyle 18 stroke-width:2px,fill:none,stroke:none;
%% make the annotation links invisible
linkStyle 0 stroke-width:2px,fill:none,stroke:none;
```
#### Adding marker events
Finally, we add `m.room.marker` state events into the mix so that federated remote
servers also know where in the DAG they should look for historical messages.
To lay out the different types of servers consuming these historical messages
(more context on why we need `m.room.marker` events):
1. Local server
- This pretty much works out of the box. It's possible to just add the
historical events to the database and they're available. The new endpoint
is just a mechanism to insert the events.
1. Federated remote server that already has *all* scrollback history and then
new history is inserted
- The big problem is how does a HS know it needs to go fetch more history if
they already fetched all of the history in the room? We're solving this
with `m.room.marker` state events which are sent on the "live" timeline and point
back to the `m.room.insertion` event where we inserted history next to. The HS
can then go and backfill the `m.room.insertion` event and continue navigating the
historical batches from there.
1. Federated remote server that joins a new room with historical messages
- The originating homeserver just needs to update the `/backfill` response
to include historical messages from the batches.
1. Federated remote server already in the room when history is inserted
- Depends on whether the HS has the scrollback history. If the HS already
has all history, see scenario 2; if it doesn't, see scenario 3.
1. For federated servers already in the room that haven't implemented MSC2716
- Those homeservers won't have historical messages available because they're
unable to navigate the `m.room.marker`/`m.room.insertion`/`m.room.batch` events. But the
historical messages would be available once the HS implements MSC2716 and
processes the `m.room.marker` events that point to the history.
---
- A `m.room.marker` event simply points back to an `m.room.insertion` event.
- The `m.room.marker` event solves the problem of, how does a federated homeserver
know about the historical events which won't come down incremental sync? And
the scenario where the federated HS already has all the history in the room,
so it won't do a full sync of the room again.
- Unlike the historical events sent via `/batch_send`, **the `m.room.marker` event is
sent separately as a normal state event on the "live" timeline** so that it
comes down incremental sync and is available to all homeservers regardless of
how much scrollback history they already have. And since it's state it never
gets lost in a timeline gap and is immediately apparent to all servers that
join.
- Also instead of overwriting the same generic `state_key: ""` over and over,
the expected behavior is to send each `m.room.marker` event with a unique `state_key`.
This way all of the "markers" are discoverable in the current state without
us having to go through the chain of previous state to figure it all out.
This also avoids potential state resolution conflicts where only one of the
`m.room.marker` events win and we would lose the other chain history.
- A `m.room.marker` event is not needed for every batch of historical messages added
via `/batch_send`. Multiple batches can be inserted. Then once we're done
importing everything, we can add one `m.room.marker` event pointing at the root
`m.room.insertion` event.
- If more history is added later, another `m.room.marker` can be sent to let the homeservers know again.
- When a remote federated homeserver receives a `m.room.marker` event, it can mark
the `m.room.insertion` prev events as needing to backfill from that point again and
can fetch the historical messages when the user scrolls back to that area in
the future.
```mermaid
flowchart BT
A --- annotation1>"Note: older events are at the top"]
subgraph live timeline
marker1>m.room.marker] ----> B -----------------> A
end
subgraph batch0
batch0-batch[[m.room.batch]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/m.room.insertion\]
end
subgraph batch1
batch1-batch[[m.room.batch]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/m.room.insertion\]
end
subgraph batch2
batch2-batch[[m.room.batch]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/m.room.insertion\]
end
marker1 -.-> batch0-insertionBase
batch0-insertionBase[/m.room.insertion\] ---------------> A
batch0-batch -.-> batch0-insertionBase
batch1-batch -.-> batch0-insertion
batch2-batch -.-> batch1-insertion
%% alignment links
batch2-insertion --- alignment1
%% make the alignment links/nodes invisible
style alignment1 visibility: hidden,color:transparent;
linkStyle 20 stroke-width:2px,fill:none,stroke:none;
%% make the annotation links invisible
linkStyle 0 stroke-width:2px,fill:none,stroke:none;
```
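
Because each marker uses its own `state_key`, a server can find every insertion point by scanning the current room state. The sketch below assumes current state is available as a mapping from `(type, state_key)` to event; that layout and the names `make_marker` and `insertion_events_to_backfill` are hypothetical, and using a random UUID as the `state_key` is just one way to get uniqueness.

```python
import uuid
from typing import Any, Dict, Set, Tuple

Event = Dict[str, Any]
StateMap = Dict[Tuple[str, str], Event]


def make_marker(insertion_event_id: str) -> Event:
    """Build a marker state event with a unique state_key."""
    return {
        "type": "m.room.marker",
        "state_key": uuid.uuid4().hex,  # unique, so markers never overwrite each other
        "content": {"insertion_event_reference": insertion_event_id},
    }


def insertion_events_to_backfill(current_state: StateMap) -> Set[str]:
    """Collect every insertion event ID referenced by a marker in current state."""
    return {
        event["content"]["insertion_event_reference"]
        for (event_type, _state_key), event in current_state.items()
        if event_type == "m.room.marker"
    }
```

With a single shared `state_key`, only the latest marker would be visible in current state and the earlier insertion points would have to be recovered from state history.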
#### Add in the historical state
In order to show the display name and avatar for the historical messages,
the state provided by `state_events_at_start` needs to resolve when one of
the historical messages is fetched.
It's probably most semantic to have the historical state float outside of the
normal DAG in a chain by specifying no `prev_events` (empty `prev_events=[]`)
for the first one. Then the insertion event can reference the last piece in the
floating state chain.
In Synapse, historical state is marked as an `outlier`. As a result, the state
will not be resolved into the current state of the room, and it won't be visible
in the chat history. This allows us to insert multiple batches without having a
bunch of `@mxid joined the room` noise between each batch.
```mermaid
flowchart BT
A --- annotation1>"Note: older events are at the top"]
subgraph live timeline
marker1>m.room.marker] ----> B -----------------> A
end
subgraph batch0
batch0-batch[[m.room.batch]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/m.room.insertion\]
end
subgraph batch1
batch1-batch[[m.room.batch]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/m.room.insertion\]
end
subgraph batch2
batch2-batch[[m.room.batch]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/m.room.insertion\]
end
batch0-insertion -.-> memberBob0(["m.room.member (bob)"]) --> memberAlice0(["m.room.member (alice)"])
batch1-insertion -.-> memberBob1(["m.room.member (bob)"]) --> memberAlice1(["m.room.member (alice)"])
batch2-insertion -.-> memberBob2(["m.room.member (bob)"]) --> memberAlice2(["m.room.member (alice)"])
marker1 -.-> batch0-insertionBase
batch0-insertionBase[/m.room.insertion\] ---------------> A
batch0-batch -.-> batch0-insertionBase
batch1-batch -.-> batch0-insertion
batch2-batch -.-> batch1-insertion
%% make the annotation links invisible
linkStyle 0 stroke-width:2px,fill:none,stroke:none;
```
## Potential issues
Also see the security considerations section below.
### Message ordering issues over federation
See the ["Depth discussion"](#depth-discussion) for the appropriate context for how
ordering currently works. This works fine for the local server that imported the history
in any scenario but since current homeserver implementations rely on `stream_ordering`
(which is just when the server received the event) to tie break the
`topological_ordering`/`depth`, this will cause out-of-order message problems for
federating servers consuming the events. It only works if the federating server scrolls
back sequentially without jumping around in the history at all which isn't realistic
with APIs like jump to date (`/timestamp_to_event`) around nowadays.
To totally fix this problem, it would require a different [graph
linearization](https://github.com/matrix-org/gomatrixserverlib/issues/187) strategy.
Perhaps we would do some online topological ordering (Katriel-Bodlaender algorithm)
where `depth`/`topological_ordering` is dynamically updated whenever new events are
inserted into the DAG. This is something extremely sci-fi and a big task though.
- https://github.com/matrix-org/gomatrixserverlib/issues/187 is the best reference I
know of for graph linearization (how to go from a DAG to a list of events in order)
in general though
- Related event ordering issue: https://github.com/matrix-org/matrix-spec/issues/852
- Synapse docs on depth and stream ordering:
https://github.com/matrix-org/synapse/blob/66ad1b8984eb536608e0915722c6a0b4493bb9df/docs/development/room-dag-concepts.md#depth-and-stream-ordering
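
The ordering problem can be illustrated with a toy model of the `(topological_ordering, stream_ordering)` sort described in the depth discussion. This is a simplification for illustration, not Synapse code: `persist_backfilled` and `timeline` are hypothetical names, and all historical events are assumed to share one `depth`.

```python
from typing import List, Tuple

Row = Tuple[int, int, str]  # (depth, stream_ordering, label)


def persist_backfilled(labels_in_arrival_order: List[str], depth: int) -> List[Row]:
    """Assign decrementing stream_ordering values in the order events arrive."""
    rows = []
    stream_ordering = 0
    for label in labels_in_arrival_order:
        stream_ordering -= 1  # backfilled events count downwards
        rows.append((depth, stream_ordering, label))
    return rows


def timeline(rows: List[Row]) -> List[str]:
    """Sort by (depth, stream_ordering) and return labels oldest-first."""
    return [label for _, _, label in sorted(rows, key=lambda row: (row[0], row[1]))]


# Importing server: persists newest-first, so the result reads x, y, z.
local = timeline(persist_backfilled(["z", "y", "x"], depth=4))
# Remote server that happens to receive the same events oldest-first.
remote = timeline(persist_backfilled(["x", "y", "z"], depth=4))
```

In this model `local` comes out chronological while `remote` comes out reversed, even though both servers hold the same events at the same `depth`.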
---
When factoring in how to use MSC2716 with the Gitter import and the static archives, we
were hand waving over this part and planned to have a script manually scrollback across
all of the rooms on the archive server before anyone else or Google spider crawls in
some weird way. This way it will lock the sort in place for all of the historical
messages. Or have the static archives fetch directly from the `gitter.im` homeserver
which would be correct since it was the server that imported everything.
Then later, online topological ordering can happen in the future and by its nature will
apply retroactively to fix any inconsistencies introduced by jumping and people permalinking.
But we were able to accomplish the Gitter to Matrix migration message import without
MSC2716 and if your use case is just one big import blast at the beginning of the room,
the way Gitter accomplished this works now and is a lot simpler (do that instead), see
[*"Alternative for one big import blast at the start of a room (Gitter case study)"*
section below](#one-big-import-blast-gitter-case-study).
### Self-referential batches
We probably want to come up with a solution for how to reference another event in the
same batch. Imagine wanting to reply to an earlier event in the batch. Or any other
relation like reactions and threads.
See this [discussion
thread](https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r870884150)
for ideas.
### Application service signals to lazy load more history
This doesn't provide a way for a HS to tell an AS that a client has tried to
call `/messages` beyond the beginning of a room, and that the AS should try to
lazy-insert some more messages (as per
https://github.com/matrix-org/matrix-doc/issues/698). For this MSC to be
extra useful, we might want to flesh that out. Another related problem with
the existing AS query APIs is that they don't include who is querying,
so they're hard to use in bridges that require logging in. If a similar query
API is added here, it should include the ID of the user who's asking for
history.
## Alternatives
We could insist that we use the SS API to import history in this manner
rather than extending the AS API. However, it seems unnecessarily burdensome to
make bridge authors understand the SS API, especially when we already have so
many AS API bridges. Hence these minor extensions to the existing AS API.
Another way of doing this is using the existing single send state and event API
endpoints. We could use `PUT /_matrix/client/r0/rooms/{roomId}/state/{eventType}/{stateKey}`
with `?historical=true` which would create the floating outlier state events.
Then we could use `PUT /_matrix/client/r0/rooms/{roomId}/send/{eventType}/{txnId}`,
with `?prev_event_id` pointing at that floating state to auth the event and where we
want to insert the event.
Another way of doing this might be to store the different eras of the room as
different versions of the room, using `m.room.tombstone` events to form a linked
list of the eras. This has the advantage of isolating room state between
different eras of the room, simplifying state resolution calculations and
avoiding risk of any cross-talk. It's also easier to reason about, and avoids
exposing the DAG to bridge developers. However, it would require better
presentation of room versions in clients, and it would require support for
retrospectively specifying the `predecessor` of the current room when you
retrospectively import history. Currently `predecessor` is in the immutable
`m.room.create` event of a room, so cannot be changed retrospectively - and
doing so in a safe and race-free manner sounds hard. A big problem with this
approach is if you just want to inject a few old lost messages - eg if you're
importing a mail or newsgroup archive and you stumble across a lost mbox with a
few msgs in retrospect, you wouldn't want or be able to splice a whole new room
in with tombstones.
Another way could be to let the server who issued the `m.room.create` also go
and retrospectively insert events into the room outside the context of the DAG
(i.e. without parent prev_events or signatures). To quote the original
[bug](https://github.com/matrix-org/matrix-doc/issues/698#issuecomment-259478116):
> You could just create synthetic events which look like normal DAG events but
exist before the m.room.create event. Their signatures and prev-events would
all be missing, but they would be blindly trusted based on the HS who is
allowed to serve them (based on metadata in the m.room.create event). Thus
you'd have a perimeter in the DAG beyond which events are no longer
decentralised or signed, but are blindly trusted to let HSes insert ancient
history provided by ASes.
However, this feels needlessly complicated if the DAG approach is sufficient.
### Alternative for one big import blast at the start of a room (Gitter case study)
<a name="one-big-import-blast-gitter-case-study"></a>
As an update, [Gitter has fully migrated to
Matrix](https://blog.gitter.im/2023/02/13/gitter-has-fully-migrated-to-matrix/) and was
able to accomplish the 141M message import without MSC2716. If your use case is just one
big import blast at the beginning of the room, the way Gitter accomplished this works
now and is a lot simpler (do this instead).
In the Gitter case, we started with a fresh room for the historical messages and
imported one by one so the `topological_ordering` was correct. We also used
`/send?ts=xxx` to make the timestamps correct. Then connected the historical and "live"
room together with a `m.room.tombstone` and MSC3946 `predecessor` event. This
functionality is completely separate from MSC2716 and works fine today.
## Security considerations
The `m.room.insertion` and `m.room.batch` events add a new way for an application service to
tie the batch reconciliation in knots (similar to the DAG knots that can happen)
which can potentially DoS message and backfill navigation on the server.
This also makes it much easier for an AS to maliciously spoof history. This is
a bit unavoidable given the nature of the feature, and is also possible today
via SS API.
## Unstable prefix
Servers will indicate support for the new endpoint via a `true` value for feature flag
`org.matrix.msc2716` in `unstable_features` in the response to `GET
/_matrix/client/versions`.
**Endpoints:**
- `POST /_matrix/client/unstable/org.matrix.msc2716/rooms/<roomID>/batch_send`
**Event types:**
- `org.matrix.msc2716.insertion`
- `org.matrix.msc2716.batch`
- `org.matrix.msc2716.marker`
**Content fields:**
- `org.matrix.msc2716.historical`
**Room version:**
- `org.matrix.msc2716` and `org.matrix.msc2716v2`, etc as we develop and
iterate along the way
**Power level:**
- `historical` (does not need prefixing because it's already under an
experimental room version)
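
A client or application service could gate its use of the unstable endpoint on the feature flag along these lines. The sketch takes an already-decoded `GET /_matrix/client/versions` response as a dict; `unstable_batch_send_path` is a hypothetical helper name.

```python
from typing import Any, Dict, Optional
from urllib.parse import quote

FEATURE_FLAG = "org.matrix.msc2716"


def unstable_batch_send_path(versions: Dict[str, Any], room_id: str) -> Optional[str]:
    """Return the unstable /batch_send path, or None if the server lacks support."""
    if versions.get("unstable_features", {}).get(FEATURE_FLAG) is not True:
        return None
    room = quote(room_id, safe="")
    return f"/_matrix/client/unstable/{FEATURE_FLAG}/rooms/{room}/batch_send"
```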