3.5 KiB
MSC0000: Encrypted event indexing
While matrix rooms make a great place to store data it is very hard to retrieve this
data in a structured way.
In encrypted rooms the only approach is to download all m.room.encrypted
events
and then parse all of them locally. So all server optimizations an indexed database
provides are not available.
It would be phenomenal and enable new ways to use matrix if one could do things like in an encrypted room:
- Get all events of a specific type.
- Fetch all events that are tagged with a specific usecase.
- Get all events that fullfill simple conditions event.value is in interval. e.g:
- get all events describing users in a specific
- age interval,
- get all videos with a specific length interval,
- get all images in a specific x and y geographic coordinate interval,
- get all calender events with specific participants,
- get all calender events within a specific date,
- ...
Here an idea is proposed that tries to leak a limited amount of metadata but still allow the homeserver to retrieve the correct events and send it to the homeserver.
Proposal
To achieve this clients can introduce indexed fields. To the homeserver these are readable because they are stored outside the encrypted payload (like relations) and are made out of a UUID and a floating point value.
"index":{
"001234567890UUID0987654321": 16.0,
"101234567890UUID0987654321": 812324567653452.1,
...
}
The goal is to make the index by itself useless because there is no immediate
pattern/structure to it. Instead the meaning of the index
section of each
event is derived by other encrypted events. m.index-descriptor
{
"type": "m.index-descriptor",
"target_value": ["content","path_to","target_property"]
"index": "1234index-uuid4321",
"data_type": "date", "integer", "float", "enum" "custom",
"enum_table":{
"value1":[100,200],
"value1":[200,2000],
...
}
"transform":{
"function":sin(x),
"offset":[103.2, 100]
}
}
The transform.function and transform.offset will be used to transform the values. The full equation looks as following:
integral(abs(transform.function(input + offset[0])))+offset[1] = index_value
for each event the client will compute this and add it to the event outside the encrypted content. If the client wants to fetch a specific range it applies the transform to the borders of the range and asks the homeserver for all events that fulfill the condition.
Request:
{
"index": "1234index-uuid4321",
"from": 0,
"to": 1,
"closest_to": 100,
"page_size": 50,
"page_token": "token_for_next_page_acquired_by_last_request"
}
Since the page tokens are unique it is enough to just sent a page token.
The homeserver will reuse the page size from the previous request.
The order can be defined by swapping the value of from/to.
cloest_to can not be used in combination with from/to in case both
are provided the hs has to ignore the closest_to property.
closest to will return an array where the abs(index_value - closest_to)
is ascending.
Response:
{
"events":[{"MatrixEvent"}],
"next_page_token": "token_for_next_page"
}
All m.index-descriptor
events are listed with index entry state events
m.index-entry
:
{
"type": "m.index-entry",
"index_event_id": 1234event-uuid4321
}
Potential issues
Alternatives
Using unencrypted rooms for this kind of applications.