25 KiB
+++ date = "2023-09-21T15:30:00Z" title = "Matrix 2.0: The Future of Matrix" aliases = ["/matrix-2.0"]
[taxonomies] author = ["Matthew Hodgson"] category = ["General"]
[extra] image = "https://matrix.org/blog/img/matrix-2.0.png" +++
TL;DR: If you want to play with a shiny new Matrix 2.0 client, head over to Element X.
Matrix has been going for over 9 years now, providing an open standard for secure, decentralised communication for the open Web - and it’s been quite the journey to get to where we are today. Right now, according to Synapse’s opt-in usage reporting, in total there are 111,873,374 matrix IDs on the public network, spanning 17,289,201 rooms, spread over 64,256 servers. This is just scratching the surface, given we estimate that 66% of servers in the public network don’t report stats, and there are many enormous private networks of servers too. We’ve come a long way from creating Matrix HQ as the first ever room on today’s public network, back on Aug 13th 2014 :)
Meanwhile, the Matrix ecosystem has continued to grow unbelievably - with huge numbers of independent clients, bots and bridges maturing into ecosystems of their own, whole new companies forming around the protocol, and organisations ranging from open source projects to governments, NGOs and Fortune 100 companies adopting Matrix as a way to run their own secure, decentralised, standards-based self-sovereign communication.
The world needs Matrix more than ever. Every day the importance of decentralisation is more painfully obvious, as we concretely see the terrifying risks of centralised Internet services - whether that’s through corporate takeover, state censorship, blanket surveillance, Internet shutdowns, surveillance capitalism, or the spectre of gigantic centralised data breaches. It’s been amazing to see the world pivot in favour of decentralisation over the time we’ve been building Matrix, and our mission has never been more important.
On one hand it feels we’re creeping ever closer to that goal of providing the missing communication layer for the open Web. The European Union’s Digital Markets Act (DMA) is a huge step in that direction - regulation that mandates that if the large centralised messaging providers are to operate in the EU, they must interoperate. We’ve been busy working away to make this a reality, including participating in the IETF for the first time as part of the MIMI working group - demonstrating concretely how (for instance) Android Messages could natively speak Matrix in order to interoperate with other services, while preserving end-to-end encryption.
On the other hand, Matrix has often got stuck in focusing on solving the Hard Problems of decentralisation, decentralised end-to-end encryption, and the logistical complexities of supporting a massive heterogeneous public communication network and its surrounding heterogeneous ecosystem. It’s fair to say that in the early days our focus was on making something that worked at all - and then later, we shifted to focusing on something that worked and scaled correctly… but we hadn’t managed to focus on ensuring that Matrix provides the building blocks necessary to create blazingly fast, hyper-efficient communication apps which has potential to outperform the centralised mainstream messaging services…
…until now!
Matrix 2.0
Back at FOSDEM we announced the idea of Matrix 2.0 - a series of huge step changes in terms of Matrix’s usability and performance, made up of Sliding Sync (instant login/launch/sync), Native OIDC (industry-standard authentication), Native Group VoIP (end-to-end encrypted large-scale voice & video conferencing) and Faster Joins (lazy-loading room state when your server joins a room).
Now, we’re excited to announce that as of today everyone can start playing with these Matrix 2.0 features. There’s still some work to bring them formally into the specification, but we’re putting it out there for folks to experience right now. Developers: watch this space for updates on the spec front.
Practically speaking, this means there are now implementations of the four pillars of Matrix 2.0 available today which you can use to power a daily-driver Matrix 2.0 client. The work here has been driven primarily by Element, using their new Element X client as the test-bed for the new Matrix 2.0 functionality and to prove that the new APIs are informed by real-world usage and can concretely demonstrably create an app which begins to outperform iMessage, WhatsApp and Telegram in terms of usability and performance… all while benefiting from being 100% built on Matrix.
matrix-rust-sdk and Element X
The mission of Matrix 2.0 has been to provide a huge step forwards in real-world performance, usability and stability - and that means using a real client codebase as a guinea pig to ensure the new protocol is fit for purpose. matrix-rust-sdk has been the main vehicle for this, with Element X as the app primarily driving the new features (although other clients built on matrix-rust-sdk such as Fractal 5 can then automatically benefit from the work should they wish).
To see what all the fuss is about, your best bet is probably to head over to the Element X launch blog post and read all about it! But from the Matrix perspective, this is a flag day in terms of the existence of a Matrix client which empirically outperforms the mainstream clients both in terms of usability and performance: it shows that Matrix is indeed viable to power communication for billions of users, should we get the chance.
From a client perspective: this has meant implementing Sliding Sync (MSC3575) in matrix-rust-sdk - and then creating the entirely new matrix-sdk-ui crate in order to expose higher level APIs to help apps efficiently drive their UI, without each app having to keep reinventing the wheel and risking getting it wrong. The new UI crate gives APIs for efficiently managing a lazy-loaded room list, lazy-loaded room timelines (including edits, reactions, aggregations, redactions etc), and even when the app should show a sync spinner or not. As a result, the vast majority of the heavy lifting can be handled in matrix-rust-sdk, ensuring that the app layer can focus on UI rather than Matrix guts - and performance improvements (e.g. roomlist caching and timeline caching) can all be handled in one place to the benefit of all clients using the SDK.
This is a huge breakthrough relative to the old days of Matrix where each client would have no choice but burn significant amounts of time hand-carving its own timeline and encryption glue logic (although of course clients are still very welcome to do so if they wish!) - but for those wanting higher-level building blocks, matrix-rust-sdk now provides an excellent basis for experimenting with Matrix 2.0 clients. It’s worth noting that the library is still evolving fast, though, and many APIs are not long-term stable. Both the Sliding Sync API and the UI crates are still subject to significant change, and while the crypto crate and its underlying vodozemac E2EE implementation is pretty stable, features such as E2EE Backup are still being added to the top-level matrix-rust-sdk (and thence Element X).
In order to hook matrix-rust-sdk up to Element X, the Element team ended up contributing cancellable async bindings to uniffi, Mozilla’s language binding generator, so you can now call matrix-rust-sdk directly from Swift, Kotlin and (in theory) other languages, complete with beautifully simple async/await non-blocking semantics. This looks to be a pretty awesome stack for doing modern cross-platform development - so even if you have a project which isn’t natively in Rust, you should be able to lean on matrix-rust-sdk if you so desire! We hope that other projects will follow the Rust + Swift/Kotlin pattern for their extreme performance needs :)
Sliding Sync
The single biggest change in Matrix 2.0 is the proposal of an entirely new sync API called Sliding Sync (MSC3575). The goal of Sliding Sync is to ensure that the application has the option of loading the absolutely bare essential data required to render its visible user interface - ensuring that operations which have historically been horribly slow in Matrix (login and initial sync, launch and incremental sync) are instant, no matter how many rooms the user is in or how large those rooms are.
While matrix-rust-sdk implements both Sync v2 (the current API in Matrix 1.8) as well as Sliding Sync, Element X deliberately only implements Sliding Sync, in order to focus exclusively on getting the fastest UI possible (and generally to exercise the API). Therefore to use Element X, you need to be running a homeserver with Sliding Sync support, which (for now) means running a sliding-sync proxy which bolts Sliding Sync support on to existing homeservers. You can check out Thib’s excellent tutorial for how to get up and running (or Element Server Suite provides packages from the Element team)
Now, implementing Sliding Sync in matrix-rust-sdk has been a bit of a journey. Since we showed off the very first implementation at FOSDEM, two big problems came to light. For a bit of context: the original design of Sliding Sync was heavily inspired by Discord’s architecture - where the server calculates an ordered list of large numbers of items (your room list, in Matrix’s case); the client says which window into the list it’s currently displaying; and the server sends updates to the client as the view changes. The user then scrolls around that list, sliding the window up and down, and the server sends the appropriate updates - hence the name Sliding Sync.
Sliding Sync was originally driven by our work on Low Bandwidth Matrix - as it makes no sense to have a fancy line protocol which can run over a 2400 baud modem… if the first thing the app tries to do is download a 100MB Sync v2 initial-sync response, or for that matter a 10MB incremental-sync response after having been offline for a few days (10MB takes 9 hours to shift over a 2400 baud modem, for those who missed out on the 80s). Instead, you clearly only want to send the absolute essentials to the client, no matter how big their account is, and that’s what Sliding Sync does.
The first minor flaw in the plan, however, is that the server doesn’t necessarily have all the data it needs to order the room list. Room ordering depends on what the most recent visible events are in a room, and if the room’s end-to-end encrypted, the server has no way of knowing which events are going to be visible for a given client or not. It also doesn’t know which rooms have encrypted mentions inside them, and we don’t want to leak mention metadata to the server, or design out keyword mentions. So, MSC3575 proposed some complicated contortions to let the client tweak the order client-side based on its superior knowledge of the ordering (given most clients would need to sync all the encrypted rooms anyway, in order to index them and search for keyword notifications etc). Meanwhile, the order might be ‘good enough’ even without those tweaks.
The second minor flaw in the plan was that having implemented Sliding Sync in Element X, it turns out that the user experience on mobile of incrementally loading in room list entries from the server as the user scrolls around the list is simply not good enough, especially on bad connectivity - and the last thing we want to do is to design out support for bad connectivity in Matrix. Users have been trained on mobile to expect to be able to swipe rapidly through infinite-scrolling lists of tens of thousands of photos in their photo gallery, or tens of thousands of emails in their mail client, without ever seeing a single placeholder, even for a frame. So if the network roundtrip time to your server is even 100ms, and Sliding Sync is operating infinitely quickly, you’re still going to end up showing a placeholders for a few frames (6 frames, at 60fps, to be precise) if the user starts scrolling rapidly through their room list. And empirically that doesn’t look great - the 2007-vintage iOS team have a lot to answer for in terms of setting user expectations!
So, the obvious way to solve both of these problems is simply to pull in more data in the background, to anticipate the user scrolling around. In fact, it turns out we need to do that anyway, and indeed pull in all the room data so that room-search is instantly responsive; waiting 100ms or more to talk to the server whenever the user tries to search their roomlist is no fun at all, and it transpires that many users navigate their roomlist entirely by search rather than scrolling. As a result, the sliding sync implementation in matrix-rust-sdk has ended up maintaining an ‘all rooms’ list, which starts off syncing the roomlist details for the most recent N rooms, and then in the background expands to sync all the rest. At which point we’re not really sliding a window around any more: instead it’s more of a QoSed incremental sync.
So, to cut a long story short: while the current Sliding Sync implementation in matrix-rust-sdk and Element X empirically works very well, it’s ended up being a bit too complicated and we expect some pretty significant simplifications in the near future based on the best practices figured out with clients using it. Watch this space for updates, although it’s likely that the current form of MSC3575 will prevail in some respect in order to support low-bandwidth environments where roomlist ordering and roomsearch latency is less important than preserving bandwidth. Critically, we want to figure this out before we encourage folks to implement native server implementations - so for now, we’ll be keeping using the sliding-sync proxy as a way to rapidly experiment with the API as it evolves.
Native Matrix Group VoIP
Another pillar of Matrix 2.0 is that we finally have native Matrix Group VoIP calling (MSC3401)! Much like Sliding Sync has been developed using Element X as a testbed, Element Call has been the guinea pig for getting fully end-to-end-encrypted, scalable group voice/video calling implemented on top of Matrix, building on top of matrix-js-sdk. And as of today, Element Call finally has it working, complete with end-to-end encryption (and integrated in Element X, for that matter)!
Much like Sliding Sync, this has also been a bit of a journey. The original implementations of Element Call strictly followed MSC3401, using full mesh conferencing to effectively have every participant place a call to every other participant - thus decentralising the conference and avoiding the need for a conferencing ‘focus’ server… but limiting the conference to 7 or 8 participants given all the duplication of the sent video required. In Element Call Beta 2, end-to-end encryption was enabled; easy, given it’s just a set of 1:1 calls.
Then the real adventure began: to implement a Selective Forwarding Unit (SFU) which can be used to scale up to hundreds of users - or beyond. The unexpected first move came from Sean DuBois, project lead of the awesome Pion WebRTC stack for Golang - who wrote a proof-of-concept called sfu-to-sfu to demonstrate the viability of decentralised heterogenous cascading SFUs, as detailed in MSC3898. This would not only let calls on a single focus scale beyond hundreds of users, but also share the conferencing out across all the participating foci, providing the world’s first heterogeneous decentralised video conferencing. Element took the sfu-to-sfu implementation, hooked it up to Element Call on a branch, and renamed it as waterfall.
However, when Sean first contributed sfu-to-sfu, he mentioned to us that if Matrix is serious about SFUs, we should take a look at LiveKit - an open source startup not dissimilar to Element who were busy building best-in-class SFUs on top of Pion. And while waterfall worked well as a proof of concept, it became increasingly obvious that there’s a lot of work to be done around tuning congestion control, error correction, implementing end-to-end encryption etc which the LiveKit team had already spent years doing. So, Element reached out to the LiveKit team, and started experimenting with what it might take to implement a Matrix-capable SFU on top of the LiveKit engine.
The end result was Element Call Beta 3, which is an interesting hybrid between MSC3401 and LiveKit’s existing signalling: the high-level signalling of the call (its existence, membership, duration etc) is advertised by Matrix - but the actual WebRTC signalling is handled by LiveKit, providing support for hundreds of users per call.
Finally, today marks the release of Element Call Beta 4, which adds back end-to-end encryption via the LiveKit SFU (currently by using a shared static secret, but in the near future will support full Matrix-negotiated end-to-end encryption with sender keys) - and also includes a complete visual refresh. The next steps here include bringing back support for full mesh as well as SFU, for environments without an SFU, and updating all the MSCs to recognise the hybrid signalling model that reality has converged on when using LiveKit. Meanwhile, head over to https://call.element.io to give it a go, or read more about it in the Element X Ignition blog post!
Native Open ID Connect
Finally, last but not least, we’re proud to announce that the project to replace Matrix’s venerable existing authentication APIs with industry-standard Open ID Connect in Matrix 2.0 has taken a huge leap forwards today, with matrix-authentication-service now being available to add Native OIDC support to Synapse, as well as Element X now implementing account registration, login and management via Native OIDC (with legacy support only for login/logout).
This is a critical step forwards in improving the security and maintainability for Matrix’s authentication, and you can read all about it in this dedicated post, explaining the rationale for adopting OpenID Connect for all forms of authentication throughout Matrix, and what you need to know about the transition.
Conclusion
There has been an enormous amount of work that has gone into Matrix 2.0 so far - whether that’s implementing sliding sync in matrix-rust-sdk and sliding-sync proxy, matrix-authentication-service and all the native OIDC infrastructure on servers and clients, the entirety of Element Call and its underpinning matrix-js-sdk and SFU work, or indeed Faster Joins in Synapse, which shipped back in Jan.
It’s been a pretty stressful sprint to pull it all together, and huge thanks go to everyone who’s contributed - both from the team at Element, but also contributors to other projects like matrix-rust-sdk who have got caught in the crossfire :) It’s also been amazing seeing the level of support, high quality testing and excellent feedback from the wider community as folks have got excited about the promise of Matrix 2.0.
On the Foundation side, we’d like to thank the Members whose financial support has been critical in providing bandwidth to enable the progress on Matrix 2.0 - and for those who want to help accelerate Matrix, especially those commercially building on top of Matrix, please consider joining the Foundation as a member! Also, in case you missed it, we’re super excited to welcome Josh Simmons as Managing Director for the Foundation - focusing on running the Foundation membership programme and generally ensuring the growth of the Foundation funding for the benefit of the whole Matrix community. Matthew and Amandine continue to lead the overall project (alongside their day jobs at Element), with the support of the other three independent Guardians - but Josh is working full time exclusively on running the non-profit foundation and gathering funds to support Matrix.
Talking of funding, we should mention that we’ve had to pause work in other places due to lack of Matrix funding - especially while focusing on successfully shipping Matrix 2.0. Major next-generation projects including Third Room, P2P Matrix, and Low Bandwidth Matrix have all been paused unless there’s a major shift in circumstances - so, if you have money and you’re interested in a world where the more experimental next-generation Matrix projects progress with folks working on them as their day job, please get in touch with the Foundation.
What’s next?
While this is the first usable release of Matrix 2.0 implementations, there’s loads of work still to be done - obvious work on Matrix 2.0 includes:
- Getting Native OIDC enabled on matrix.org, and providing migration tools to Native OIDC for existing homeservers in general
- Reworking Sliding Sync based on the lessons learned implementing it in matrix-rust-sdk
- Actually getting the Matrix 2.0 MSCs stabilised and matured to the point they can be approved and merged into the spec
- Adding encrypted backups to matrix-rust-sdk
- Reintroducing full-mesh support for Native Matrix Group VoIP calling
- Having a big Matrix 2.0 launch party once the spec lands!
Outside of Matrix 2.0 work, other big items on the horizon include:
- Adding Rust matrix-sdk-crypto to matrix-js-sdk, at which point all the official Matrix.org client SDKs will (at last!) be using the same stable performant E2EE implementation
- Continuing to contribute Matrix input to the MIMI working group in IETF for Digital Markets Act interoperability
- Working on MLS for next-generation E2EE
- Next generation moderation tooling and capabilities
- Account Portability and Multihomed accounts
- …and much much more.
So: welcome to our brave new Matrix 2.0 world. We hope you’re excited about it as we are - and thanks to everyone for continuing to use Matrix and build on it. Here’s to the beginning of a whole new era!
Matthew, Amandine and the whole Matrix team.