matrix.org/content/blog/2016/07/2016-07-04-the-matrix-summe...

43 KiB

+++ title = "The Matrix Summer Special!!" path = "/blog/2016/07/04/the-matrix-summer-special"

[taxonomies] author = ["Matthew Hodgson"] category = ["General", "GSOC"] +++

Hi folks - another few months have gone by and once again the core Matrix team has ended up too busy hacking away on the final missing pieces of the Matrix jigsaw puzzle to have been properly updating the blog; sorry about this. The end is in sight for the current crunch however, and we expect to return to regular blog updates shortly! Meanwhile, rather than letting news stack up any further, here's a quick(?) attempt to summarise all the things which have been going on!

Synapse 0.16.1 released!

This one's a biggy: in the mad rush during June to support the public debut for Vector, we made a series of major Synapse releases which apparently we forgot to tell anyone about (sorry!). The full changelog is at the bottom of the post as it's huge, but the big features are:

  • Huge performance improvements, including adding write-thru event caches and improving caching throughout, and massive improvements to the speed of the room directory API.
  • Add support for inline URL previewing!
  • Add email notifications!
  • Add support for LDAP authentication (thanks to Christoph Witzany)
  • Add support for JWT authentication (thanks to Niklas Riekenbrauck)
  • Add basic server-side ignore functionality and abuse reporting API
  • Add ability to delegate /publicRooms API requests to a list of secondary homeservers
  • Lots and lots and lots of bug fixes.
If you haven't upgraded, please do asap from https://github.com/matrix-org/synapse!

There's also been a huge amount of work going on behind the scenes on horizontal scalability for Synapse.  We haven't drawn much attention to this yet (or documented it) as it's still quite experimental and in flux, but the main change is to add the concept of application-layer replication to Synapse - letting you split the codebase into multiple endpoints which can then be run in parallel, each replicating their state off the master synapse process.  For instance, right now the Matrix.org homeserver is actually running off three different processes: the main synapse; another specific to calculating push notifications and another specific to serving up the /sync endpoint.  These three are then abstracted behind the dendron layer (which also implements the /login endpoint). The idea is that one can then run multiple instances of the /sync and pusher (and other future) endpoints to horizontally scale.  For now, they share a single database writer, but in practice this has improved our scalability and performance on the Matrix.org HS radically.

In future we'll actually document how to run these, as well as making it easy to spin up multiple concurrent instances - in the interim if you find you're hitting performance limits running high-traffic synapses come talk to us about it on #matrix-dev:matrix.org.  And the longer term plan continues to be to switch out these python endpoint implementations in future for more efficient implementations.  For instance, there's a golang implementation of the media repository currently in development which could run as another endpoint cluster.

Vector released!

Much has been written about this elsewhere, but Web, iOS and Android versions of the Vector clients were finally released to the general public on June 9th at the Decentralised Web Summit in San Francisco.  Vector is a relatively thin layer on top of the matrix-react-sdk, matrix-ios-sdk and matrix-android-sdk Matrix.org client SDKs which showcases Matrix's collaboration and messaging capabilities in a mass-market usable app.  There's been huge amounts of work here across the SDKs for the 3 platforms, with literally thousands of issues resolved.  You can find the full SDK changelogs on github for React, iOS and Android.  Some of the more interesting recent additions to Vector include improved room notifications, URL previews, configurable email notifications, and huge amounts of performance stability work.

Future work on Vector is focused on showcasing end-to-end encryption, providing a one-click interface for adding bots/integrations & bridges to a room, and generally enormously improving the UX and polish.  Meanwhile, there's an F-Droid release for Android landing any day now.

If you haven't checked it out recently, it's really worth a look :)

Vector

Matrix Spec 0.1.0

In case you didn't notice, we also released v0.1.0 of the Matrix spec itself in May - this is a fairly minor update which improves the layout of the document somewhat (thanks to a PR from Jimmy Cuadra) and a some bugfixes.  You can see the full changelog here. We're overdue a new release since then (albeit again with relatively minor changes).

Google Summer of Code

We're in the middle of the second half of GSoC right now, with our GSoC students Aviral and Half-Shot hacking away on Vector and Microblogging projects respectively.  There's a lot of exciting stuff coming out of this - Aviral contributing Rich Text Editing, Emoji autocompletion, DuckDuckGo and other features into Vector (currently on branches, but will be released soon) and Half-Shot building a Twitter bridge as part of his Matrix-powered microblogging system.  Watch this space for updates!

Ruma

Lots of exciting stuff has been happening recently over at Ruma.io - an independent Matrix homeserver implementation written in Rust.  Over the last few weeks Jimmy and friends have got into the real meat of implementing events and the core of the Matrix CS API, and as of the time of writing they're the topmost link on HackerNews!  There's a lot of work involved in writing a homeserver, but Ruma is looking incredibly promising and the feedback from their team has been incredibly helpful in keeping us honest on the Matrix spec and ensuring that it's fit for purpose for 3rd party server implementers.

Also, Ruma just released some truly excellent documentation as a high-level introduction to Matrix (thanks to Leah and Jimmy) - much better than anything we have on the official Matrix.org site.  Go check it out if you haven't already!

End to End Encryption

There has been LOADS of work happening on End to End encryption: finalising the core 1:1 "Olm" cryptographic ratchet; implementing the group "Megolm" ratchet (which shares a single ratchet over all the participants of a room for scalability); fully hooking Olm into matrix-js-sdk and Vector-web at last, and preparing for a formal and published-to-the-public 3rd party security audit on Olm which will be happening during July.

This deserves a post in its own right, but the key thing to know is that Olm is almost ready - and indeed the work-in-progress E2E UX is even available on the develop branch of vector if you enable E2E in the new 'Labs' section in User Settings.  Olm itself is usable only for 'burn after reading' strictly PFS messages, but Megolm integration with Vector & Synapse will follow shortly afterwards which will finally provide the E2E nirvana we've all been waiting for :)

Decentralised Web Summit

Matrix had a major presence as a sponsor at the first ever Decentralised Web Summit hosted by the Internet Archive in San Francisco back in June.  This was a truly incredible event - with folks gathering from across the world to discuss, collaborate and debate on ensure that the web is not fragmented or trapped into proprietary silos - with the likes of Tim Berners-Lee, Vint Cerf and Brewster Kahle in attendance.  We ran a long 2 hour workshop on Matrix and showed off Vector to anyone and everyone - and meanwhile the organisers were kind enough to promote Matrix as the main decentralised chat interface for the conference itself (bridged with their Slack).  A full writeup of the conference really merits a blog post in its own right, but the punchline is that you could genuinely tell that this is the beginning of a new era of the internet - whether it's using Merkle DAGs (like Matrix) or Blockchain or similar technologies: we are about to see a major shift in the balance of power on the internet back towards its users.

We strongly recommend checking out the videos which have all been published at Decentralised Web Summit, including lightning talks introducing both Matrix and Vector, and digging into as many of the projects advertised as possible.  It was particularly interesting for us to get to know Tim Berners-Lee's latest project at MIT: Solid - which shares quite a lot of the same goals as Matrix, and subsequently seeing Tim pop up on Matrix via Vector.  We're really looking forward to working out how Matrix & Solid can complement each other in future.

Matthew, Tim Berners-Lee and Matrix

Matrix.to

Not the most exciting thing ever, but heads up that there's a simple site up at https://matrix.to to provide a way of doing client-agnostic links to content in Matrix.  For instance, rather than linking specifically into an app like Vector, you can now say https://matrix.to/#/#matrix:matrix.org to go there via whatever app you choose.  This is basically a bootstrapping process towards having proper mx:// URLs in circulation, but given mx:// doesn't exist yet, https://matrix.to hopefully provides a useful step in the right direction :)

PRs very welcome at https://github.com/matrix-org/matrix.to.

Bridges and Bots

Much of the promise of Matrix is the ability to bridge through to other silos, and we've been gradually adding more and more bridging capabilities in.

For instance, the IRC bridge has had a complete overhaul to add in huge numbers of new features and finally deployed for Freenode a few weeks ago:

New Features:

  • Nicks set via !nick will now be preserved across bridge restarts.
  • EXPERIMENTAL: IRC clients created by the bridge can be assigned their own IPv6 address.
  • The bridge will now send connection status information to real Matrix users via the admin room (the same room !nickcommands are issued).
  • Added !help.
  • The bridge will now fallback to body if the HTML content contains any unrecognised tags. This makes passing Markdown from Matrix to IRC much nicer.
  • The bridge will now send more metrics to the statsd server, including the join/part rate to and from IRC.
  • The config option matrixClients.displayName is now implemented.
Bug fixes:
  • Escape HTML entities when sending from IRC to Matrix. This prevents munging occurring between IRC formatting and textual < element > references, whereby if you sent a tag and some colour codes from IRC it would not escape the tag and therefore send invalid HTML to Matrix.
  • User IDs starting with - are temporarily filtered out from being bridged.
  • Deterministically generate the configuration file.
  • Recognise more IRC error codes as non-fatal to avoid IRC clients reconnecting unnecessarily.
  • Add a 10 second timeout to join events injected via the MemberListSyncer to avoid HOL blocking.
  • 'Frontier' Matrix users will be forcibly joined to IRC channels even if membership list syncing I->M is disabled. This ensures that there is always a Matrix user in the channel being bridged to avoid losing traffic.
  • Cache the /initialSync request to avoid hitting this endpoint more than once, as it may be very slow.
  • Indexes have been added to the NeDB .db files to improve lookup times.
  • Do not recheck if the bridge bot should part the channel if a virtual user leaves the channel: we know it shouldn't.
  • Refine what counts as a "request" for metrics, reducing the amount of double-counting as requests echo back from the remote side.
  • Fixed a bug which caused users to be provisioned off their user_id even if they had a display name set.
Meanwhile, a Gitter bridge is in active development (and in testing with the Neovim community on Gitter/Matrix/Freenode), although lacking documentation so far.

Finally, NEB - the Matrix.org bot framework is currently being ported from Python to Golang to act as a general Go SDK for rapidly implementing new bot capabilities.

There's little point in all of the effort going into bridges and bots if it's too hard for normal users to deploy them, so on the Vector side of things there's an ongoing project to build a commercial-grade bot/bridge hosted service offering for Matrix which should make it much easier for non-sysadmins to quickly add their own bots and bridges into their rooms.  There's nothing to see yet, but we'll be yelling about it when it's ready!

Conclusion

I'm sure there's a lot of stuff missing from the quick summary above - suffice it to say that the Matrix ecosystem is growing so fast and so large that it's pretty hard to keep track of everything that's going on.  The big remaining blockers we see at this point are:

  • End-to-end Encryption roll-out
  • Polishing UX on Vector - showing that it's possible to build better-than-Slack quality UX on top of Matrix
  • Bots, Integrations and Bridges - making them absolutely trivial to build and deploy, and encouraging everyone to write as many as they can!
  • Improving VoIP, especially for conferencing, especially on Mobile
  • Threading
  • Editable messages
  • Synapse scaling and stability - this is massively improved, but there's still work to be done.  Meanwhile projects like Ruma give us hope for light at the end of the Synapse tunnel!
  • Spec refinements - there are still a lot of open spec bugs which we need to resolve so we can declare the spec (and thus Matrix!) out of beta.
  • More clients - especially desktop ones; helping out with Quaternion, Tensor, PTO, etc.
...and then all the pieces of the jigsaw will finally be in place, and Matrix should hopefully fulfil its potential as an invaluable, open and decentralised data fabric for the web.

Thanks, once again, to everyone who's been supporting and using Matrix - whether it's by hanging out in the public chatrooms, running your own server, writing your own clients, bots, or servers, or just telling your friends about the project.  The end of the beginning is in sight: thanks for believing in us, and thank you for flying Matrix.

Matthew, Amandine & the Matrix Team.

Appendix: The Missing Synapse Changelogs

Changes in synapse v0.16.1 (2016-06-20)

Bug fixes:

  • Fix assorted bugs in /preview_url (PR #872)
  • Fix TypeError when setting unicode passwords (PR #873)
Performance improvements:
  • Turn use_frozen_events off by default (PR #877)
  • Disable responding with canonical json for federation (PR #878)

Changes in synapse v0.16.1-rc1 (2016-06-15)

Features: None

Changes:

  • Log requester for /publicRoom endpoints when possible (PR #856)
  • 502 on /thumbnail when can't connect to remote server (PR #862)
  • Linearize fetching of gaps on incoming events (PR #871)
Bugs fixes:
  • Fix bug where rooms where marked as published by default (PR #857)
  • Fix bug where joining room with an event with invalid sender (PR #868)
  • Fix bug where backfilled events were sent down sync streams (PR #869)
  • Fix bug where outgoing connections could wedge indefinitely, causing push notifications to be unreliable (PR #870)
Performance improvements:
  • Improve /publicRooms performance (PR #859)

Changes in synapse v0.16.0 (2016-06-09)

NB: As of v0.14 all AS config files must have an ID field.

Bug fixes:

  • Don't make rooms published by default (PR #857)

Changes in synapse v0.16.0-rc2 (2016-06-08)

Features:

  • Add configuration option for tuning GC via gc.set_threshold (PR #849)
Changes:
  • Record metrics about GC (PR #771, #847, #852)
  • Add metric counter for number of persisted events (PR #841)
Bug fixes:
  • Fix 'From' header in email notifications (PR #843)
  • Fix presence where timeouts were not being fired for the first 8h after restarts (PR #842)
  • Fix bug where synapse sent malformed transactions to AS's when retrying transactions (Commits310197b, 8437906)
Performance Improvements:
  • Remove event fetching from DB threads (PR #835)
  • Change the way we cache events (PR #836)
  • Add events to cache when we persist them (PR #840)

Changes in synapse v0.16.0-rc1 (2016-06-03)

Version 0.15 was not released. See v0.15.0-rc1 below for additional changes.

Features:

  • Add email notifications for missed messages (PR #759, #786, #799, #810, #815, #821)
  • Add a url_preview_ip_range_whitelist config param (PR #760)
  • Add /report endpoint (PR #762)
  • Add basic ignore user API (PR #763)
  • Add an openidish mechanism for proving that you own a given user_id (PR #765)
  • Allow clients to specify a server_name to avoid 'No known servers' (PR #794)
  • Add secondary_directory_servers option to fetch room list from other servers (PR #808, #813)
Changes:
  • Report per request metrics for all of the things using request_handler (PR #756)
  • Correctly handle NULL password hashes from the database (PR #775)
  • Allow receipts for events we haven't seen in the db (PR #784)
  • Make synctl read a cache factor from config file (PR #785)
  • Increment badge count per missed convo, not per msg (PR #793)
  • Special case m.room.third_party_invite event auth to match invites (PR #814)
Bug fixes:
  • Fix typo in event_auth servlet path (PR #757)
  • Fix password reset (PR #758)
Performance improvements:
  • Reduce database inserts when sending transactions (PR #767)
  • Queue events by room for persistence (PR #768)
  • Add cache to get_user_by_id (PR #772)
  • Add and use get_domain_from_id (PR #773)
  • Use tree cache for get_linearized_receipts_for_room (PR #779)
  • Remove unused indices (PR #782)
  • Add caches to bulk_get_push_rules* (PR #804)
  • Cache get_event_reference_hashes (PR #806)
  • Add get_users_with_read_receipts_in_room cache (PR #809)
  • Use state to calculate get_users_in_room (PR #811)
  • Load push rules in storage layer so that they get cached (PR #825)
  • Make get_joined_hosts_for_room use get_users_in_room (PR #828)
  • Poke notifier on next reactor tick (PR #829)
  • Change CacheMetrics to be quicker (PR #830)

Changes in synapse v0.15.0-rc1 (2016-04-26)

Features:

  • Add login support for Javascript Web Tokens, thanks to Niklas Riekenbrauck (PR #671,#687)
  • Add URL previewing support (PR #688)
  • Add login support for LDAP, thanks to Christoph Witzany (PR #701)
  • Add GET endpoint for pushers (PR #716)
Changes:
  • Never notify for member events (PR #667)
  • Deduplicate identical /sync requests (PR #668)
  • Require user to have left room to forget room (PR #673)
  • Use DNS cache if within TTL (PR #677)
  • Let users see their own leave events (PR #699)
  • Deduplicate membership changes (PR #700)
  • Increase performance of pusher code (PR #705)
  • Respond with error status 504 if failed to talk to remote server (PR #731)
  • Increase search performance on postgres (PR #745)
Bug fixes:
  • Fix bug where disabling all notifications still resulted in push (PR #678)
  • Fix bug where users couldn't reject remote invites if remote refused (PR #691)
  • Fix bug where synapse attempted to backfill from itself (PR #693)
  • Fix bug where profile information was not correctly added when joining remote rooms (PR #703)
  • Fix bug where register API required incorrect key name for AS registration (PR #727)