MSC4161: Crypto terminology for non-technical users

Background

Matrix makes use of advanced cryptographic techniques to provide secure messaging. These techniques often involve precise and detailed language that is unfamiliar to non-technical users.

This document provides a list of concepts and explanations that are intended to be suitable for use in Matrix clients that are aimed at non-technical users.

Ideally, encryption in Matrix should be entirely invisible to end-users (much as WhatsApp or Signal users are not exposed to encryption specifics). This initiative is referred to as "Invisible Cryptography" and is tracked as:

  • MSC4153 - Exclude non-cross-signed devices,
  • MSC4048 - Authenticated key backup,
  • MSC4147 - Including device keys with Olm-encrypted events, and
  • MSC4161 - this document

Why is this important?

Use of common terminology should help further these goals:

  • to reduce confusion: many members of the community are confused by the crypto features in Matrix clients, and the profusion of different words for the same thing makes it much worse. By reducing the number of words, and carefully choosing good words, we hope to develop a common language which makes Matrix easier to understand, and easier to explain.

  • to ease migration: one of the key features of Matrix for end-users is the choice of clients, meaning no-one is locked in to a particular piece of software. If each client uses conflicting terminology, it becomes much more difficult to move to a different client, which works against the user's ability to migrate.

This proposal uses "SHOULD" language rather than "MUST", because there are many good reasons why a particular client might choose different wording. In particular, different clients may have very different audiences who communicate in different ways and understand different metaphors. This proposal hopes to nudge client developers towards consistency, but never at the cost of their unique relationship with their users.

Outcomes

We hope that Matrix client developers will like the terms and wording we provide, and adapt their user interfaces and documentation to use them. (If this MSC is accepted, Element will use it as a reference for English wording in its clients.)

Where concepts and terms exactly match existing terms in the Matrix spec, we propose changing the spec to use the terms from this document. Where they do not match, we are very comfortable with different words being used in the spec, given it is a highly technical document, as opposed to a client user interface.

We hope that this MSC will:

  • Cause small changes in the spec (as described in the previous paragraph), and
  • Become an appendix in the spec, with a description that makes clear that the intended audience is different from most of the spec, meaning different words are used from the main spec body.

Clients may, of course, choose to use different language. Some clients may deliberately choose to use more technical language, to suit the profiles of their users. This document is aimed at clients targeting non-technical users.

Where changes are made in the spec, we suggest that notes be added mentioning the old name, as in this example.

Proposal

When communicating about cryptography with non-technical users, we propose using the following terms and concepts.

When referring to concepts outlined in this document in their user interface, clients SHOULD use the language specified, except where their own users are known to understand different terms more easily. When making such exceptions, clients SHOULD document how they deviate from this document, and why.

Devices

Instances of a client are called 'devices' (not 'sessions'). In line with MSC4153, we assume that all devices have been cross-signed by the user who owns them, and we simply call these devices.

Devices which have not been cross-signed by the user are considered an error state, primarily to be encountered during the transition to MSC4153 and/or due to buggy/incomplete/outdated clients. These devices are referred to as not secure or insecure and their existence is considered a serious and dangerous error condition, similar to an invalid TLS certificate.

"This device is not secure. Please verify it to continue."

"Ignoring 5 messages that were sent from a device that is not secure."

"Confirm it's you" (when asking to verify a device during login)

⚠️ Avoid saying "secure device". All devices are considered secure by default; the user doesn't typically need to worry about the fact that insecure devices are a thing, given they should only ever occur in error (or transitional) scenarios.

⚠️ Avoid saying "trusted device" or "verified device". Devices are not users, and it is helpful to use different language for users vs. devices. (However, we do use the verb "verify" to describe how to make a device secure. By using the same verb, we help users understand the confusing fact that verifying devices and verifying users are similar processes, but with different outcomes.)

⚠️ Avoid using "cross-signing", which requires a deeper knowledge of cryptography to understand.

⚠️ Avoid mentioning "device keys" - a device is just secure or not.

⚠️ Avoid "session" to mean device. Device better describes what most users encounter, and is more commonly used in other products.
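As a rough illustration of the wording rules in this section, a client might choose its device-related strings along the following lines. This is only a sketch: the function `device_notice` and its `cross_signed` parameter are hypothetical names invented for this example, not part of any Matrix SDK.

```python
from typing import Optional


def device_notice(cross_signed: bool) -> Optional[str]:
    """Return the notice to show for a device, or None if nothing is needed.

    `cross_signed` is a hypothetical flag: True when the device has been
    cross-signed by the user who owns it.
    """
    if cross_signed:
        # The normal case: this is simply a "device". We do not label it
        # "secure", "trusted" or "verified".
        return None
    # Error or transitional state: call the device "not secure" and use the
    # verb "verify" to describe how to fix it.
    return "This device is not secure. Please verify it to continue."
```

Returning nothing for the normal case reflects the guidance that users should not need to think about device security unless something has gone wrong.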

Verified user

When you verify a user they become verified. This means that you have cryptographic proof that no-one is listening in on your conversations. (You need this if you suspect someone in a room may be using a malicious homeserver.)

In many contexts, most users are not verified: verification is a manual step (scanning a QR code or comparing emojis). (In future, verification will probably become more common thanks to MSC2882 Transitive Trust or something similar.) When an unverified user resets their identity, we should warn the people they communicate with, so they are aware of the change.

If Alice is verified with Bob, and then Alice's identity changes (i.e. Alice resets their master cross-signing key) then this is very important to Bob: Bob verified Alice because they care about proof that no-one is listening, and now someone could be. Bob can choose to withdraw verification (i.e. "demote" Alice from being verified), or re-verify with Alice. Until Bob does one or the other, Bob's communication with Alice should contain a prominent and serious warning that Alice's verified identity has changed.

"This user is verified."

"WARNING: Bob's verified identity has changed!"

"You verified this user's identity, but it has changed. Please choose to re-verify them or withdraw verification."

⚠️ Avoid using "cross-signing", which requires a deeper knowledge of cryptography to understand.

⚠️ Avoid using "trust on first use (TOFU)", which is a colloquial name for noting the identity of users who are not verified so that we can notify the user if it changes. (This is a kind of "light" form of verification where we assume that the first identity we can see is trusted.)

⚠️ Avoid confusing verification of users with verification of devices: the mechanism is similar but the purpose is different. Devices must be verified to make them secure, but users can optionally be verified to ensure no-one is listening in or tampering with communications.

⚠️ Avoid talking about "mismatch" or "verification mismatch" which is very jargony - it is the identity which is mismatched, not the verification process. Just say "Bob's verified identity has changed".

⚠️ Where possible, avoid talking about "cryptographic identity" which is very jargony. In many contexts, just the word "identity" is sufficient: in its dictionary sense, identity means that someone is who they claim to be, not someone else. The fact we confirm identity cryptographically is usually irrelevant to the user.

Identity

A user's identity is proof of who they are, and, if you have verified them, proof that you have a secure communication channel with them.

When a non-verified user resets their identity: "Warning: Alice's identity has changed."

Longer explanation: This can happen if the user lost all their devices and the recovery key, but it can also be a sign of someone taking over the account. To be sure, please verify their identity by going to their profile.

When a verified user resets their identity: "WARNING: Bob's verified identity has changed!"
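The two warnings above differ only in emphasis and in the word "verified". A minimal sketch of how a client might select between them is shown here; the function name `identity_change_warning` and its parameters are assumptions made for illustration.

```python
def identity_change_warning(display_name: str, was_verified: bool) -> str:
    """Build the warning shown when another user's identity has changed.

    `was_verified` is a hypothetical flag: True if the local user had
    previously verified this user.
    """
    if was_verified:
        # Serious case: the local user verified this person because they
        # care about proof that no-one is listening in.
        return f"WARNING: {display_name}'s verified identity has changed!"
    # Lighter case: the user was never verified, but the change is still
    # worth pointing out.
    return f"Warning: {display_name}'s identity has changed."
```

Neither string mentions "master key", "encryption" or a "mismatch", in keeping with the terms this document asks clients to avoid.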

(During login, at the "Confirm it's you" stage):

"If you don't have any other device and you have lost your recovery key, you can create a new identity. (Warning: you will lose access to your old messages!)" button text (in red or similar): "Reset my identity"

⚠️ Avoid saying "master key" - this is an implementation detail.

⚠️ Avoid saying "Alice reset their encryption" - the reason that Alice's identity changed could be due to attack rather than because they reset their encryption (plus "encryption" is jargony).

Message key

A message key is used to decrypt a message. The metaphor is that messages are "locked in a box" by encrypting them, and "unlocked" by decrypting them.

"Store message keys on the server."

⚠️ Avoid saying "key" without a previous word saying what type of key it is.

⚠️ Avoid using "room key". These keys are used to decrypt messages, not rooms.

Note: this clashes with the term "message key" in the double ratchet. Since the double ratchet algorithm is for a very different audience, we think that this is not a problem.

Unable to decrypt

When we have an encrypted message but no message key to decrypt it, we are unable to decrypt it.

When we expect the key to arrive, we are waiting for this message.

"Waiting for this message" with a button: "learn more" that explains that the message key for this message has not yet been received, but that we expect it to arrive shortly. Further detail may be provided, for instance explaining that connectivity issues between the sender's homeserver and our own can cause key delivery delays.

When the user does not have the message key for a permanent and well-understood reason, for example if it was sent before they joined the room, we say you don't have access to this message.

"You don't have access to this message" e.g. if it was sent before the user entered the room, or the user does not have key storage set up.
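The distinction between a key that is expected to arrive and a key that is permanently unavailable could be modelled as follows. This is a sketch under assumptions: the `MissingKeyReason` enum and the function `undecryptable_message_text` are hypothetical, and a real client would need its own way of deciding which reason applies.

```python
from enum import Enum, auto


class MissingKeyReason(Enum):
    """Hypothetical reasons why a message key is not available."""

    EXPECTED_SOON = auto()           # the key has not arrived yet, but should
    SENT_BEFORE_JOIN = auto()        # message predates the user joining the room
    KEY_STORAGE_NOT_SET_UP = auto()  # the user has no key storage to fetch it from


def undecryptable_message_text(reason: MissingKeyReason) -> str:
    """Choose the placeholder text for a message we are unable to decrypt."""
    if reason is MissingKeyReason.EXPECTED_SOON:
        # Temporary: pair this with a "learn more" button explaining the delay.
        return "Waiting for this message"
    # Permanent and well-understood reasons share one message.
    return "You don't have access to this message"
```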

Message history

Your message history is a record of every message you have received or sent, and is particularly used to describe messages that are stored on the server rather than your device(s).

Key storage

Key storage means keeping cryptographic information on the server. This includes the user's identity, and/or the message keys needed to decrypt messages.

If a user loses their recovery key, they may reset their key storage. Unless they have old devices, they will not be able to access old encrypted messages because the message keys are stored in key storage, and their cryptographic identity will change, because it too is stored in key storage.

"Allow key storage"

"Key storage holds your identity on the server along with the keys that allow you to read your message history."

"Message history is unavailable because key storage is disabled."

⚠️ Avoid distinguishing between "secret storage" and "key backup" - these are both part of key storage.

⚠️ Avoid talking about more keys: "the backup key is stored in the secret storage, and this allows us to decrypt the message keys from key backup". Instead, we simply say that both identity and message keys are stored in key storage.

⚠️ Avoid using "key backup" to talk about storing message keys: this is too easily confused with exporting keys or messages to an external system.

⚠️ Avoid "4S" or "quad-S" - these are not descriptive terms.

Recovery key (and recovery passphrase)

A recovery key is a way of regaining access to key storage if the user loses all their devices. Using key storage, they can preserve their cryptographic identity (meaning other users don't see "Alice's identity has changed" messages), and also read old messages using the stored message keys.

A recovery passphrase is an easier-to-remember way of accessing the recovery key and has the same purpose as the recovery key.

Losing the recovery key: if the user loses their recovery key, they can "reset" it, which means re-storing the identity information on the server, encrypted with a new recovery key. If the user has a verified client, then that is holding the identity information locally, so they can reset their recovery key without losing access to key storage. If they don't have a verified client and they lose their recovery key, then they need to reset key storage as well as the recovery key (since the identity information is needed to read from key storage), meaning they lose access to old messages.
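The consequences described above depend on a single question: does the user still have a verified client holding the identity information? The sketch below captures that decision. The `ResetOutcome` type and the function `outcome_of_losing_recovery_key` are hypothetical names used only for illustration, and the "identity changes" outcome follows the Key storage section rather than any particular implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResetOutcome:
    """Hypothetical summary of what the user must do and what they lose."""

    must_reset_key_storage: bool
    identity_changes: bool
    loses_old_messages: bool


def outcome_of_losing_recovery_key(has_verified_client: bool) -> ResetOutcome:
    """Describe what happens when the recovery key is lost."""
    if has_verified_client:
        # The client holds the identity information locally, so only the
        # recovery key needs to be regenerated.
        return ResetOutcome(
            must_reset_key_storage=False,
            identity_changes=False,
            loses_old_messages=False,
        )
    # Nothing can read key storage any more: it must be reset, which creates
    # a new identity and makes old messages unreadable.
    return ResetOutcome(
        must_reset_key_storage=True,
        identity_changes=True,
        loses_old_messages=True,
    )
```

A client could use a result like this to decide whether to show the reassuring "you can generate a new one" wording or the red "Reset my identity" button.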

"Write down your recovery key in a safe place"

"If you lose access to your devices and your recovery key, you will need to reset your key storage, which will create a new identity"

"If you lose your recovery key you can generate a new one if you are signed in elsewhere"

⚠️ Avoid using "security key", "security code", "recovery code", "master key". A recovery key allows "unlocking" the key storage, which is a "box" that is on the server, containing your identity and message keys. It is used to recover the situation if you lose access to your devices. None of these other terms express this concept so clearly.

⚠️ Remember that users may have historically been trained to refer to these concepts as "security key" or "security passphrase", and so user interfaces should provide a way for users to be educated on the terminology change (e.g. a tooltip or help link): e.g. "Your recovery key may also have been referred to as a security key in the past"

⚠️ Be aware that old versions of the spec use "recovery key" to refer to the private half of the backup encryption key, which is different from the usage here. The recovery key described in this section is referred to in the spec as the secret storage key.

Potential issues

Many existing clients use a wide variety of terminology, and many users are familiar with different terms. Nevertheless, we believe that working together to agree on a common language is the only way to address this issue over time.

Further work

Several other concepts might benefit from similar treatment. Within cryptography, "device dehydration" is a prime candidate. Outside cryptography, many other terms could be agreed, including "export chat" (particularly in contrast to "export message keys").

Security considerations

In order for good security practices to work, users need to understand the implications of their actions, so this MSC should be reviewed by security experts to ensure it is not misleading.

Dependencies

None

Credits

Written by Andy Balaam, Aaron Thornburgh and Patrick Maier as part of our work for Element. Richard van der Hoff, Matthew Hodgson and Denis Kasak contributed many improvements before the first draft was published.