Proposal for storing an encrypted recovery key on the server to aid recovery of megolm key backups
Problem
MSC1219 proposes an API for optionally storing encrypted megolm keys on your homeserver, so if a user loses all their devices, they can still recover their history. The megolm keys are public-key encrypted using a private Curve25519 key that only the end-user has.
However, there are usability concerns about users having to store their Curve25519 recovery private key in a secure manner. Casual users are likely to be scared away by having to file away a relatively long (e.g. 10 word) generated recovery key.
We would like to give the user the option to access their key backup using a passphrase in addition to their recovery key. We can take inspiration from Apple’s FileVault 2, where Apple store encrypted copies of your FileVault AES key on your hard disk, encrypted by your UNIX account password, or from the practice of keeping a passphrased SSH private key on a server for convenience.
Proposed solution
Three solutions are given here (two of which are viable, one included for completeness), varying in the implications of the user changing their passphrase.
Option 1 has been chosen, on the basis that we do not require the user to be able to change their passphrase without also changing their recovery key.
Recovery Key
In all options below, the process for generating a recovery key from a byte string, b, is as follows:
- Prepend the two bytes 0x8B, 0x01 to the byte string b
- Compute a parity byte by XORing all bytes of the resulting string (i.e. prefix + byte string)
- Append the parity byte to the prefix + b
- base58 encode the resulting byte string with alphabet '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'.
- Format the resulting ASCII string into groups of 4 characters separated by spaces.
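The steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, and mapping leading zero bytes to '1' follows the common Bitcoin-style base58 convention, which the text does not specify (it never triggers here, since the prefix byte 0x8B is non-zero).

```python
BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
PREFIX = bytes([0x8B, 0x01])


def base58_encode(data: bytes) -> str:
    """Encode bytes as base58, treating them as a big-endian integer."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = BASE58_ALPHABET[rem] + out
    # Assumption: each leading zero byte becomes one '1' (Bitcoin-style).
    pad = len(data) - len(data.lstrip(b"\x00"))
    return BASE58_ALPHABET[0] * pad + out


def encode_recovery_key(b: bytes) -> str:
    """Turn the byte string b into a human-readable recovery key."""
    payload = PREFIX + b
    # Parity byte: XOR of every byte in prefix + b.
    parity = 0
    for byte in payload:
        parity ^= byte
    encoded = base58_encode(payload + bytes([parity]))
    # Groups of 4 characters separated by spaces.
    return " ".join(encoded[i:i + 4] for i in range(0, len(encoded), 4))
```

A client decoding a recovery key would reverse these steps and reject the key if the XOR of all decoded bytes, including the parity byte, is not zero.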
Option 1
The user provides a passphrase, P. The client generates the backup encryption private key, K-1, by running PBKDF on this passphrase. The PBKDF parameters (salt and iteration count) are stored in the auth_data of the key backup under the 'private_key_salt' and 'private_key_iterations' keys, respectively:
{
[...]
"private_key_salt": "MmMsAlty",
"private_key_iterations": 100000
}
The backup public encryption key, K, is determined by running the curve25519 function on K-1 with basepoint {9}. The recovery key is then generated by encoding K-1 as above.
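As a rough sketch of the derivation step, the following uses Python's standard library. The proposal only says "PBKDF", so the choice of PBKDF2 with HMAC-SHA-512, the 32-byte output length, the UTF-8 encoding of passphrase and salt, and the function name are all assumptions made for illustration. Computing the public key K from K-1 requires a Curve25519 implementation and is left out here.

```python
import hashlib


def derive_backup_private_key(passphrase: str, salt: str, iterations: int) -> bytes:
    """Derive the 32-byte backup private key K-1 from a passphrase.

    Assumption: PBKDF2-HMAC-SHA512 with UTF-8 encoded inputs; the
    proposal does not name the hash or the encoding.
    """
    return hashlib.pbkdf2_hmac(
        "sha512",
        passphrase.encode("utf-8"),
        salt.encode("utf-8"),
        iterations,
        dklen=32,
    )


# Parameters as they would appear in the backup's auth_data.
auth_data = {"private_key_salt": "MmMsAlty", "private_key_iterations": 100000}

k_priv = derive_backup_private_key(
    "correct horse battery staple",  # example passphrase P
    auth_data["private_key_salt"],
    auth_data["private_key_iterations"],
)
```

Because the salt and iteration count live in auth_data, any client that knows the passphrase can repeat this derivation and arrive at the same K-1.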
To change the passphrase, a client creates a completely new backup version, performing the steps above with the new passphrase. The client then re-encrypts all session keys and uploads them to the new backup. The user will always get a new recovery key whenever they change their passphrase.
In this option, the recovery key is generated directly from the passphrase using PBKDF. This means the ciphertext of the backed up keys is more vulnerable to dictionary attacks. Option 2b attempts to offer a mitigation against this.
Option 2a
The backup encryption private key, K-1, is generated by a secure random number generator. A private key, K-1p, is generated by running PBKDF on the passphrase. K-1p' is generated by XORing K-1 with K-1p.
K-1p' is stored on the server along with the key backup in the private_key object above. The recovery key is generated by encoding K-1 as above.
To change the passphrase, the client generates the new K-1p from the new passphrase then computes a new K-1p'. It then updates the backup information with this new K-1p'.
This would require the API to support updating the metadata stored with a backup (or the key parameters to be stored elsewhere, eg. in account data).
This option, however, allows the server to obtain K-1 by obtaining any one of the user's previous passphrases, assuming it keeps copies of the previous versions of the key parameters. This option is therefore not viable, but included for completeness.
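The weakness can be illustrated with a short sketch. The helper names, salts and passphrases are hypothetical, and PBKDF2-HMAC-SHA512 is assumed as the PBKDF. Since K-1 stays the same across passphrase changes, an old K-1p' retained by the server still unwraps to the current K-1 for anyone who learns the old passphrase.

```python
import hashlib
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def passphrase_key(passphrase: str, salt: bytes, iterations: int = 100000) -> bytes:
    """Derive K-1p from a passphrase (assumed PBKDF2-HMAC-SHA512, 32 bytes)."""
    return hashlib.pbkdf2_hmac(
        "sha512", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


# K-1 is random and stays fixed across passphrase changes in option 2a.
k_priv = secrets.token_bytes(32)

# K-1p' under the original passphrase, then under a new one.
old_blob = xor_bytes(k_priv, passphrase_key("old passphrase", b"salt-1"))
new_blob = xor_bytes(k_priv, passphrase_key("new passphrase", b"salt-2"))

# A server that kept old_blob and later learns the old passphrase
# recovers the current K-1, despite the passphrase change.
recovered = xor_bytes(old_blob, passphrase_key("old passphrase", b"salt-1"))
```

Here `recovered` equals `k_priv`, which is the reason the text treats option 2a as not viable.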
Option 2b
A variant on option 2a is to regenerate K-1 when the passphrase is changed, meaning the recovery key does change when the passphrase is changed. This makes it identical feature-wise to option 1, without the problem of any previous passphrase being sufficient to obtain K-1. It differs, however, in that K-1 is generated randomly and is therefore not itself vulnerable to dictionary attacks. However, K-1p is still vulnerable to dictionary attacks and is stored in the same place with the same protection, and, if compromised, gives access to K-1. This option therefore offers no significant security benefit over option 1.
Option 3
The backup encryption private key, K-1, and a private, passphrase-derived key, K-1p, are generated as above. The passphrase key counterpart, K-1p', is also generated as above from K-1 XOR K-1p. Another private key, K-1r, is also generated by a secure random number generator and encoded to give the recovery key as above. K-1r' is generated by XORing K-1r with K-1. Both K-1p' and K-1r' are stored in the private_key in the backup under the keys passphrase_counterpart and recovery_key_counterpart respectively.
To change the passphrase, the client starts a new backup version as in option 1 (generating a new K-1), but additionally computes a new K-1r' by XORing K-1r with the new K-1. This refreshes all keys, but allows the user to keep the same recovery key for their backup, on the assumption that the recovery key itself has not been compromised. If it has, the client generates a new backup with a completely fresh recovery key instead.
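A minimal sketch of this scheme follows. The function name `new_backup_version` is hypothetical, the counterparts are kept as raw bytes (a real client would need some text encoding to store them in JSON), and the passphrase-derived key K-1p is stood in for by random bytes to keep the example short.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def new_backup_version(k_passphrase: bytes, k_recovery: bytes):
    """Create a fresh K-1 and wrap it under both K-1p and K-1r."""
    k_priv = secrets.token_bytes(32)  # fresh K-1 for this backup version
    private_key = {
        "passphrase_counterpart": xor_bytes(k_priv, k_passphrase),   # K-1p'
        "recovery_key_counterpart": xor_bytes(k_priv, k_recovery),   # K-1r'
    }
    return k_priv, private_key


# K-1r is generated once and encoded as the user's recovery key.
k_recovery = secrets.token_bytes(32)

# Initial backup, then a passphrase change that reuses the same K-1r.
k_priv_v1, meta_v1 = new_backup_version(secrets.token_bytes(32), k_recovery)
k_priv_v2, meta_v2 = new_backup_version(secrets.token_bytes(32), k_recovery)

# The unchanged recovery key unwraps the K-1 of whichever version it is applied to.
unwrapped_v2 = xor_bytes(meta_v2["recovery_key_counterpart"], k_recovery)
```

After the passphrase change, K-1 and both counterparts are new, but the recovery key the user wrote down still opens the current backup version.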
Security considerations
The proposal above is vulnerable to a malicious server admin performing a dictionary attack against the encrypted passphrases stored on their server to access history. (It's worth bearing in mind that the server admin can also always hijack their users' accounts; the thing stopping them from impersonating their users is E2E device verification.)
Possible extensions
In future, we could consider supporting authenticating users for login based on their encrypted passphrase, meaning that users only have to remember one password for their Matrix account rather than a login password and a history-access passphrase. However, this of course exposes the user's whole E2E history to the risk of dictionary attacks by public attackers (i.e. not just server admins), keysniffer-at-login attacks or clients which are lazy about storing account passwords securely. There's also a risk that, because login passwords are entered much more often than history passwords, this might encourage users to choose a weaker password. It's unclear whether this reduction in security-in-depth is worth the UX benefits of a single master password, so we suggest checking how this proposal goes first (given that in general we expect key recovery to happen by cross-verifying devices at login rather than by entering a recovery key or passphrase).
See also:
Notes from discussing this IRL are at https://docs.google.com/document/d/11fF1rbX5eTkrfxXRS8UhpW5sBENOCydYlLWzB8X1IuU/edit