# MSC4120: Allow `HEAD` on `/download`
Most servers have a media upload size limit in place which is also applied to remote downloads,
ideally preventing "excessively" large media from transiting through the server. Unfortunately, the
most reliable way to find out whether remote media is too large is to start downloading it.

Many HTTP client libraries support reading headers before the remaining response body, though this is
time consuming and prone to issues. Some libraries do not offer the functionality at all, and require
the body to be processed by the caller. Other libraries buffer the response body while the caller
determines if it should continue with the request, though this buffer is typically minimal.

To avoid this exact issue, HTTP has the [`HEAD`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/HEAD)
request method which acts like a `GET` request, except the response body is omitted. This proposal
introduces `HEAD` as a legal method for download requests, allowing a requesting server to decide
whether to download the entire file in a subsequent request.

## Proposal
`HEAD` becomes a legal request method on the following endpoints:

* [`/_matrix/media/v3/download/:serverName/:mediaId`](https://spec.matrix.org/v1.9/client-server-api/#get_matrixmediav3downloadservernamemediaid)
* [`/_matrix/media/v3/download/:serverName/:mediaId/:fileName`](https://spec.matrix.org/v1.9/client-server-api/#get_matrixmediav3downloadservernamemediaidfilename)

`HEAD` behaves as described by the HTTP specification.
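
As a rough illustration of what a caller might send, the sketch below builds a `HEAD` request for the
first endpoint using Python's standard `urllib`. The helper name `build_head_request` and the
`matrix.example.org` base URL are hypothetical and not part of this proposal, and the way path
segments are escaped here is an assumption rather than something the proposal specifies.

```python
from urllib.parse import quote
from urllib.request import Request


def build_head_request(base_url: str, server_name: str, media_id: str) -> Request:
    """Build (but do not send) a HEAD request for a piece of remote media.

    `base_url` is the address the remote server is reachable at; resolving it
    (server discovery, delegation) is out of scope for this sketch.
    """
    path = "/_matrix/media/v3/download/{}/{}".format(
        quote(server_name, safe=":"),  # assumption: keep ":" so host:port stays readable
        quote(media_id, safe=""),
    )
    # method="HEAD" asks for the same headers a GET would return, without a body.
    return Request(base_url.rstrip("/") + path, method="HEAD")


req = build_head_request("https://matrix.example.org/", "example.org", "abc123")
print(req.get_method(), req.full_url)
```
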

Servers which do not support the `HEAD` method on these endpoints respond with an HTTP 405 status and
the `M_UNRECOGNIZED` error code, as per the [common error codes spec](https://spec.matrix.org/v1.9/client-server-api/#common-error-codes).
In this case, requesting servers will likely have to take a risk and call the `GET` endpoint without
knowing how much data there is to download.
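
The decision a requesting server makes after `HEAD` can be sketched as a small pure function. This is
a minimal illustration, not a reference implementation: the names `Decision` and `decide_after_head`
are hypothetical, and treating every non-2xx status the same way as a 405 (falling back to `GET`
with an unknown size) is an assumption of this sketch.

```python
from enum import Enum
from typing import Mapping, Optional


class Decision(Enum):
    SKIP = "skip"                          # advertised size exceeds our limit
    GET_EXPECTED_OK = "get_expected_ok"    # advertised size is within our limit
    GET_UNKNOWN_SIZE = "get_unknown_size"  # no usable size; GET is a risk


def parse_content_length(headers: Mapping[str, str]) -> Optional[int]:
    """Return Content-Length as an int, or None if absent or malformed."""
    for name, value in headers.items():
        if name.lower() == "content-length":
            value = value.strip()
            if value.isascii() and value.isdigit():
                return int(value)
            return None
    return None


def decide_after_head(status: int, headers: Mapping[str, str], max_bytes: int) -> Decision:
    """Decide what to do next, given the response to a HEAD request."""
    if not 200 <= status < 300:
        # 405 M_UNRECOGNIZED means the remote does not support HEAD. Other
        # failures tell us nothing about size either (assumption: let the
        # subsequent GET surface the real error).
        return Decision.GET_UNKNOWN_SIZE
    size = parse_content_length(headers)
    if size is None:
        return Decision.GET_UNKNOWN_SIZE
    return Decision.SKIP if size > max_bytes else Decision.GET_EXPECTED_OK
```

Both `GET_*` outcomes should still enforce the size limit while reading the body, since the
advertised size is only a hint.
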

In the future, when the media download endpoint is split into client and federation versions, like in
[MSC3916](https://github.com/matrix-org/matrix-spec-proposals/pull/3916), it is suggested that *both*
APIs get the same `HEAD` method support. This will allow clients to check cache headers, and still
provide servers with information about file size.

**Note**: `HEAD` is not supported on `/thumbnail` as the thumbnail may be generated at the time of
request and have unknown size. `/download` does not typically have this issue, unless a form of streaming
file transfer is used, like [MSC4016](https://github.com/matrix-org/matrix-spec-proposals/pull/4016).

## Potential issues
Adding a round trip to the already-expensive download sequence isn't great and may be an over-optimization
for what is usually a rare problem.
## Alternatives
As mentioned in the introduction, requesting servers could abort their request after receiving headers
and possibly part of the body. This may be difficult to do with some libraries/languages, and can still
result in higher-than-ideal bandwidth usage.
## Security considerations
Servers should note that while the HTTP spec [suggests](https://www.rfc-editor.org/rfc/rfc9110.html#name-head)
that the response to a `HEAD` request carry the same headers as the response to a `GET` request, a
`HEAD` response may still lack useful headers like `Content-Length`. Additionally, a malicious server
*could* lie about the download size on `HEAD` and return a larger file on `GET`. Servers should
continue to limit `GET` requests as best they can to stay within their size limits and bandwidth
requirements, particularly when the `HEAD` response doesn't contain a `Content-Length` header.
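
One way to keep limiting `GET` regardless of what `HEAD` reported is to count bytes while reading the
body and abort once the limit is passed. The sketch below shows the idea against any file-like
object; `read_capped` and `MediaTooLargeError` are hypothetical names, and a real server would
likely stream to disk rather than hold the body in memory.

```python
import io
from typing import BinaryIO


class MediaTooLargeError(Exception):
    """Raised when a response body exceeds the configured limit."""


def read_capped(body: BinaryIO, max_bytes: int, chunk_size: int = 64 * 1024) -> bytes:
    """Read `body` to the end, failing as soon as more than `max_bytes` arrive.

    The limit is enforced on bytes actually received, so a remote that lied
    about (or omitted) Content-Length on HEAD cannot push us past it by more
    than one chunk.
    """
    chunks = []
    total = 0
    while True:
        chunk = body.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise MediaTooLargeError(f"body exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


# Usage with an in-memory stand-in for a response body:
data = read_capped(io.BytesIO(b"x" * 10), max_bytes=10)
print(len(data))
```
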
## Unstable prefix
This proposal *could* have an unstable prefix by versioning the endpoints themselves, however as the
HTTP feature is well defined and no servers appear to be using `HEAD` requests currently, this proposal
does not include an unstable prefix. Servers should implement `HEAD` as described by the HTTP specification,
but only call other servers with `HEAD` if in an experimental or unstable mode of operation. For example,
if the Synapse configuration has the `HEAD` feature flag *disabled* then no `HEAD` request should be
generated by that Synapse instance.
## Dependencies
This proposal doesn't work very well without [MSC4138](https://github.com/matrix-org/matrix-spec-proposals/pull/4138).