# MSC2191: Markup for mathematical messages

Some people write using an odd language that has strange symbols. No, I'm not
talking about computer programmers; I'm talking about mathematicians. In order
to aid these people in communicating, Matrix should define a standard way of
including mathematical notation in messages.

This proposal presents a format using LaTeX, in contrast with a [previous
proposal](https://github.com/matrix-org/matrix-doc/pull/1722/) that used
MathML.

See also:

- https://github.com/vector-im/riot-web/issues/1945


## Proposal

A new attribute `data-mx-maths` will be added for use in `<span>` or `<div>`
elements. Its value will be mathematical notation in LaTeX format. `<span>`
is used for inline math, and `<div>` for display math. The contents of the
`<span>` or `<div>` will be a fallback representation of the desired notation
for clients that do not support mathematical display, or that are unable to
render the entire `data-mx-maths` attribute. The fallback representation is
left up to the sending client and could be, for example, an image, an HTML
approximation, or the raw LaTeX source. When using an image as a fallback, the
sending client should be aware of issues that may arise from the receiving
client using a different background colour.

Example (with line breaks and indentation added to `formatted_body` for clarity):

```json
{
    "content": {
        "body": "This is an equation: sin(x)=a/b",
        "format": "org.matrix.custom.html",
        "formatted_body": "This is an equation:
            <span data-mx-maths=\"\\sin(x)=\\frac{a}{b}\">
                sin(<i>x</i>)=<sup><i>a</i></sup>/<sub><i>b</i></sub>
            </span>",
        "msgtype": "m.text"
    },
    "event_id": "$eventid:example.com",
    "origin_server_ts": 1234567890,
    "sender": "@alice:example.com",
    "type": "m.room.message",
    "room_id": "!someroom:example.com"
}
```


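As an illustration of the sending side, the following Python sketch builds the
`formatted_body` for a message like the one above. The helper name
`maths_html` is hypothetical and not part of this proposal, and the sketch
assumes the simplest fallback mentioned above, the raw LaTeX source.

```python
import html
import json


def maths_html(latex: str, fallback_html: str, display: bool = False) -> str:
    """Wrap LaTeX source in the element described by this proposal.

    `fallback_html` is inserted as-is, so the caller must pass HTML that is
    already safe to embed (or plain text that has already been escaped).
    """
    tag = "div" if display else "span"
    # The LaTeX lives in an attribute value, so escape &, <, > and quotes.
    attr = html.escape(latex, quote=True)
    return f'<{tag} data-mx-maths="{attr}">{fallback_html}</{tag}>'


latex = r"\sin(x)=\frac{a}{b}"
# Simplest fallback: the raw LaTeX source, escaped for use as HTML text.
fallback = html.escape(latex, quote=False)

content = {
    "msgtype": "m.text",
    "body": "This is an equation: sin(x)=a/b",
    "format": "org.matrix.custom.html",
    "formatted_body": "This is an equation: " + maths_html(latex, fallback),
}
print(json.dumps(content, indent=4))
```

Serialising with `json.dumps` takes care of the backslash and quote escaping
that appears in the JSON example, so the LaTeX only needs HTML escaping here.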
## Other solutions

[MSC1722](https://github.com/matrix-org/matrix-doc/pull/1722/) proposes using
MathML as the format for transporting mathematical notation. It also summarizes
some other solutions in its "Other Solutions" section.

In comparison with MathML, LaTeX has several advantages and disadvantages.

The first advantage, which is quite obvious, is that LaTeX is much less verbose
and more readable than MathML. In many cases, the LaTeX code is a suitable
fallback for the rendered notation.

The second advantage is that LaTeX is a suitable input method for many people,
and so converting from a user's input to the message format would be a no-op.

However, balanced against these advantages, LaTeX has several disadvantages as
a message format. Some of these are covered in the "Potential issues" and
"Security considerations" sections.


## Potential issues

### "LaTeX" as a format is poorly defined

There are several extensions to LaTeX that are commonly used, such as
AMS-LaTeX. It is unclear which extensions should be supported, and which
should not be supported. Different LaTeX-rendering libraries support different
sets of commands.

This proposal suggests that the receiving client should render the LaTeX
version if possible, but if it contains unsupported commands, then it should
display the fallback. Thus, it is up to the receiving client to decide what
commands it will support, rather than dictating what commands must be
supported. This comes at a cost of possible inconsistency between clients, but
is somewhat mitigated by the use of a fallback. Clients should, however, aim
to support, at minimum, the basic LaTeX2e maths commands and the TeX maths
commands, with the possible exception of commands that could be security risks
(see below).

To improve compatibility, the sender's client may warn the sender if they are
using a command that comes from another package, such as AMS-LaTeX.

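
The receiving-side behaviour described above (render the LaTeX if possible,
otherwise display the fallback) could be sketched as follows in Python. The
names `MathsNode`, `MathsExtractor` and `choose` are hypothetical, the
`can_render` predicate stands in for whatever check the client's rendering
library offers, and for brevity the sketch keeps only the text of the fallback
rather than its HTML.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import Callable, List, Optional


@dataclass
class MathsNode:
    display: bool        # True for <div> (display maths), False for <span>
    latex: str           # value of the data-mx-maths attribute
    fallback: str = ""   # text content of the element (inner tags dropped)


class MathsExtractor(HTMLParser):
    """Collect every <span>/<div> that carries a data-mx-maths attribute."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.nodes: List[MathsNode] = []
        self._current: Optional[MathsNode] = None
        self._tag = ""    # tag name of the element being collected
        self._depth = 0   # same-named tags currently open inside it

    def handle_starttag(self, tag, attrs):
        if self._current is not None:
            if tag == self._tag:
                self._depth += 1
            return
        latex = dict(attrs).get("data-mx-maths")
        if tag in ("span", "div") and latex is not None:
            self._current = MathsNode(tag == "div", latex)
            self._tag, self._depth = tag, 0

    def handle_data(self, data):
        if self._current is not None:
            self._current.fallback += data

    def handle_endtag(self, tag):
        if self._current is None or tag != self._tag:
            return
        if self._depth:
            self._depth -= 1
        else:
            self._current.fallback = self._current.fallback.strip()
            self.nodes.append(self._current)
            self._current = None


def choose(node: MathsNode, can_render: Callable[[str], bool]) -> str:
    """Return the LaTeX if the client can render it, else the fallback."""
    return node.latex if can_render(node.latex) else node.fallback


parser = MathsExtractor()
parser.feed(r'This is an equation: <span data-mx-maths="\sin(x)=\frac{a}{b}">'
            r'sin(<i>x</i>)=<sup><i>a</i></sup>/<sub><i>b</i></sub></span>')
parser.close()
node = parser.nodes[0]
# Toy predicate: pretend this client cannot render \frac.
print(choose(node, lambda src: r"\frac" not in src))
```

Passing the predicate in keeps the decision about which commands are supported
with the receiving client, as this proposal intends.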
### Lack of libraries for displaying mathematics

See the corresponding section in [MSC1722](https://github.com/matrix-org/matrix-spec-proposals/pull/1722/files#diff-4a271297299040dbfa622bfc6d2aab02f9bc82be0b28b2a92ce30b14c5621f94R148-R164).


## Security considerations

LaTeX is a [Turing-complete programming
language](https://web.archive.org/web/20160110102145/http://en.literateprograms.org/Turing_machine_simulator_%28LaTeX%29);
it is possible to write a LaTeX document that contains an infinite loop, or
that will require large amounts of memory. While it may be fun to write a
[LaTeX file that can control a Mars
Rover](https://wiki.haskell.org/wikiupload/8/85/TMR-Issue13.pdf#chapter.2), it
is not desirable for a mathematical formula embedded in a Matrix message to
control a Mars Rover. Clients should take precautions when rendering LaTeX.
Clients that use a rendering library should only use one that can process the
LaTeX safely.

Clients should not render mathematics by calling the `latex` executable without
proper sandboxing, as the `latex` executable was not written to handle
untrusted input. (See, for example, <https://hovav.net/ucsd/dist/texhack.pdf>,
<https://0day.work/hacking-with-latex/>, and
<https://hovav.net/ucsd/dist/tex-login.pdf>.) Some LaTeX rendering libraries
are better suited for processing untrusted input.

Certain commands, such as [those that can create
macros](https://katex.org/docs/supported#macros), are potentially dangerous;
clients should either decline to process those commands, or should take care to
ensure that they are handled in safe ways (such as by limiting recursion). In
general, LaTeX commands should be filtered by allowing known-good commands
rather than forbidding known-bad commands. Some LaTeX libraries may have
options for doing this.

In general, LaTeX places a heavy burden on client authors to ensure that it is
processed safely. Some LaTeX rendering libraries provide security advice, for
example, <https://github.com/KaTeX/KaTeX/blob/main/docs/security.md>.


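
As a rough illustration of filtering by known-good commands, a client might
pre-screen the `data-mx-maths` value before handing it to a renderer. The
Python sketch below is not a complete defence: the allow-list contents, the
size limits, and the name `is_safe` are all assumptions made for the example,
and a real client would still need a rendering library that handles untrusted
input safely.

```python
import re

# Hypothetical allow-list; a real client would derive it from the commands
# its rendering library is known to handle safely.
ALLOWED_COMMANDS = frozenset({
    "sin", "cos", "frac", "sqrt", "alpha", "beta", "sum", "int",
    "{", "}", ",",   # control symbols: \{  \}  \,
})
MAX_LENGTH = 2000    # arbitrary cap on source length
MAX_DEPTH = 20       # arbitrary cap on brace nesting

# A control word (\frac) or a control symbol (\{ or \,).
COMMAND_RE = re.compile(r"\\([A-Za-z]+|.)", re.DOTALL)


def is_safe(latex: str) -> bool:
    """Accept only short, shallow input built from allow-listed commands."""
    if len(latex) > MAX_LENGTH:
        return False
    # Allow known-good commands; anything else (e.g. \def) is rejected.
    if any(m.group(1) not in ALLOWED_COMMANDS
           for m in COMMAND_RE.finditer(latex)):
        return False
    rest = COMMAND_RE.sub("", latex)   # drop commands, including \{ and \}
    if "\\" in rest:                   # dangling backslash at end of input
        return False
    depth = 0
    for ch in rest:
        if ch == "{":
            depth += 1
            if depth > MAX_DEPTH:
                return False
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0                  # braces must be balanced


print(is_safe(r"\sin(x)=\frac{a}{b}"))   # allow-listed commands only
print(is_safe(r"\def\x{\x\x}\x"))        # macro definition is not allow-listed
```

Input that fails the check would be shown using the fallback representation
rather than being rendered.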
## Conclusion

Math(s) is hard, but LaTeX makes it easier to write mathematical notation.
However, using LaTeX as a format for including mathematics in Matrix messages
has some serious downsides. Nevertheless, if clients handle the LaTeX
carefully, or rely on the fallback representation, the concerns can be
addressed.