matrix.org/static/jira/browse/BOTS-65

69 lines
2.7 KiB
Plaintext

---
summary: MatrixBridge ping timed out and didn't reconnect
---
created: 2015-06-26 10:23:56.0
creator: neb
description: |-
Submitted by @kegan:matrix.org
No evidence in logs of receiving a close event. Should do manual keep-alives.
IRC:
{code}
03:31 * MatrixBridge has quit (Ping timeout: 265 seconds)
{code}
Logs:
{code} (UTC so -1 hour)
Last received I->M:
2015-06-26 02:22:37 INFO:req [kfp2bfi4fk0ww] [I->M] onMessage: irc.freenode.net [...]
{code}
id: '11706'
key: BOTS-65
number: '65'
priority: '1'
project: '10101'
reporter: neb
resolution: '1'
resolutiondate: 2015-06-26 11:58:41.0
status: '5'
type: '1'
updated: 2015-06-26 11:58:41.0
votes: '0'
watches: '2'
workflowId: '11807'
---
actions:
- author: kegan
body: |-
So the bot is automatically responding to {{ping}} requests with {{pong}}. However, there is no mechanism to knife the connection if it hasn't received anything (incl pings) for a while. This should probably be added. Quick test on freenode shows a ping rate of roughly every 2 minutes when no traffic is passing through:
{code}
2015-06-26 09:55:57 DEBUG:irc-server RECV ping => testmatrixbot => wolfe.freenode.net
2015-06-26 09:57:12 DEBUG:irc-server RECV ping => M-example => leguin.freenode.net
2015-06-26 09:58:21 DEBUG:irc-server RECV ping => testmatrixbot => wolfe.freenode.net
2015-06-26 09:59:18 DEBUG:irc-server RECV ping => M-example => leguin.freenode.net
2015-06-26 10:00:45 DEBUG:irc-server RECV ping => testmatrixbot => wolfe.freenode.net
{code}
Pings are only sent when the client in question doesn't send anything for a while, so we need to take into account the last active times of the client when determining whether to knife or not.
created: 2015-06-26 11:08:34.0
id: '11920'
issue: '11706'
type: comment
updateauthor: kegan
updated: 2015-06-26 11:08:34.0
- author: kegan
body: |-
This has been resolved by applying the following:
- Start a timer when connecting for {{PING_TIMEOUT_MS}} (set to 6 mins currently).
- Listen for {{ping}} events from the server. Refresh the ping timeout timer.
- Decorate the {{client.send}} method to refresh the ping timeout timer every time it is invoked.
- If the timer expires, emit a {{netError}} (which all clients respond to by reconnecting).
This fixes the problem because the bot's TCP connection died without the bot knowing. RFC 1459 states that there should always be the occasional ping/pong to keep the connection alive. If there is no ping for a while, we'll now know about it and emit an error so we can reconnect.
created: 2015-06-26 11:58:27.0
id: '11921'
issue: '11706'
type: comment
updateauthor: kegan
updated: 2015-06-26 11:58:27.0