Last modified: 2014-04-15 16:39:44 UTC
If the slow log has grown by more than 10 lines in the past few minutes, that is a big deal and I should get involved. Nagios should alert on it.
We'll also have to monitor for absolute failure count, especially if we're successful in installing more timeouts.
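For illustration, the slow-log rule above could be sketched as a Nagios-style check along these lines. This is only a rough sketch, not the check that Ops is putting in place: the function names and the state file are hypothetical, and "the past few minutes" is assumed to be whatever interval Nagios runs the check at. The only fixed parts are the threshold of 10 lines from the comment above and the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def count_lines(path):
    """Count the lines currently in the log file."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def check_growth(log_path, state_path, threshold=10):
    """Compare the log's line count with the count saved by the previous run.

    Returns (exit_code, message). state_path is a hypothetical scratch file
    that remembers the last observed line count between runs.
    """
    try:
        current = count_lines(log_path)
    except OSError as e:
        return UNKNOWN, "UNKNOWN: cannot read %s: %s" % (log_path, e)

    # Load the previous sample, if there is a usable one.
    previous = None
    try:
        with open(state_path) as f:
            previous = int(f.read().strip())
    except (OSError, ValueError):
        pass

    # Record the current count for the next run.
    with open(state_path, "w") as f:
        f.write(str(current))

    if previous is None:
        return OK, "OK: no previous sample, baseline recorded"

    # Assumption: a smaller count means the log was rotated or truncated,
    # so every line now in the file is treated as new.
    growth = current - previous if current >= previous else current

    if growth > threshold:
        return CRITICAL, "CRITICAL: slow log grew by %d lines" % growth
    return OK, "OK: slow log grew by %d lines" % growth
```

A wrapper script would print the returned message and exit with the returned code, which is how Nagios reads plugin results. The absolute failure count would need a separate check; this sketch only covers growth between two runs.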
Nik: Is this still highest priority? It has been at that level for five weeks now.
Ops is doing this "now": https://gerrit.wikimedia.org/r/#/c/123466/ The patch is in review, so I'll move this bug to that status. They don't normally use Bugzilla, so I'll manage the link.
This has been merged. We're still working out some of the kinks, but it is there.