Last modified: 2013-10-23 10:13:26 UTC
Ever since gallium was upgraded from Lucid to Precise, we have been suffering from disk I/O latency. If my memory serves me well, the MediaWiki dump tests went from 30 seconds to 1 minute. From time to time, we get massive latency on the gallium slave. It would be nice to have the disk statistics monitored in Ganglia. That should be possible with ganglia::diskstat, introduced by https://gerrit.wikimedia.org/r/#/c/85669/ for bug 36994. This bug is merely a reminder to get the diskstat Ganglia plugin applied on gallium (and lanthanum).
(In reply to comment #0)
> This bug is merely a remember to get the diskstat ganglia plugin applied in
> gallium (and lanthanum).

Do we need another bug to have it on all production hosts, or at least database hosts? :)
I am not sure what the impact of deploying that plugin on all servers would be, but I surely need it on the Jenkins boxes. Feel free to file a bug (blocked by bug 36994) to request that the diskstat plugin be applied on the database hosts. I am sure our DBA will be more than happy to have such metrics :-]
Still pending on bug 36994 (deploying the Ganglia module that makes it possible to monitor disks).
Both Jenkins CI slaves (gallium and lanthanum) have diskstat enabled in Ganglia now :-] Thanks to Ori!