Last modified: 2014-05-03 17:15:13 UTC
Currently it's almost impossible to detect trends, as 5xx errors are buried at the foot of peaks of gazillions of 500 errors. Perhaps it should also have a month-long report, but that's less important. (I wanted to check the number of 504 errors in the last few days/weeks.)
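To illustrate why a linear axis hides the usual error level next to a spike, here is a minimal sketch. The numbers (a spike of 100,000 errors per minute, a usual level of 50, a 250-pixel-high graph) are made up for illustration and are not taken from the real reqerror data.

```python
import math


def linear_height(value, peak, pixels):
    """Pixel height of `value` on a linear axis running from 0 to `peak`."""
    return pixels * value / peak


def log_height(value, peak, pixels, base=10):
    """Pixel height of `value` on a log axis running from 1 to `peak`."""
    return pixels * math.log(value, base) / math.log(peak, base)


# Hypothetical numbers: one huge spike next to the usual error level,
# drawn on a graph that is 250 pixels high.
PEAK, USUAL, PIXELS = 100_000, 50, 250

print(f"linear: {linear_height(USUAL, PEAK, PIXELS):.3f} px")
print(f"log10:  {log_height(USUAL, PEAK, PIXELS):.1f} px")
```

With these assumed numbers the usual level takes up a fraction of a pixel on the linear axis, but roughly a third of the graph height on the log axis, so trends below the spike stay visible.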
Created attachment 13762: Example (not representative) of reqerror graph

Note: today's graph is not representative because packet loss between esams and eqiad caused a huge peak of 503 errors.

Something tells me that I should try this myself. :)

17.58 < ori-l> yes, you can submit a patch yourself, let me point you to the right file
18.00 < ori-l> Nemo_bis: http://git.wikimedia.org/tree/operations%2Fpuppet.git/ca6fe4efc30c6a4b2606b13aab178b9e71914dca/files%2Fgraphite%2Fgdash%2Fdashboards%2Freqerror
18.01 < ori-l> the DSL gdash uses to describe graphs is at https://github.com/ripienaar/graphite-graph-dsl/wiki
Change 95064 had a related patch set uploaded by Nemo bis: Use log scale for 5xx errors in "(cdn) HTTP Error Rate" https://gerrit.wikimedia.org/r/95064
Change 95068 had a related patch set uploaded by Nemo bis: Also add 2 months and 1 year graphs in "(cdn) HTTP Error Rate" https://gerrit.wikimedia.org/r/95068
Change 95064 merged by Ori.livneh: Use log scale for 5xx errors in "(cdn) HTTP Error Rate" https://gerrit.wikimedia.org/r/95064
Change 95068 merged by Ori.livneh: Also add 2 months and 1 year graphs in "(cdn) HTTP Error Rate" https://gerrit.wikimedia.org/r/95068
Apart from the wrong log scale setting (which needs to be set per graph), I probably also have to fix the x-axis scale: with the current one, it seems an extremely long image is needed to actually show the data.
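For reference, Graphite's render API takes the log scale as a per-request `logBase` parameter, so each graph has to carry its own value. The sketch below builds one render URL per time range. The host `graphite.example.org`, the metric name `example.http.5xx`, the titles and the chosen bases are assumptions for illustration only; this is not the actual gdash dashboard definition, which is written in the graphite-graph-dsl.

```python
from urllib.parse import urlencode

# Hypothetical host and metric name, used for illustration only.
GRAPHITE = "https://graphite.example.org"
TARGET = "example.http.5xx"


def render_url(title, since, log_base=None, width=800, height=250):
    """Build a Graphite render URL; log_base, if given, is sent as logBase."""
    params = [
        ("target", TARGET),
        ("title", title),
        ("from", since),
        ("width", width),
        ("height", height),
    ]
    if log_base is not None:
        # logBase applies only to the graph produced by this one request.
        params.append(("logBase", log_base))
    return f"{GRAPHITE}/render?{urlencode(params)}"


# One entry per graph: (title, relative start time, log base for that graph).
graphs = [
    ("5xx errors, last day", "-1d", 2),
    ("5xx errors, last week", "-1w", 2),
    ("5xx errors, last 2 months", "-2mon", 10),
    ("5xx errors, last year", "-1y", 10),
]

for title, since, base in graphs:
    print(render_url(title, since, log_base=base))
```

Opening one of the printed URLs against a real Graphite instance (with a real metric name substituted) is a quick way to check whether a given base renders as expected before touching the dashboard file.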
Change 101065 had a related patch set uploaded by Nemo bis: Make logscale in reqerror graphs actually work https://gerrit.wikimedia.org/r/101065
Change 101065 merged by Ori.livneh: Make logscale in reqerror graphs actually work https://gerrit.wikimedia.org/r/101065
Change 105614 had a related patch set uploaded by Nemo bis: Also logbase 2 for the shorter reqerror graphs https://gerrit.wikimedia.org/r/105614
Change 105614 merged by Ori.livneh: Also logbase 2 for the shorter reqerror graphs https://gerrit.wikimedia.org/r/105614
(In reply to comment #10)
> Change 105614 merged by Ori.livneh:
> Also logbase 2 for the shorter reqerror graphs
>
> https://gerrit.wikimedia.org/r/105614

That worked for a bit, but then regressed again. Will need to check.
Change 117021 had a related patch set uploaded by Nemo bis: [gdash] Use logscale 10 for reqerror graph, again https://gerrit.wikimedia.org/r/117021
I still have no idea where to start to find a way that works...