Last modified: 2014-08-14 20:48:15 UTC
They used to show up in gdash when you ticked the "Show Code Deploys" checkbox. Adding "&target=drawAsInfinite(deploy.any)" to the graphite urls doesn't work :/
See also https://rt.wikimedia.org/Ticket/Display.html?id=6970
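For reference, the manual workaround described above amounts to appending one more "target" parameter to a graphite render URL. A minimal sketch of building such a URL follows. The helper name render_url and the metric "some.metric" are hypothetical; only drawAsInfinite(deploy.any) and the graphite.wikimedia.org/render endpoint come from this ticket.

```python
from urllib.parse import urlencode


def render_url(base, targets, params=None):
    """Build a graphite render URL with one 'target' parameter per entry.

    `render_url` is a hypothetical helper for illustration only.
    """
    query = [("target", t) for t in targets]
    query += list((params or {}).items())
    # Keep parentheses and commas readable, since graphite function
    # calls such as drawAsInfinite(...) use them.
    return base + "?" + urlencode(query, safe="(),")


url = render_url(
    "https://graphite.wikimedia.org/render",
    ["some.metric", "drawAsInfinite(deploy.any)"],  # "some.metric" is a placeholder
    {"format": "json"},
)
print(url)
```

This only shows the shape of the URL. As noted above, adding the target did not bring the deploy marks back.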
Change 119339 had a related patch set uploaded by BryanDavis: Fix MW_STATSD_PORT to point to correct listener https://gerrit.wikimedia.org/r/119339
Change 119340 had a related patch set uploaded by BryanDavis: Fix statsd_port value https://gerrit.wikimedia.org/r/119340
After these patches land and we get some data in graphite again I think we'll need to look at the gdash configuration and update the metric names that it uses to identify deployments as well. deploy2graphite and scap send different metrics to graphite.
Change 119340 merged by jenkins-bot: Fix statsd_port value https://gerrit.wikimedia.org/r/119340
Change 119339 merged by Ori.livneh: Fix MW_STATSD_PORT to point to correct listener https://gerrit.wikimedia.org/r/119339
See https://gerrit.wikimedia.org/r/#/c/111409/ for the change from carbon to statsd that should have been accompanied by a change to the gdash configuration and port number as well.
When we figure out what all the new deploy metrics are, they should be added to templates/gdash/deploy_addon.erb in operations/puppet.git to restore the deploy marks.
There is some additional problem with the current gdash configuration. When the "Show Code Deploys" checkbox is active, something is causing the generated graphite URLs to contain an extraordinary number of superfluous ampersands. In one URL I just examined there are 4188 extra ampersands inserted between the deployment metric stanzas and the remainder of the graph description. When these ampersands are removed from the graphite URL the graph renders (albeit with no deploy markers).
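The manual check described above (count the extra ampersands, strip them, retry the URL) can be sketched as follows. This is a rough diagnostic aid, not a fix for whatever in gdash generates them. The function name collapse_ampersands and the example URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def collapse_ampersands(url):
    """Return (cleaned_url, extra) where `extra` is the number of empty
    query segments, i.e. superfluous '&' characters, that were dropped.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    segments = parts.query.split("&")
    kept = [s for s in segments if s]  # drop empty segments left by '&&'
    extra = len(segments) - len(kept)
    return urlunsplit(parts._replace(query="&".join(kept))), extra


# Placeholder URL, not one taken from gdash.
cleaned, extra = collapse_ampersands(
    "https://graphite.example.org/render?target=a&&&&target=b&from=-1h"
)
print(extra, cleaned)
```

Running it on a URL copied from a broken graph should give the count of extra ampersands and a URL that can be pasted back into the browser.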
For what it's worth, I was seeing graphite urls like that (tons of &s) on Friday the 14th.
The configuration changes now have data being recorded in graphite for scap runs again, but there are three remaining issues:

1) The metric names have changed. The gdash configuration is looking to add the metrics "deploy.sync-common-file", "deploy.sync-common-all" and "deploy.scap" to the graph. With the change from direct carbon communication to statsd and the changes to scap code, these metric names have changed. "scap.scap.count" should be the equivalent of the old "deploy.scap" metric.

2) In theory the metrics for "deploy.sync-common-file" and "deploy.sync-common-all" should just need a ".count" added to them, but I'm not currently seeing metrics with those names in graphite at all.

3) The txstatsd recorded stats for "scap.scap.count" don't look right at all. I would expect graphite to be recording the aggregate sum of the "scap.scap:1|c" calls seen in the last minute which would typically be 0 and occasionally be 1 (or possibly 2 with aborted scaps). Instead it seems to be recording a value of 1.0 every minute with occasional values of 5.0 that are not correlated with other scap logging output. [0]

[0]: https://graphite.wikimedia.org/render?from=23%3A00_20140331&until=00%3A00_20140401&target=scap.scap.count&format=json
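To make the expectation in point 3 concrete, here is a small simulation of per-minute counter aggregation. It is a sketch of the behavior expected above, not of what txstatsd actually does. The function names are hypothetical, the 60 second flush interval is assumed from "in the last minute", and sample rates are ignored.

```python
from collections import defaultdict


def aggregate_counters(events, flush_interval=60):
    """Sum 'name:value|c' packets into per-interval buckets.

    `events` is an iterable of (timestamp_seconds, packet) pairs.
    Returns {name: {bucket_start: total}}. Packets that are not plain
    counters (including ones carrying a sample rate) are skipped.
    """
    buckets = defaultdict(lambda: defaultdict(float))
    for ts, packet in events:
        name, rest = packet.split(":", 1)
        value, kind = rest.split("|", 1)
        if kind != "c":
            continue
        bucket = int(ts // flush_interval) * flush_interval
        buckets[name][bucket] += float(value)
    return {name: dict(b) for name, b in buckets.items()}


def series(buckets, start, end, flush_interval=60):
    """Expand one metric's buckets into a dense series, 0.0 where idle."""
    return [buckets.get(t, 0.0) for t in range(start, end, flush_interval)]


# Two scap runs in the second minute (e.g. one aborted), one in the fourth.
events = [(65, "scap.scap:1|c"), (70, "scap.scap:1|c"), (200, "scap.scap:1|c")]
agg = aggregate_counters(events)
print(series(agg["scap.scap"], 0, 240))
```

Under these assumptions the series is mostly zeros with an occasional 1 or 2, which is not what the JSON at [0] shows.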
Assigning to Ori in the hope that he can find some time to look into the txstatsd behavior and the missing metrics. Once those issues are fixed it should be pretty easy to correct the gdash configuration.