Last modified: 2013-11-22 22:00:36 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T53983, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 51983 - [OPS] gerrit & gitblit needs process monitoring in Icinga
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Git/Gerrit
Version: wmf-deployment
Hardware: All All
Importance: High enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: ops
Depends on:
Blocks:

Reported: 2013-07-24 18:35 UTC by Antoine "hashar" Musso (WMF)
Modified: 2013-11-22 22:00 UTC
CC List: 7 users

Description Antoine "hashar" Musso (WMF) 2013-07-24 18:35:19 UTC
gitblit is hosted on antimony.wikimedia.org, which only has the default checks: puppet freshness, NTP and SSH.

https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=antimony

It needs a check that monitors whether gitblit is running.

templates/icinga/nrpe_local.cfg.erb has a bunch of examples if you look for 'java'. An example for Jenkins:


command[check_jenkins]=/usr/lib/nagios/plugins/check_procs -w 1:1 -c 1:1 --ereg-argument-array '^/usr/bin/java .*-jar /usr/share/jenkins/jenkins.war'


This makes sure there is one and only one java process running jenkins.war.
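
As a rough illustration of what `-w 1:1 -c 1:1` combined with `--ereg-argument-array` amounts to, here is a minimal Python sketch. It is not how check_procs is implemented: the function names and the sample command lines are made up for illustration, and only the regular expression is taken from the Jenkins example above. The exit codes follow the usual Nagios plugin convention (0 OK, 2 CRITICAL).

```python
import re

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2

# Regex copied from the Jenkins example in this report.
JENKINS_PATTERN = r"^/usr/bin/java .*-jar /usr/share/jenkins/jenkins.war"


def count_matching(command_lines, pattern):
    """Count the command lines whose text matches the given regex."""
    regex = re.compile(pattern)
    return sum(1 for cmd in command_lines if regex.search(cmd))


def check_exactly_one(command_lines, pattern):
    """Mimic '-w 1:1 -c 1:1': anything other than exactly one match is critical.

    Because the warning and critical ranges are identical, the WARNING
    state can never be reached; the result is either OK or CRITICAL.
    """
    count = count_matching(command_lines, pattern)
    if count != 1:
        return CRITICAL, f"PROCS CRITICAL: {count} matching processes"
    return OK, "PROCS OK: 1 matching process"


if __name__ == "__main__":
    # Hypothetical process list, for illustration only.
    sample = [
        "/usr/sbin/sshd -D",
        "/usr/bin/java -Xmx2g -jar /usr/share/jenkins/jenkins.war --httpPort=8080",
    ]
    code, message = check_exactly_one(sample, JENKINS_PATTERN)
    print(code, message)
```

A check for gitblit or Gerrit would follow the same shape with a different regex; the exact jar or war paths on the servers are not given in this report, so they are left out here.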
Comment 1 Chad H. 2013-07-24 20:38:36 UTC
Widening summary, want this for Gerrit too.
Comment 2 Gerrit Notification Bot 2013-07-24 20:39:24 UTC
Change 75777 had a related patch set uploaded by Demon:
Add icinga monitoring for Gerrit and Gitblit

https://gerrit.wikimedia.org/r/75777
Comment 3 Greg Grossmeier 2013-08-13 21:16:13 UTC
Setting importance to High because, well, Gerrit and Gitblit have been needing a bit of hand-holding lately.
Comment 4 Antoine "hashar" Musso (WMF) 2013-10-18 10:34:58 UTC
Pinged Chad / Leslie by email to move this forward.
Comment 5 Rob Lanphier 2013-11-07 00:01:10 UTC
Unassigning from Chad.  We need someone with Puppet and firewall rule writing expertise to finish this off.
Comment 6 Antoine "hashar" Musso (WMF) 2013-11-18 23:46:58 UTC
Filed https://rt.wikimedia.org/Ticket/Display.html?id=6342 to apply the ferm system on the servers hosting Gerrit/Gitblit (antimony and manganese) and enable monitoring ( https://gerrit.wikimedia.org/r/#/c/75777/ ).
Comment 7 Gerrit Notification Bot 2013-11-22 09:40:09 UTC
Change 75777 merged by Akosiaris:
Add icinga monitoring for Gerrit and Gitblit

https://gerrit.wikimedia.org/r/75777
Comment 8 Chad H. 2013-11-22 19:11:52 UTC
This is now in place, and all checks look green. We should now get appropriate alerts when things go down badly :)
Comment 9 Antoine "hashar" Musso (WMF) 2013-11-22 22:00:36 UTC
Thank you everyone!

Alexandros, Ariel and David Zahn have been very helpful in adding the ferm firewall configuration. Hurrah!

Ideally we would validate that the monitoring is working properly by shutting down Gitblit and Gerrit and confirming that warnings are issued. But I might be too meticulous.
