Last modified: 2014-03-18 09:23:47 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T59210, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 57210 - CirrusSearch: Improve elasticsearch monitoring
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: CirrusSearch
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2013-11-18 21:40 UTC by Nik Everett
Modified: 2014-03-18 09:23 UTC (History)

See Also:

Description Nik Everett 2013-11-18 21:40:51 UTC
Right now Icinga spews out a huge blob of JSON when there is an Elasticsearch problem, which is difficult to read.
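
A minimal sketch of what a more readable check result could look like, assuming the check already holds the parsed response of Elasticsearch's cluster health API (GET /_cluster/health). The function name summarize_health and the status-to-exit-code mapping are illustrative assumptions, not the check actually deployed; the exit codes follow the usual Nagios/Icinga plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
# Hypothetical mapping from Elasticsearch cluster status to plugin exit codes.
EXIT_CODES = {"green": 0, "yellow": 1, "red": 2}
LABELS = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def summarize_health(health):
    """Turn a parsed cluster health response into (exit_code, one_line_message)."""
    status = health.get("status")
    code = EXIT_CODES.get(status, 3)  # a missing or unexpected status is UNKNOWN
    message = (
        "{label} - elasticsearch {name}: status={status}, nodes={nodes}, "
        "unassigned_shards={unassigned}".format(
            label=LABELS[code],
            name=health.get("cluster_name", "?"),
            status=status,
            nodes=health.get("number_of_nodes", "?"),
            unassigned=health.get("unassigned_shards", "?"),
        )
    )
    return code, message
```

A wrapper script would print the message and exit with the code, so Icinga shows one short line instead of the raw JSON.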
Comment 1 Nik Everett 2013-11-18 21:45:09 UTC
Also, we should warn if there are ever fewer than 3 Lucene indexes active per shard.
Comment 2 Nik Everett 2013-11-18 22:10:30 UTC
It'd be nice if this could detect a split brain as well.

It'd be really nice if this warned on the Elasticsearch cluster as a whole rather than on individual hosts. It should still complain if it can't read a host, but issues that affect the whole cluster should alert once, not once per host.
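
One possible shape for such a cluster-level check, as a sketch only: ask every host which node it believes is the master (for example the master_node field of GET /_cluster/state) and compare the answers. The function name check_masters and the input format are hypothetical; unreadable hosts are named in a single result rather than raising one alert each.

```python
def check_masters(master_by_host):
    """Return (exit_code, message) for the cluster as a whole.

    master_by_host maps each polled host to the master node ID it reported,
    or None if the host could not be read (assumed input format).
    """
    unreachable = sorted(h for h, m in master_by_host.items() if m is None)
    masters = sorted({m for m in master_by_host.values() if m is not None})

    if len(masters) > 1:
        # Hosts disagree about who the master is: possible split brain.
        return 2, "CRITICAL - possible split brain, masters seen: " + ", ".join(masters)
    if not masters:
        return 3, "UNKNOWN - no host could be read"
    if unreachable:
        return 1, "WARNING - could not read: " + ", ".join(unreachable)
    return 0, "OK - all hosts agree on master " + masters[0]
```

Disagreement about the master is only a heuristic for split brain; a real check would likely also compare node counts as seen from each host.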
Comment 3 Greg Grossmeier 2013-11-19 05:16:20 UTC
From Antoine:
There is a plugin to monitor clusters. Use case, doc, examples at:
 http://docs.icinga.org/latest/en/clusters.html
 https://www.nagios-plugins.org/doc/man/check_cluster.html

The idea is to create a service that is based on the result of other
services.
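
The aggregation idea can be sketched as follows. This is a simplified imitation of the approach, not check_cluster itself: it takes the exit codes of the underlying per-host services and plain integer thresholds (an assumption; the real plugin has its own threshold syntax), and the name aggregate is hypothetical.

```python
def aggregate(states, warn, crit):
    """Derive one cluster-level result from per-host service results.

    states: plugin exit codes of the underlying services (0 means OK).
    warn, crit: number of non-OK services at which to warn / go critical.
    """
    not_ok = sum(1 for s in states if s != 0)
    summary = "{} of {} services not OK".format(not_ok, len(states))
    if not_ok >= crit:
        return 2, "CRITICAL - " + summary
    if not_ok >= warn:
        return 1, "WARNING - " + summary
    return 0, "OK - " + summary
```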
Comment 4 Nik Everett 2013-11-27 17:54:48 UTC
Removing from the list of bugs required to reenable Cirrus as it was really for ops and ops doesn't seem to be jumping up and down about it.  I'm leaving it filed as NORMAL and I've got the process started.  We'll get this, but not before next week.
Comment 5 Nemo 2014-03-18 09:23:47 UTC
Is this bug (and its friends in see also) a blocker for expanding Cirrus on the wikis which were already indexed? It would be really nice to make it default on, say, all Wiktionaries or all Wikiquotes and see what happens to the load.
