Last modified: 2014-06-12 17:20:53 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T65522, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 63522 - Icinga should notice people when /home partition on stat1002 fills up
Status: NEW
Product: Analytics
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2014-04-04 15:00 UTC by christian
Modified: 2014-06-12 17:20 UTC (History)

Description christian 2014-04-04 15:00:31 UTC
It seems the /home partition on stat1002 filled up between 2014-04-03 and 2014-04-04,
but no one was notified by Icinga.

I noticed when going through cron-mail and seeing that on 2014-04-04 04:30,
one of my jobs failed with

  No space left on device

for /home/qchris on stat1002.

I freed some GBs for now, but $SOME_SERVICE (Icinga?) should warn in time about
disks getting full.

Let's get $SOME_SERVICE to alert about disks getting full.
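
As a rough illustration of the kind of check being asked for, here is a minimal sketch of a Nagios/Icinga-style plugin that reports how full a partition is. It is not the check actually deployed on stat1002. The function names (classify, check_disk) and the 80%/90% thresholds are assumptions made up for this example; only the exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) is the standard plugin convention.

```python
import shutil

# Standard Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to a plugin state (thresholds are hypothetical)."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK


def check_disk(path, warn=80.0, crit=90.0):
    """Return (state, message) for the filesystem that contains `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - cannot stat {path}: {exc}"
    if usage.total == 0:
        # Some pseudo filesystems report a size of zero; avoid dividing by it.
        return UNKNOWN, f"DISK UNKNOWN - {path} reports a total size of 0"
    used_percent = 100.0 * usage.used / usage.total
    state = classify(used_percent, warn, crit)
    return state, f"DISK {STATE_NAMES[state]} - {path} is {used_percent:.1f}% full"
```

A wrapper script would print the message from check_disk("/home") and exit with the returned state, so that the monitoring service can raise a warning before jobs start failing with "No space left on device".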
Comment 1 Bingle 2014-04-04 15:05:41 UTC
Prioritization and scheduling of this bug is tracked on Mingle card https://wikimedia.mingle.thoughtworks.com/projects/analytics/cards/cards/1527
Comment 2 Oliver Keyes 2014-04-04 16:09:42 UTC
I think this was also me; not quite sure how. I was messing around with the sampled logs in my home directory, but I'm not sure how that'd correspond. I'm going to investigate now I'm conscious.

Evidently this is the week of Oliver Accidentally Revealing Oversight Issues With Our Cluster :D
Comment 3 christian 2014-04-04 16:46:31 UTC
(In reply to Oliver Keyes from comment #2)
> I think this was also me

Hahaha.
Sorry to disappoint you again, but a "du" on /home showed that it was
not you :-D

But this bug is not about “Who filled up the disk”. Disks will always
get full. Analyses start small, and grow ... and grow ... and
grow. And then the disk is full. Meh.

Much rather, this bug is about “Why did no service warn about disks
getting full?”.
Comment 4 Toby Negrin 2014-04-16 20:16:38 UTC
Let's see if we can prioritize this in the next sprint.
Comment 5 Toby Negrin 2014-06-12 17:20:53 UTC
Another issue where we need ops support.
