Last modified: 2013-11-22 17:14:05 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T53350, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 51350 - solr stress out the NFS server


Summary:	solr stress out the NFS server

Status:	RESOLVED WORKSFORME

Product:	Wikimedia Labs
Classification:	Unclassified
Component:	Infrastructure (Other open bugs)
Version:	unspecified
Hardware:	All All

Importance:	Unprioritized normal
Target Milestone:	---
Assigned To:	Ryan Lane

URL:
Whiteboard:
Keywords:	performance

Depends on:
Blocks:

Reported:	2013-07-15 10:45 UTC by Antoine "hashar" Musso (WMF)
Modified:	2013-11-22 17:14 UTC (History)
CC List:	4 users (show)

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---

Attachments
network usage of solr project between 7/13/2013 8:00 and 7/13/2013 20:00 (14.21 KB, image/png) 2013-07-15 10:47 UTC, Antoine "hashar" Musso (WMF)	Details
Ganglia CPU report of labstore 3 between 7/13/2013 8:00 and 7/13/2013 20:00 (15.20 KB, image/png) 2013-07-15 10:47 UTC, Antoine "hashar" Musso (WMF)	Details

Description Antoine "hashar" Musso (WMF) 2013-07-15 10:45:37 UTC

This morning we found that both the beta and tools projects were "slow". The root cause is jobs running in the solr project that saturate the I/O capacity of the NFS server (labstore3).

The workaround is to reboot the instance solr-mw2.pmtpa.wmflabs, which stops the offending job, as instructed by Nikolas Everett: http://lists.wikimedia.org/pipermail/labs-l/2013-July/001381.html


The Solr experiment should run on an NFS system other than the shared one, probably a dedicated one.

Comment 1 Antoine "hashar" Musso (WMF) 2013-07-15 10:47:24 UTC

Created attachment 12846 [details]
network usage of solr project between 7/13/2013 8:00 and 7/13/2013 20:00

Comment 2 Antoine "hashar" Musso (WMF) 2013-07-15 10:47:54 UTC

Created attachment 12847 [details]
Ganglia CPU report of labstore 3 between 7/13/2013 8:00 and 7/13/2013 20:00

Comment 3 Ariel T. Glenn 2013-07-15 11:15:54 UTC

So in accordance with the above I rebooted solr-mw2 and load on labstore3 looks much better.

Comment 4 Nik Everett 2013-07-15 13:33:09 UTC

I should let everyone know what I'm doing:

I'm loading a copy of enwiki with all the current text (as of some backup) so I can index it.  No historical revisions as we won't be indexing them.  I've been told that I can't use a prod replica because it won't contain any text.

What is actually killing the NFS server is mysqld, not solr, elasticsearch, or any other new system.  I make no claims that those systems wouldn't put a similar load on NFS at some point in the future, though.

I'd run this on my local system, but then I wouldn't be able to properly interact with the other systems in labs that I need for the experiment.

Comment 5 Antoine "hashar" Musso (WMF) 2013-11-22 17:14:05 UTC

Closing this since Elasticsearch has been deployed in production, so I guess there is less need nowadays to load huge amounts of data into a labs instance.
