Last modified: 2014-03-04 06:34:33 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T62032, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 60032 - "Pool queue is full" and related errors on search api
Status: RESOLVED WONTFIX
Product: MediaWiki
Classification: Unclassified
Component: Search
Version: unspecified
Hardware: Other other
Importance: High normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
: mobile
Depends on:
Blocks:
Reported: 2014-01-14 09:05 UTC by Simon de Haan
Modified: 2014-03-04 06:34 UTC
CC List: 8 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Graphite screenshot showing timeouts (197.34 KB, image/png)
2014-01-14 09:15 UTC, Simon de Haan

Description Simon de Haan 2014-01-14 09:05:13 UTC
For Wikipedia Text (the SMS & USSD part of Wikipedia Zero) in Kenya we're occasionally seeing the following errors:

{u'servedby': u'mw1205', u'error': {u'info': u'Pool queue is full', u'code': u'srsearch-error'}}
{u'servedby': u'mw1200', u'error': {u'info': u'The search backend returned an error: ', u'code': u'srsearch-error'}}
{u'servedby': u'mw1123', u'error': {u'info': u'HTTP request timed out.', u'code': u'srsearch-error'}}

Please advise if anything can be done about these (other than trying to handle them as gracefully as possible on our application side of things).
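
One way to handle these errors on the application side is to detect the `srsearch-error` code and retry with a short backoff. The following is only a minimal sketch: the `fetch` callable, the retry count, and the delays are hypothetical and not taken from the Vumi Wikipedia app, and treating all three errors as retryable is an assumption.

```python
import time


def is_search_error(response):
    """Return True if the decoded API response carries an srsearch-error."""
    error = response.get("error")
    return bool(error) and error.get("code") == "srsearch-error"


def search_with_retry(fetch, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch() until it returns a non-error response or attempts run out.

    fetch is a hypothetical zero-argument callable that performs the search
    request and returns the decoded JSON as a dict.
    """
    response = None
    for attempt in range(max_attempts):
        response = fetch()
        if not is_search_error(response):
            return response
        if attempt < max_attempts - 1:
            # Exponential backoff: 0.5 s, 1 s, 2 s, ... (assumed values).
            sleep(base_delay * 2 ** attempt)
    # Still an error after all attempts; the caller decides how to degrade.
    return response
```

If every attempt fails, the last error response is returned unchanged, so the caller can still show the user a fallback message.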
Comment 1 Simon de Haan 2014-01-14 09:15:21 UTC
Created attachment 14305
Graphite screenshot showing timeouts

Screenshot of the Wikipedia Text service traffic. The yellow line is the response time for the search API. Times shown are in UTC, on the morning of Tuesday, 14 January.
Comment 2 Andre Klapper 2014-01-14 10:24:26 UTC
Thanks for taking the time to report this!

Is this a recent problem, or has this been ongoing for a while?

Bug 59993 might be related (which got fixed yesterday).
Comment 3 Simon de Haan 2014-01-14 10:38:20 UTC
Bug 59993 looks related, but these errors occurred this morning, after it had been resolved.

We see it happening fairly regularly when the Kenyan mobile network operator does a big SMS-based announcement of Wikipedia Text, resulting in increased traffic volumes.
Comment 4 Nik Everett 2014-01-14 12:58:35 UTC
Which wikis are generating these messages?
Comment 5 Simon de Haan 2014-01-14 13:08:05 UTC
This is from our API calls to the search API on the English Wikipedia. The API calls are being generated from the Vumi Wikipedia app which handles the SMS & USSD component of Wikipedia Zero.
Comment 6 Nik Everett 2014-01-14 15:30:31 UTC
I don't see any load spikes on the search backends but I do see latency spikes on the pool counters:
https://gdash.wikimedia.org/dashboards/poolcounter/
Not 60 seconds, though. My 99th percentile times are about 6 seconds. I suppose that if you are searching for 10 items and trigger the pool counter for each one _and_ all of them hit the 99th-percentile case, then you could see this.
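
The arithmetic behind that estimate can be sketched as follows. The 10 lookups and the 6-second figure come from the comment above. Running the lookups one after another and treating their latencies as independent are assumptions of this sketch, and real latency spikes are usually correlated.

```python
# Figures from the comment above: up to 10 lookups, ~6 s at the 99th percentile.
lookups = 10
p99_latency_s = 6.0

# Worst case if the lookups run sequentially and each one is this slow.
worst_case_s = lookups * p99_latency_s

# Chance that every lookup lands in the slowest 1%, ASSUMING independence
# (an assumption of this sketch, not something measured on the backends).
p_all_slow = 0.01 ** lookups

print(worst_case_s)  # 60.0
print(p_all_slow)
```

Under the independence assumption the all-slow case is vanishingly rare, which suggests the observed timeouts are more likely driven by correlated slowdowns during traffic spikes.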

The backends have a log of all the searches they've performed and how long they took. Could you send me some examples of slow searches so I can rule out that the backends are taking forever on them?
Comment 7 Yuri Astrakhan 2014-01-15 04:06:26 UTC
Simon, would it make sense to test the other search backend? In http://en.wikipedia.org/w/api.php?action=help&modules=query+search I see an srbackend parameter, which currently defaults to "LuceneSearch" but also supports "CirrusSearch". I presume that CirrusSearch is our new backend that will become the default shortly.
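
As a rough illustration, a request that selects the backend explicitly could be built like this. The endpoint and the srbackend values are the ones mentioned above; the `build_search_url` helper is hypothetical, and the parameter reflects the API as described in this comment, so it may not be accepted by later versions.

```python
from urllib.parse import urlencode

# Endpoint taken from the comment above.
API_URL = "http://en.wikipedia.org/w/api.php"


def build_search_url(term, backend="CirrusSearch"):
    """Build a search API URL that asks for a specific search backend.

    Hypothetical helper for illustration only.
    """
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srbackend": backend,  # "LuceneSearch" was the default at the time
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


print(build_search_url("Nairobi"))
```

Keeping the backend as a parameter makes it easy to compare error rates between "LuceneSearch" and "CirrusSearch" before switching permanently.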
Comment 8 Andre Klapper 2014-02-27 16:31:02 UTC
Simon: Could you answer comment 7, please?
Comment 9 Nik Everett 2014-02-27 16:35:01 UTC
"Shortly" is relative.  You can try the new search backend if you like.  Let me know if you want to set it as the default so I can see how we handle the extra traffic.
Comment 10 Simon de Haan 2014-03-04 06:34:16 UTC
Created https://github.com/praekelt/vumi-wikipedia/issues/40, so we'll switch over to the new search backend. Will re-open this if the issue recurs.
