Last modified: 2011-11-23 16:51:32 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T34598, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 32598 - API requests to commons frequently return 500
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown (Other open bugs)
Version: unspecified
Hardware: All
OS: All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:

Reported: 2011-11-22 22:15 UTC by Daniel Schwen
Modified: 2011-11-23 16:51 UTC
CC List: 2 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Daniel Schwen 2011-11-22 22:15:30 UTC
Similar to bug 30201, API requests using pywikipediabot now frequently return error 500. This completely screws up my Quality Images bot, as it fails in a half-finished state, creating a terrible mess.
Comment 1 Sam Reed (reedy) 2011-11-22 22:20:03 UTC
Example queries etc?
Comment 2 Daniel Schwen 2011-11-22 23:02:45 UTC
What queries? It is a simple page.put() in pywikipedia:

Updating page [[Commons:Quality images/Subject/Places/Man made structures]] via API
HTTPError: 500 Internal Server Error
WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'.
Maybe the server is down. Retrying in 1 minutes...
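
For context, a minimal sketch of the kind of pywikipedia (compat) call involved; the page title comes from the output above, while the site setup, variable names, and edit summary are assumptions rather than the actual QICbot code:

# Hypothetical sketch of the failing bot step, based on the old pywikipedia
# (compat) framework; the real QICbot code may differ.
import wikipedia  # pywikipedia/compat core module

site = wikipedia.getSite('commons', 'commons')
page = wikipedia.Page(site, u'Commons:Quality images/Subject/Places/Man made structures')

text = page.get()                           # fetch the current wikitext via the API
text += u'\n<!-- bot update goes here -->'  # placeholder for the bot's actual changes
page.put(text, comment=u'QICbot update')    # this API edit is the request that returns HTTP 500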
Comment 3 Roan Kattouw 2011-11-23 11:07:25 UTC
(In reply to comment #2)
> What queries? It is a simple page.put() in pywikipedia:
> 
> Updating page [[Commons:Quality images/Subject/Places/Man made structures]] via
> API
> HTTPError: 500 Internal Server Error
> WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'.
> Maybe the server is down. Retrying in 1 minutes...

Could you get on IRC some time and find me (RoanKattouw)? Then we can do a controlled experiment where you try to reproduce the error while I watch the server logs.
Comment 4 Sam Reed (reedy) 2011-11-23 13:51:52 UTC
(In reply to comment #2)
> What queries? It is a simple page.put() in pywikipedia:
> 
> Updating page [[Commons:Quality images/Subject/Places/Man made structures]] via
> API
> HTTPError: 500 Internal Server Error
> WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'.
> Maybe the server is down. Retrying in 1 minutes...

Right, but you didn't say this in your original post, so you could've been doing one of many different things. We're not mind readers ;)
Comment 5 Daniel Schwen 2011-11-23 14:22:50 UTC
The page in question is very large (3.3Mb). Might this be a problem?

I just went through the logs of the last few weeks, and 17 out of 18 times the error happened when the bot was trying to update that particular page! Sometimes it works after a few retries; sometimes all retries fail (pywikipediabot seems to ignore the increased retry count I set). An explicit retry loop around the save, as sketched below, is one possible workaround.
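
Since the framework's own retry count apparently isn't being honored, a minimal sketch of such an external retry wrapper follows; the retry limit, delay, and helper name are chosen arbitrarily for illustration and are not part of QICbot or pywikipedia itself:

# Hypothetical retry wrapper around page.put(); names and limits are illustrative.
import time

def put_with_retries(page, text, comment, retries=5, delay=60):
    """Attempt page.put() up to `retries` times, sleeping `delay` seconds between attempts."""
    for attempt in range(1, retries + 1):
        try:
            page.put(text, comment=comment)
            return True
        except Exception as e:  # deliberately broad: the HTTP 500 can surface as different exception types
            print('put failed (attempt %d/%d): %s' % (attempt, retries, e))
            time.sleep(delay)
    return False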
Comment 6 Roan Kattouw 2011-11-23 14:25:40 UTC
(In reply to comment #5)
> The page in question is very large (3.3Mb). Might this be a problem?
> 
You mean you were trying to save 3.3 megs of *wikitext*? That should just be rejected, because $wgMaxArticleSize is set to 2000 KB.
Comment 7 Daniel Schwen 2011-11-23 15:17:21 UTC
Oops, scratch that. I was looking at the rendered HTML. The wikitext is 'just' 276K. However, the observation still holds that this particular page causes most of the bot failures (and it is not the first page that the bot accesses!)
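
As a sanity check, this size comparison is easy to do on the bot side before saving; a minimal sketch, assuming the 2000 KB $wgMaxArticleSize value quoted in comment 6 and using an invented helper name:

# Hypothetical pre-save size check against the $wgMaxArticleSize limit
# quoted above (2000 KB on Wikimedia wikis at the time).
MAX_ARTICLE_SIZE_KB = 2000

def wikitext_too_large(text):
    """Return True if the UTF-8 encoded wikitext exceeds the configured limit."""
    return len(text.encode('utf-8')) > MAX_ARTICLE_SIZE_KB * 1024

# A 276K page is well under the limit, so the 500s cannot be a size rejection.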
Comment 8 Roan Kattouw 2011-11-23 15:51:50 UTC
I worked with Daniel to obtain a backtrace; it's an OOM error. Offhand I'd say
there are three explanations for this:
1) the object fetched from memcached is large, e.g. because it includes
metadata (I now realize that the "(tried to allocate 8208 bytes)" part of the error message disproves this)
2) GlobalUsage fetches many file objects, and there's a memory leak somewhere
3) we're in the post-parse LinksUpdate hook while the parser has just finished
parsing a large wiki page, so the wikitext and all sorts of metadata and parser
data structures are still in memory


[23-Nov-2011 15:41:47] Fatal error: Allowed memory size of 125829120 bytes exhausted (tried to allocate 8208 bytes) at /usr/local/apache/common-local/php-1.18/includes/objectcache/MemcachedClient.php on line 918
Server: srv300
Method: POST
URL: http://commons.wikimedia.org/w/api.php
Cookie: commonswiki_session=CENSORED; commonswikiUserID=126604; commonswikiToken=CENSORED; commonswikiUserName=QICbot;
Backtrace:
#0 /usr/local/apache/common-local/php-1.18/includes/objectcache/MemcachedClient.php(918): unserialize('a:15:{s:7:"vers...')
#1 /usr/local/apache/common-local/php-1.18/includes/objectcache/MemcachedClient.php(436): MWMemcached->_load_items(Resource id #146, Array)
#2 /usr/local/apache/common-local/php-1.18/includes/objectcache/MemcachedPhpBagOStuff.php(63): MWMemcached->get('commonswiki:fil...')
#3 /usr/local/apache/common-local/php-1.18/includes/filerepo/LocalFile.php(184): MemcachedPhpBagOStuff->get('commonswiki:fil...')
#4 /usr/local/apache/common-local/php-1.18/includes/filerepo/LocalFile.php(340): LocalFile->loadFromCache()
#5 /usr/local/apache/common-local/php-1.18/includes/filerepo/LocalFile.php(569): LocalFile->load()
#6 /usr/local/apache/common-local/php-1.18/includes/filerepo/FileRepo.php(126): LocalFile->exists()
#7 /usr/local/apache/common-local/php-1.18/includes/filerepo/FileRepo.php(179): FileRepo->findFile('Case_??_la_chef...', Array)
#8 /usr/local/apache/common-local/php-1.18/extensions/GlobalUsage/GlobalUsageHooks.php(20): FileRepo->findFiles(Array)
#9 [internal function]: GlobalUsageHooks::onLinksUpdateComplete(Object(LinksUpdate))
#10 /usr/local/apache/common-local/php-1.18/includes/Hooks.php(216): call_user_func_array('GlobalUsageHook...', Array)
#11 /usr/local/apache/common-local/php-1.18/includes/GlobalFunctions.php(3626): Hooks::run('LinksUpdateComp...', Array)
#12 /usr/local/apache/common-local/php-1.18/includes/LinksUpdate.php(113): wfRunHooks('LinksUpdateComp...', Array)
#13 /usr/local/apache/common-local/php-1.18/includes/WikiPage.php(2022): LinksUpdate->doUpdate()
#14 /usr/local/apache/common-local/php-1.18/includes/WikiPage.php(1200): WikiPage->doEditUpdates(Object(Revision), Object(User), Array)
#15 [internal function]: WikiPage->doEdit('==Man made stru...', 'sorted into the...', 118)
#16 /usr/local/apache/common-local/php-1.18/includes/Article.php(1921): call_user_func_array(Array, Array)
#17 [internal function]: Article->__call('doEdit', Array)
#18 /usr/local/apache/common-local/php-1.18/includes/EditPage.php(1214): Article->doEdit('==Man made stru...', 'sorted into the...', 118)
#19 /usr/local/apache/common-local/php-1.18/includes/api/ApiEditPage.php(273): EditPage->internalAttemptSave(Array, true)
#20 /usr/local/apache/common-local/php-1.18/includes/api/ApiMain.php(692): ApiEditPage->execute()
#21 /usr/local/apache/common-local/php-1.18/includes/api/ApiMain.php(358): ApiMain->executeAction()
#22 /usr/local/apache/common-local/php-1.18/includes/api/ApiMain.php(342): ApiMain->executeActionWithErrorHandling()
#23 /usr/local/apache/common-local/php-1.18/api.php(115): ApiMain->execute()
#24 /usr/local/apache/common-local/live-1.5/api.php(3): require('/usr/local/apac...')
#25 {main}
Comment 9 Roan Kattouw 2011-11-23 15:58:56 UTC
(In reply to comment #8)
> I worked with Daniel to obtain a backtrace; it's an OOM error. Offhand I'd say
> there are three explanations for this:
> 1) the object fetched from memcached is large, e.g. because it includes
> metadata (I now realize that the "(tried to allocate 8208 bytes)" part of the
> error message disproves this)
> 2) GlobalUsage fetches many file objects, and there's a memory leak somewhere
> 3) we're in the post-parse LinksUpdate hook while the parser has just finished
> parsing a large wiki page, so the wikitext and all sorts of metadata and parser
> data structures are still in memory
> 
Daniel blanked the page in question, upon which the bot edit worked fine. He mentioned the page used to contain a gallery with 7000 images, so I'm now leaning towards theory #2.
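
Purely to illustrate theory #2: the backtrace shows GlobalUsage handing the page's file links (reportedly ~7000 titles) to FileRepo::findFiles() in one call inside the LinksUpdate hook. The sketch below shows the general batching pattern such a code path could use, written in Python with invented names; the actual fix (r104049, see comment 10) lives in the PHP GlobalUsage extension and may take a different approach:

# Hypothetical illustration of batching ~7000 file lookups instead of
# materializing every file object at once; all names here are invented.
def chunked(items, size=500):
    """Yield fixed-size slices of a list."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def record_global_usage(file_titles, lookup, record):
    """lookup(batch) returns {title: info}; record(title, info) stores one usage row."""
    for batch in chunked(file_titles):
        found = lookup(batch)      # only one batch of file objects in memory at a time
        for title, info in found.items():
            record(title, info)
        # batch results go out of scope here and can be garbage-collected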
Comment 10 Roan Kattouw 2011-11-23 16:51:32 UTC
(In reply to comment #9)
> Daniel blanked the page in question, upon which the bot edit worked fine. He
> mentioned the page used to contain a gallery with 7000 images, so I'm now
> leaning towards theory #2.
Turns out I was right. Fixed in r104049 and deployed.
