Last modified: 2011-11-23 16:51:32 UTC
Similar to bug 30201 API request using pywikipediabot now frequently return error 500. This screws up my Quality Images bot completely as it fails in a half finished state, creating a terrible mess.
Example queries etc?
What queries? It is a simple page.put() in pywikipedia: Updating page [[Commons:Quality images/Subject/Places/Man made structures]] via API HTTPError: 500 Internal Server Error WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server is down. Retrying in 1 minutes...
(In reply to comment #2) > What queries? It is a simple page.put() in pywikipedia: > > Updating page [[Commons:Quality images/Subject/Places/Man made structures]] via > API > HTTPError: 500 Internal Server Error > WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. > Maybe the server is down. Retrying in 1 minutes... Could you get on IRC some time and find me (RoanKattouw)? Then we can do a controlled experiment where you try to reproduce the error while I watch the server logs.
(In reply to comment #2) > What queries? It is a simple page.put() in pywikipedia: > > Updating page [[Commons:Quality images/Subject/Places/Man made structures]] via > API > HTTPError: 500 Internal Server Error > WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. > Maybe the server is down. Retrying in 1 minutes... Right, but you didn't say this in your original post, so you could've been doing one of many different things. We're not mind readers ;)
The page in question is very large (3.3Mb). Might this be a problem? I just went through the logs of the last few weeks and 17 out of 18 times the error happens when the bot is trying to update that particular page! Sometimes it works after a few retries, sometimes all retries fail (pywikipediabot seems to ignore the increased retry count I set).
(In reply to comment #5) > The page in question is very large (3.3Mb). Might this be a problem? > You mean you were trying to save 3.3 megs of *wikitext*? That should just be rejected, because $wgMaxArticleSize is set to 2000 KB.
Oops, that scratch that. I was looking the the rendered HTML. Wikitext is 'just' 276K. However the observation still holds, that this particular page causes most of the bot failures (and it is not the first page that is accessed by the bot!)
I worked with Daniel to obtain a backtrace; it's an OOM error. Offhand I'd say there are three explanations for this: 1) the object fetched from memcached is large, e.g. because it includes metadata (I now realize that the "(tried to allocate 8208 bytes)" part of the error message disproves this) 2) GlobalUsage fetches many file objects, and there's a memory leak somewhere 3) we're in the post-parse LinksUpdate hook while the parser has just finished parsing a large wiki page, so the wikitext and all sorts of metadata and parser data structures are still in memory [23-Nov-2011 15:41:47] Fatal error: Allowed memory size of 125829120 bytes exhausted (tried to allocate 8208 bytes) at /usr/local/apache/common-local/php-1.18/includes/objectcache/MemcachedClient.php on line 918 Server: srv300 Method: POST URL: http://commons.wikimedia.org/w/api.php Cookie: commonswiki_session=CENSORED; commonswikiUserID=126604; commonswikiToken=CENSORED; commonswikiUserName=QICbot; Backtrace: #0 /usr/local/apache/common-local/php-1.18/includes/objectcache/MemcachedClient.php(918): unserialize('a:15:{s:7:"vers...') #1 /usr/local/apache/common-local/php-1.18/includes/objectcache/MemcachedClient.php(436): MWMemcached->_load_items(Resource id #146, Array) #2 /usr/local/apache/common-local/php-1.18/includes/objectcache/MemcachedPhpBagOStuff.php(63): MWMemcached->get('commonswiki:fil...') #3 /usr/local/apache/common-local/php-1.18/includes/filerepo/LocalFile.php(184): MemcachedPhpBagOStuff->get('commonswiki:fil...') #4 /usr/local/apache/common-local/php-1.18/includes/filerepo/LocalFile.php(340): LocalFile->loadFromCache() #5 /usr/local/apache/common-local/php-1.18/includes/filerepo/LocalFile.php(569): LocalFile->load() #6 /usr/local/apache/common-local/php-1.18/includes/filerepo/FileRepo.php(126): LocalFile->exists() #7 /usr/local/apache/common-local/php-1.18/includes/filerepo/FileRepo.php(179): FileRepo->findFile('Case_??_la_chef...', Array) #8 /usr/local/apache/common-local/php-1.18/extensions/GlobalUsage/GlobalUsageHooks.php(20): FileRepo->findFiles(Array) #9 [internal function]: GlobalUsageHooks::onLinksUpdateComplete(Object(LinksUpdate)) #10 /usr/local/apache/common-local/php-1.18/includes/Hooks.php(216): call_user_func_array('GlobalUsageHook...', Array) #11 /usr/local/apache/common-local/php-1.18/includes/GlobalFunctions.php(3626): Hooks::run('LinksUpdateComp...', Array) #12 /usr/local/apache/common-local/php-1.18/includes/LinksUpdate.php(113): wfRunHooks('LinksUpdateComp...', Array) #13 /usr/local/apache/common-local/php-1.18/includes/WikiPage.php(2022): LinksUpdate->doUpdate() #14 /usr/local/apache/common-local/php-1.18/includes/WikiPage.php(1200): WikiPage->doEditUpdates(Object(Revision), Object(User), Array) #15 [internal function]: WikiPage->doEdit('==Man made stru...', 'sorted into the...', 118) #16 /usr/local/apache/common-local/php-1.18/includes/Article.php(1921): call_user_func_array(Array, Array) #17 [internal function]: Article->__call('doEdit', Array) #18 /usr/local/apache/common-local/php-1.18/includes/EditPage.php(1214): Article->doEdit('==Man made stru...', 'sorted into the...', 118) #19 /usr/local/apache/common-local/php-1.18/includes/api/ApiEditPage.php(273): EditPage->internalAttemptSave(Array, true) #20 /usr/local/apache/common-local/php-1.18/includes/api/ApiMain.php(692): ApiEditPage->execute() #21 /usr/local/apache/common-local/php-1.18/includes/api/ApiMain.php(358): ApiMain->executeAction() #22 /usr/local/apache/common-local/php-1.18/includes/api/ApiMain.php(342): ApiMain->executeActionWithErrorHandling() #23 /usr/local/apache/common-local/php-1.18/api.php(115): ApiMain->execute() #24 /usr/local/apache/common-local/live-1.5/api.php(3): require('/usr/local/apac...') #25 {main}
(In reply to comment #8) > I worked with Daniel to obtain a backtrace; it's an OOM error. Offhand I'd say > there are three explanations for this: > 1) the object fetched from memcached is large, e.g. because it includes > metadata (I now realize that the "(tried to allocate 8208 bytes)" part of the > error message disproves this) > 2) GlobalUsage fetches many file objects, and there's a memory leak somewhere > 3) we're in the post-parse LinksUpdate hook while the parser has just finished > parsing a large wiki page, so the wikitext and all sorts of metadata and parser > data structures are still in memory > Daniel blanked the page in question, upon which the bot edit worked fine. He mentioned the page used to contain a gallery with 7000 images, so I'm now leaning towards theory #2.
(In reply to comment #9) > Daniel blanked the page in question, upon which the bot edit worked fine. He > mentioned the page used to contain a gallery with 7000 images, so I'm now > leaning towards theory #2. Turns out I was right. Fixed in r104049 and deployed.