Last modified: 2013-08-26 22:32:19 UTC
DjVu files sometimes have massive metadata that OOMs apaches when doing relatively mundane things. There are hacks to avoid caching this large info in RepoGroup, but problems still persist. For example, one category is unviewable:

Server: mw1170
Method: GET
URL: http://commons.wikimedia.org/wiki/Category:1877_books
Cookie: wikiEditor-0-booklet-Edittools-page=Edittools1; centralnotice_bucket=0-4.2
Backtrace:
#0 /usr/local/apache/common-local/php-1.22wmf2/includes/db/DatabaseMysql.php(205): mysql_fetch_object(Resource id #146)
#1 /usr/local/apache/common-local/php-1.22wmf2/includes/db/Database.php(1546): DatabaseMysql->fetchObject(Object(ResultWrapper))
#2 /usr/local/apache/common-local/php-1.22wmf2/includes/filerepo/file/LocalFile.php(372): DatabaseBase->selectRow('image', Array, Array, 'LocalFile::load...')
#3 /usr/local/apache/common-local/php-1.22wmf2/includes/filerepo/file/LocalFile.php(471): LocalFile->loadExtraFromDB()
#4 /usr/local/apache/common-local/php-1.22wmf2/includes/filerepo/file/LocalFile.php(666): LocalFile->load(1)
#5 /usr/local/apache/common-local/php-1.22wmf2/includes/media/DjVu.php(236): LocalFile->getMetadata()
#6 /usr/local/apache/common-local/php-1.22wmf2/includes/media/DjVu.php(309): DjVuHandler->getMetaTree(Object(LocalFile))
#7 /usr/local/apache/common-local/php-1.22wmf2/includes/filerepo/file/LocalFile.php(613): DjVuHandler->getPageDimensions(Object(LocalFile), 1)
#8 /usr/local/apache/common-local/php-1.22wmf2/includes/media/ImageHandler.php(36): LocalFile->getWidth()
#9 /usr/local/apache/common-local/php-1.22wmf2/includes/filerepo/file/File.php(590): ImageHandler->canRender(Object(LocalFile))
#10 /usr/local/apache/common-local/php-1.22wmf2/includes/filerepo/file/File.php(865): File->canRender()
#11 /usr/local/apache/common-local/php-1.22wmf2/includes/ImageGallery.php(300): File->transform(Array)
#12 /usr/local/apache/common-local/php-1.22wmf2/includes/CategoryViewer.php(430): ImageGallery->toHTML()
#13 /usr/local/apache/common-local/php-1.22wmf2/includes/CategoryViewer.php(109): CategoryViewer->getImageSection()
#14 /usr/local/apache/common-local/php-1.22wmf2/includes/CategoryPage.php(110): CategoryViewer->getHTML()
#15 /usr/local/apache/common-local/php-1.22wmf2/includes/CategoryPage.php(76): CategoryPage->closeShowCategory()
#16 /usr/local/apache/common-local/php-1.22wmf2/includes/actions/ViewAction.php(44): CategoryPage->view()
#17 /usr/local/apache/common-local/php-1.22wmf2/includes/Wiki.php(439): ViewAction->show()
#18 /usr/local/apache/common-local/php-1.22wmf2/includes/Wiki.php(305): MediaWiki->performAction(Object(CategoryTreeCategoryPage), Object(Title))
#19 /usr/local/apache/common-local/php-1.22wmf2/includes/Wiki.php(565): MediaWiki->performRequest()
#20 /usr/local/apache/common-local/php-1.22wmf2/includes/Wiki.php(458): MediaWiki->main()
#21 /usr/local/apache/common-local/php-1.22wmf2/index.php(59): MediaWiki->run()
#22 /usr/local/apache/common-local/w/index.php(3): require('/usr/local/apac...')
#23 {main}
Do we know how widespread this issue is? Have we seen any mentions of it on the Commons VP or similar? (cc'ing Andre for his assistance there)
I have not seen any other mentions of this so far.
Should be much better after https://gerrit.wikimedia.org/r/#/c/63696/ and https://gerrit.wikimedia.org/r/#/c/63718/
The DjVu handler could possibly add another layer of caching. The category OOMs are due to thumbnail functions calling getWidth(), which triggers getPageDimensions() on the handler, which in turn loads the massive blob and builds a tree. Such a cache would need to keep the dimension info for each page.
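A rough sketch of what such a per-page dimension cache might look like, written in Python for brevity rather than in MediaWiki's PHP. Everything here is hypothetical: the function names, the in-process dict used as the cache, and the assumption that the metadata blob is XML with one OBJECT element per page carrying width and height attributes. The point is only that the large blob is parsed once and just the small list of dimensions is kept.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-process cache: file key -> list of (width, height), one per page.
# A real deployment would more likely use a shared cache; this is only a sketch.
_dimension_cache = {}


def extract_page_dimensions(metadata_xml):
    """Parse the metadata blob once and keep only (width, height) per page.

    Assumes each page is an OBJECT element with width/height attributes.
    """
    root = ET.fromstring(metadata_xml)
    dims = []
    for obj in root.iter("OBJECT"):
        dims.append((int(obj.get("width", 0)), int(obj.get("height", 0))))
    return dims


def get_page_dimensions(file_key, page, load_metadata):
    """Return (width, height) for a 1-based page number, or None if out of range.

    load_metadata is a callable that fetches the large blob; it is only
    invoked on a cache miss, so repeated lookups never touch the blob again.
    """
    dims = _dimension_cache.get(file_key)
    if dims is None:
        dims = extract_page_dimensions(load_metadata())
        _dimension_cache[file_key] = dims
    if 1 <= page <= len(dims):
        return dims[page - 1]
    return None
```

With something along these lines, a gallery that only needs the width of page 1 would read a short list of integers instead of loading and parsing the full metadata tree for every file in the category.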
The question from comment 1 still stands, even more so now that the two patches from comment 3 have been merged: do we know how widespread this issue is nowadays? Is there a known test case for it? For example, http://commons.wikimedia.org/wiki/Category:1877_books is now accessible and can be viewed.
Closing, though the whole handler is still very inefficient.