Last modified: 2012-11-27 09:08:52 UTC
We have upgraded gallium to Ubuntu Precise (bug 41053). The following weekend I noticed that some jobs were failing because of PHP timeouts or were just slow. This Wednesday we had tons of jobs piling up in the build queue, which rarely happens.

From a quick script:

2012-10-25_14-42-58 took 131.598 s.
2012-10-25_14-47-54 took 137.581 s.
2012-10-25_15-31-04 took 139.274 s.
2012-10-25_15-34-04 took 151.464 s.
## Precise upgrade there.
2012-10-25_17-35-08 took 253.544 s.
2012-10-25_17-39-22 took 1.444 s.
2012-10-25_17-39-23 took 1 s.
2012-10-25_17-39-24 took 4.674 s.
2012-10-25_17-39-29 took 1.304 s.
2012-10-25_17-47-13 took 229.234 s.
2012-10-25_17-51-02 took 239.811 s.
2012-10-25_17-55-02 took 225.362 s.
2012-10-25_17-58-48 took 242.351 s.
Output is from the tools/jdurationreport.php script in integration/jenkins.git https://gerrit.wikimedia.org/r/#/c/31139/
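For a rough before/after comparison of such figures, a minimal sketch (not the actual jdurationreport.php script) could average the durations with awk. It assumes the report format "<timestamp> took <seconds> s."; the function name mean_duration is made up for illustration.

```shell
#!/bin/sh
# Sketch only: average the "took N s." figures read from stdin.
# Assumes each duration directly follows the word "took".
mean_duration() {
    LC_ALL=C awk '
        {
            # Scan every field so several entries on one line also work.
            for (i = 1; i < NF; i++) {
                if ($i == "took") { sum += $(i + 1); n++ }
            }
        }
        END { if (n > 0) printf "%.1f\n", sum / n }
    '
}
```

Feeding it the four pre-upgrade lines gives about 140 s, and the five full-length post-upgrade builds (ignoring the builds that finished in under 5 s) give about 238 s, roughly a 1.7x slowdown.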
Created attachment 11273: Ganglia graph showing a rise in CPU I/O wait after the Precise upgrade
Looking at the PHPUnit groups:

Parser tests went from 68 s to 88 s.
Databaseless tests seem roughly constant, from 13.8 s to 14.1 s.
Dumps tests went from 29 s to 63 s.

Parser and Dumps tests do a lot of I/O.
One can reproduce this by running the Dump tests against a MediaWiki snapshot:

wget https://integration.mediawiki.org/nightly/mediawiki/core/mediawiki-78a5729.zip
unzip mediawiki-78a5729.zip
cd mediawiki-78a5729
# Install a basic database using sqlite as a backend
php maintenance/install.php benchwiki sysop --pass secret --dbtype sqlite --dbpath .
# Run the Dump test suite
time php tests/phpunit/phpunit.php --group Dump
Jenkins keeps track of build durations. For the Dumps tests, the dashboard is at https://integration.mediawiki.org/ci/job/MediaWiki-Tests-Dumps/buildTimeTrend
*** Bug 41657 has been marked as a duplicate of this bug. ***
Raising priority per discussion with Rob. This makes tests fail randomly.
I'm not sure exactly why it got worse in Precise, but moving the temporary files such as SQLite data files to a temporary filesystem (tmpfs) would be a simple fix for the problem. They are only 3.7 MB:

root@gallium:/var/lib/jenkins/jobs# find -maxdepth 3 -name data -type d | xargs du -csh
4.0K ./MediaWiki-Tests-Databaseless/workspace/data
256K ./MediaWiki-Tests-Extensions/workspace/data
244K ./Ext-MobileFrontend/workspace/data
228K ./_shared/workspace/data
224K ./MediaWiki-analysis/workspace/data
276K ./Ext-TranslationNotifications/workspace/data
252K ./MediaWiki-Tests-API/workspace/data
228K ./Ext-WebFonts/workspace/data
292K ./Ext-Translate/workspace/data
272K ./MediaWiki-Tests-Misc/workspace/data
256K ./MediaWiki-Tests-Dumps/workspace/data
248K ./Ext-TitleBlacklist/workspace/data
228K ./Ext-Narayam/workspace/data
236K ./Ext-Wikibase-old/workspace/data
260K ./MediaWiki-Tests-Parser/workspace/data
244K ./Ext-UniversalLanguageSelector/workspace/data
4.0K ./MediaWiki-CheckStyle/workspace/data
3.7M total

I can set up symlinks.
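The symlink approach could look roughly like the sketch below. It is not the exact set of commands run on gallium: the function name relocate_data_dir and the destination directory are hypothetical, and the destination is assumed to already sit on a mounted tmpfs.

```shell
#!/bin/sh
# Sketch only: move one job's "data" directory to a tmpfs-backed
# location and leave a symlink in its place.
relocate_data_dir() {
    src=$1          # e.g. <job>/workspace/data
    dest_root=$2    # directory on a tmpfs mount (hypothetical location)
    name=$3         # unique per-job name under dest_root

    # Already relocated: nothing to do.
    [ -L "$src" ] && return 0
    [ -d "$src" ] || { echo "not a directory: $src" >&2; return 1; }

    mkdir -p "$dest_root/$name" || return 1
    # Copy the contents (dotfiles included), then swap in the symlink.
    cp -a "$src/." "$dest_root/$name/" || return 1
    rm -rf "$src" || return 1
    ln -s "$dest_root/$name" "$src"
}
```

A call might be `relocate_data_dir MediaWiki-Tests-Dumps/workspace/data /some/tmpfs/mount MediaWiki-Tests-Dumps`, with the mount point chosen by whoever sets it up. Since tmpfs contents do not survive a reboot, the target directories would have to be recreated at boot for the symlinks to stay valid.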
Done, and build 6515 took only 33 seconds, so I guess it is fixed. https://integration.mediawiki.org/ci/view/All-enabled/job/MediaWiki-Tests-Dumps/6515/
Just for the record, the numbers that hashar used in comment #0 are available at: https://integration.mediawiki.org/ci/job/MediaWiki-GIT-Fetching/buildTimeTrend They went back from 4 min 13 sec (253 s) to 2 min 24 sec (144 s).
The tmpfs hack is being normalized in puppet with https://gerrit.wikimedia.org/r/#/c/35159/