Last modified: 2014-10-21 16:34:21 UTC
We had a problem where the WMES wiki was painfully slow. The access log showing thousands of Special%3ARunJobs entries (NOT one per request). $ grep -c Special%3ARunJobs access.log.2014-06-03 63116 $ grep -c -v Special%3ARunJobs access.log.2014-06-03 472 (that would be 133 RunJobs per request‼) Changing $wgJobRunRate to 0 from the default of 1 “fixed” the slowness. From the external appearance, it looks like each Special:RunJobs then called another Special:RunJobs with an infinite recursion until the time limit. There are 4 jobs with old dates in the queue, but even if they are not completable, they shouldn't have produced so many POSTs. At the point $wgJobRunRate was disabled, the wiki was running 00f91912c64 (master), but it had been ‘broken’ for weeks.
Maybe the jobs where really slow and the run rate of 1 kept letting them build up. They could really some sort of bucketing scheme to set a configurable limit on the number of concurrent requests somehow...
Same here with v1.23.0: for every request, the following gets logged into access.log: ::ffff:11.22.33.44 - - [18/Jun/2014:09:58:14 +0200] "POST /mediawiki/index.php?title=Special%3ARunJobs&tasks=jobs&maxjobs=1&sigexpiry=1403078299&signature=6da0cc79966609c2b51937b9175d7728b290ff0a HTTP/1.1" 411 357 "-" "-" 192.168.0.10 www.example.org - [18/Jun/2014:09:58:23 +0200] "GET /wiki/Main_Page HTTP/1.1" 200 18203 "-" "curl/7.37.0" Setting wgJobRunRate to "0" seems to "fix" this. I don't understand what "jobs" are now not being run, but if it has any negative impact I shall change it to "0.01" or something to lessen the POST requests.