Last modified: 2013-09-11 17:38:06 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T38835, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 36835 - job queue monitoring looks for 1.18 dir and fails / $wmfExtendedVersionNumber.php
Status: UNCONFIRMED
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Low normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2012-05-14 12:50 UTC by Daniel Zahn
Modified: 2013-09-11 17:38 UTC (History)

Description Daniel Zahn 2012-05-14 12:50:54 UTC
Currently all the "check_job_queue" checks on Nagios fail with:

	JOBQUEUE CRITICAL - check plugin (check_job_queue) or PHP errors - 

Investigating this, I saw that the problem does not appear to be in "check_job_queue" itself but rather in CommonSettings.php; check_job_queue does not surface these PHP errors:

PHP Warning:  require(/home/wikipedia/common/php-1.18/../wmf-config/ExtensionMessages-1.18.php): failed to open stream: No such file or directory in /home/wikipedia/common/wmf-config/CommonSettings.php on line 2506

and:

PHP Fatal error:  require(): Failed opening required '/home/wikipedia/common/php-1.18/../wmf-config/ExtensionMessages-1.18.php' (include_path='/home/wikipedia/common/php-1.20wmf2/extensions/OggHandler/PEAR/File_Ogg:/home/wikipedia/common/php-1.18:/home/wikipedia/common/php-1.18/lib:/usr/local/lib/php:/usr/share/php') in /home/wikipedia/common/wmf-config/CommonSettings.php on line 2506

Line 2506 in CommonSettings.php is:

require( "$wmfConfigDir/ExtensionMessages-$wmfExtendedVersionNumber.php" );

So it is looking in /php-1.18/ because $wmfExtendedVersionNumber is set to 1.18, and that setting seems outdated.

Where should it be fixed?
Comment 1 Daniel Zahn 2012-05-14 13:59:59 UTC
15:37 < jeremyb> do you have anything in /home/wikipedia/common/wikiversion* 
                 ?
15:37 < mutante> where should it get the info from?
15:38 < mutante> yea, wikiversion.data
15:38 < mutante> .dat
15:38 < mutante>  2012-05-09
15:39 < mutante> the string "18" does not appear in the file 
15:40 < mutante> and wikiversions.cdb , modified 05-10
15:41 < jeremyb> so, strace and find out which wikiversions file it's using? 
                 or if it's using one at all?

15:41 < jeremyb> 1.18 was once hardcoded into CommonSettings.php as a 
                 fallback. but not in the current cluster version so I'm 
                 looking elsewhere
15:43 < mutante> open("/usr/local/apache/common-local/wikiversions.cdb", 
                 O_RDONLY) = 3
15:43 < jeremyb> there you go

15:44 < jeremyb> does that (or it's .dat) have 1.18?
15:45 < mutante> yes
15:45 < mutante> so "getMWVersion" should be changed to use /home ?
15:46 < mutante> or add mechanism to copy to /usr/local
15:46 < jeremyb> or -local should be made to be reliably up to date

13:48 mutante: copying wikiversions.dat/.cdb files from /home to /usr/local on spence to replace the outdated copies there, which fixes check_job_queue (thanks jeremyb)


./check_job_queue JOBQUEUE OK - all job queues below 10,000
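
The underlying problem in this comment was that the wikiversions copy under /usr/local had drifted from the one under /home. A minimal sketch of a check that would flag such drift, assuming both copies are readable from the monitoring host. The function name files_in_sync and the Nagios-style return codes (0 OK, 2 CRITICAL, 3 UNKNOWN) are illustrative assumptions, not part of the existing plugin:

```bash
#!/bin/bash
# Hypothetical helper (not part of check_job_queue): compare a master file
# with a local copy and report the result Nagios-style.
files_in_sync() {
    local master="$1" copy="$2"

    # If either file is unreadable we cannot say anything useful.
    if [ ! -r "$master" ] || [ ! -r "$copy" ]; then
        echo "UNKNOWN - cannot read $master or $copy"
        return 3
    fi

    # cmp -s is silent and exits 0 only when the files are byte-identical.
    if cmp -s "$master" "$copy"; then
        echo "OK - $copy matches $master"
        return 0
    fi

    echo "CRITICAL - $copy differs from $master"
    return 2
}

# Example call, with paths as they appear in the chat log above:
#   files_in_sync /home/wikipedia/common/wikiversions.cdb \
#                 /usr/local/apache/common-local/wikiversions.cdb
```

This only detects a stale copy. It does not decide which side is authoritative or copy anything.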
Comment 2 Sam Reed (reedy) 2012-05-15 02:53:21 UTC
You should probably use the /usr/local/apache/common/php/maintenance/showJobs.php in some way or another.


We could just push all the MW files to spence...
Comment 3 Andre Klapper 2012-10-26 21:55:40 UTC
Daniel: Is this still an issue, or can this be closed as obsolete?
Comment 4 Andre Klapper 2013-03-27 10:52:57 UTC
Daniel: Is this still an issue, or can this be closed as obsolete?
Comment 5 Daniel Zahn 2013-03-27 23:57:56 UTC
What I wrote in 2012 is not an issue anymore. Since then we have switched to a single job_queue check, which was OK at some point but may be broken again, because:

Current Status:	OK (for 72d 17h 59m 48s)
Status Information:	Could not open input file: /home/wikipedia/common/multiversion/MWScript.php
JOBQUEUE OK - all job queues below 10,000


https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=neon&service=check_job_queue
Comment 6 Daniel Zahn 2013-03-28 00:03:20 UTC
#!/bin/bash
# nagios plugin to check the mediawiki job queue

LARGEQUEUES=
while read wiki count
do
        if [ ! $(echo "$count" | grep -E "^[0-9]+$") ]; then
                echo "JOBQUEUE CRITICAL - check plugin (`basename $0`) or PHP errors - $wiki"
                exit 2
        elif [ $count -gt 9999 ]; then
                LARGEQUEUES="$LARGEQUEUES, $wiki ($count)"
        fi
# The line below is a bash-ism that's needed for the LARGEQUEUES variable above to be in the right scope
# If you do php ... | while read wiki count; do LARGEQUEUE=blah; done , then the LARGEQUEUE variable will
# be manipulated in a subshell and the changes won't be visible to the if check below
done < <( php /home/wikipedia/common/multiversion/MWScript.php extensions/WikimediaMaintenance/getJobQueueLengths.php )
if [ -z "$LARGEQUEUES" ]; then
        echo "JOBQUEUE OK - all job queues below 10,000"
        exit 0
else
        echo "JOBQUEUE CRITICAL - the following wikis have more than 9,999 jobs: $LARGEQUEUES"
        exit 2
fi
Comment 7 Daniel Zahn 2013-03-28 00:05:51 UTC
root@neon:/usr/lib/nagios/plugins# ./check_job_queue 
Could not open input file: /home/wikipedia/common/multiversion/MWScript.php
JOBQUEUE OK - all job queues below 10,000

root@neon:~# cd /h/w/
-bash: cd: /h/w/: No such file or directory

Of course, neon does not have /h/w. spence did. This could never work if it relies on that.
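
The output above shows the plugin printing "JOBQUEUE OK" even though the PHP command could not run: the loop reads no lines, so LARGEQUEUES stays empty. A minimal sketch of one way to avoid that false OK, assuming the queue-length command exits non-zero when it fails. The function name check_queues, its arguments, and the UNKNOWN (3) return code are illustrative assumptions, not the deployed plugin:

```bash
#!/bin/bash
# Sketch only: run the queue-length command given as arguments, and refuse
# to report OK when it failed or printed nothing.
# Usage: check_queues THRESHOLD COMMAND [ARGS...]
check_queues() {
    local threshold="$1"; shift
    local output status=0 large="" wiki count

    # Capture stdout and stderr together, and remember the exit status.
    output=$("$@" 2>&1) || status=$?
    if [ "$status" -ne 0 ] || [ -z "$output" ]; then
        echo "JOBQUEUE UNKNOWN - command failed (exit $status): ${output:-no output}"
        return 3
    fi

    # A here-document keeps the loop in the current shell, so "large"
    # is still visible after the loop.
    while read -r wiki count; do
        case "$count" in
            ''|*[!0-9]*)
                echo "JOBQUEUE CRITICAL - unexpected output line: $wiki $count"
                return 2
                ;;
        esac
        if [ "$count" -gt "$threshold" ]; then
            large="$large, $wiki ($count)"
        fi
    done <<EOF
$output
EOF

    if [ -z "$large" ]; then
        echo "JOBQUEUE OK - all job queues at or below $threshold"
        return 0
    fi
    echo "JOBQUEUE CRITICAL - queues above $threshold:${large#,}"
    return 2
}

# Example call, using the command from the plugin in comment 6:
#   check_queues 9999 php /home/wikipedia/common/multiversion/MWScript.php \
#       extensions/WikimediaMaintenance/getJobQueueLengths.php
```

With this shape, a missing MWScript.php would surface as UNKNOWN instead of a silent OK, though the path problem on neon would still need fixing separately.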
Comment 8 Aaron Schulz 2013-06-25 21:22:37 UTC
Is this still a problem?
