Last modified: 2014-05-20 18:01:03 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T66988, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 64988 - jsub not installed on the queue machines
Status: RESOLVED FIXED
Product: Wikimedia Labs
Classification: Unclassified
Component: tools
Version: unspecified
Hardware: All All
Importance: Unprioritized critical
Target Milestone: ---
Assigned To: Marc A. Pelletier
Reported: 2014-05-07 06:01 UTC by bgwhite
Modified: 2014-05-20 18:01 UTC (History)

Description bgwhite 2014-05-07 06:01:30 UTC
On queue machines: Can't exec "/usr/bin/jsub": No such file or directory.
                               Can't exec "/usr/bin/qsub": No such file or directory.

With the new cron, I have to send programs to the queue.  These programs determine whether a dump file for a particular wiki is available, or whether a daily scan of a particular wiki needs to run.  If a dump file is available or a daily scan needs to be run, these programs send the checkwiki scans to the queue for the appropriate wiki language.  The only other option is to write a 100+ line crontab.
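As a rough illustration of the dispatch logic described above, here is a minimal Python sketch. It is not the reporter's actual code: the dump file naming (`<wiki>.xml.bz2`), the `dump_dir` layout, and the scan script name `./checkwiki_scan.sh` are all assumptions made up for the example. It only builds the `jsub` command lines and does not run them.

```python
import os


def wikis_to_scan(wikis, dump_dir, daily_wikis):
    """Pick wikis that have a dump file waiting or are due for a daily scan."""
    selected = []
    for wiki in wikis:
        # Hypothetical naming scheme for dump files; adjust to the real layout.
        dump_path = os.path.join(dump_dir, wiki + ".xml.bz2")
        if wiki in daily_wikis or os.path.exists(dump_path):
            selected.append(wiki)
    return selected


def jsub_commands(wikis, scan_script="./checkwiki_scan.sh"):
    """Build one 'jsub <script> <wiki>' argument list per selected wiki."""
    # scan_script is a hypothetical name, not taken from the bug report.
    return [["jsub", scan_script, wiki] for wiki in wikis]
```

Keeping selection and command construction separate means the per-wiki crontab entries collapse into one dispatcher, which is the alternative to the 100+ line crontab.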
Comment 1 Johan 2014-05-07 19:27:50 UTC
This affects me too. I run a Python script from crontab that constructs and submits SGE array jobs. That's no longer possible. I did some research, and it seems the exec instances are no longer permitted to access the SGE master:

error: denied: host "tools-exec-05.eqiad.wmflabs" is neither submit nor admin host
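For context, a script of this kind might look roughly like the sketch below. It is an assumption-laden illustration, not the commenter's script: the job script name is hypothetical, and only the standard SGE `qsub -t first-last` array-job option is used. Calling `submit()` from a host that SGE does not accept as a submit host is what would produce the "denied" error above.

```python
import subprocess


def array_job_command(script, first_task, last_task):
    """Build a qsub command line for an SGE array job over a task range."""
    if first_task < 1 or last_task < first_task:
        raise ValueError("task range must satisfy 1 <= first <= last")
    # 'qsub -t first-last' asks SGE to run the script once per task ID.
    return ["qsub", "-t", "{}-{}".format(first_task, last_task), script]


def submit(command):
    """Run the submission; only succeeds on a host SGE treats as a submit host."""
    return subprocess.call(command)
```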
Comment 2 Tim Landscheidt 2014-05-08 00:30:02 UTC
The exec instances are not "no longer" permitted to submit jobs, but never were :-).  But until the crontab change this wasn't relevant.
Comment 3 Johan 2014-05-08 10:13:48 UTC
Is there a reason for this? I use a Python script that constructs an SGE array job. With this new rule the Python script must run on the exec instances and won't be able to submit the array job. Before, the Python script ran on tools-dev.
Comment 4 bgwhite 2014-05-08 20:15:45 UTC
(In reply to Johan from comment #3)
> Is there a reason for this? I use a Python script that constructs an SGE
> array job. With this new rule the Python script must run on the exec
> instances and won't be able to submit the array job. Before, the Python
> script ran on tools-dev.

Johan, while you and I were running simple scripts to send jobs to the queue, others were not.  They were running bigger jobs, and the number of these bigger jobs caused the cron machine to slow down or stop functioning at times.  Before the cron switch, about 15% of my cron jobs never ran.  So, having cron jobs sent to the queue is a good thing.™  Jobs on the queue not being able to submit jobs is a different story.  Currently, it is penalizing those of us who were doing the right thing before the switch.  This problem has also been known for a week now.
Comment 5 Marc A. Pelletier 2014-05-10 07:59:34 UTC
Originally, it was not intended for jobs to be able to spawn jobs (the potential for a runaway "fork bomb" that would severely impact other users was too great).

Rather than turn that feature on, which has its own set of problems, I've made a workaround available: rather than use jsub in a cron job, you can use 'jlocal' which explicitly uses a local shell (and from which, therefore, you can run a shell script that does submit jobs).

I'm going to document it shortly, but it's already available for use.
Comment 6 bgwhite 2014-05-20 00:29:39 UTC
(In reply to Marc A. Pelletier from comment #5)

Any status update on documentation?  I can't use 'jlocal' as I keep getting:
   "/usr/bin/jlocal": No such file or directory
Comment 7 Betacommand 2014-05-20 00:33:03 UTC
Where are you using it? In your crontab or in your submitted jobs?
Comment 8 bgwhite 2014-05-20 00:58:49 UTC
On queue machines and tools-login.  On crontab and submitted jobs.
Comment 9 Betacommand 2014-05-20 01:01:48 UTC
That's your problem: jlocal should only exist on the -submit host. You basically have a wrapper script that you invoke and that submits other jobs. If you use jlocal in your crontab to start that script, it should then be able to submit jobs to the grid.
Comment 10 bgwhite 2014-05-20 01:06:39 UTC
That still doesn't solve my original question.  jlocal is not found anywhere.  I can't use it on any host as it is not found.  I can't use it on the submit host as it is not found.  Where is jlocal?
Comment 11 Betacommand 2014-05-20 01:08:34 UTC
jlocal is on the submit host. I use it daily and just got an email from a cron job using it ~2 minutes ago.
Comment 12 bgwhite 2014-05-20 17:32:35 UTC
(In reply to Betacommand from comment #11)
> jlocal is on the submit host. I use it daily and just got an email from a
> cron job using it ~2 minutes ago.

Could you provide an example of how you are doing this?  Then, my brain might have an 'ah ha' moment (doubtful).
Comment 13 Betacommand 2014-05-20 18:01:03 UTC
0 1 * * * jlocal python /data/project/betacommand-dev/svn_copy/email_logs.py
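To connect this crontab line to the wrapper-script approach from comment 9, here is a hedged sketch of what such a wrapper could look like. It is not the script referenced above: the job list, the script path in the crontab comment, and the scan script name are hypothetical. The only assumption about the tooling is that the wrapper is started by jlocal on the submit host, where `jsub` is available.

```python
import subprocess

# Hypothetical crontab entry that would start this wrapper via jlocal:
#   0 1 * * * jlocal python /path/to/submit_wrapper.py

# Hypothetical job list; each entry is a full jsub command line.
JOBS = [
    ["jsub", "./checkwiki_scan.sh", "enwiki"],
    ["jsub", "./checkwiki_scan.sh", "dewiki"],
]


def submit_all(commands, runner=subprocess.call):
    """Run each submission command and return the ones that exited non-zero."""
    failed = []
    for command in commands:
        if runner(command) != 0:
            failed.append(command)
    return failed


def main():
    """Entry point for the cron-started wrapper; returns a process exit code."""
    return 1 if submit_all(JOBS) else 0
```

The `runner` parameter exists only so the submission loop can be exercised without a grid; on the submit host the default `subprocess.call` would invoke `jsub` directly.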
