Last modified: 2014-10-19 08:55:43 UTC
Intention: I was trying to submit a category.py move task on cs.wiki through jstart to gridengine from my hypobot tool at Labs. As the script name says, I was trying to move a category on cs.wiki (replacing the hyphen with a dash in the category name). Running pywikibot's category.py locally goes through OK.

Steps to Reproduce:
1. jsub python /shared/pywikipedia/core/scripts/category.py move -from:"Hudební skupiny 1970-1979" -to:"Hudební skupiny 1970–1979"
2. cat python2.err
3. The log contains: WARNING: Moving category page 'Kategorie:Hudební' requested, but the page doesn't exist.

The same happens with single quotation marks ('Hudební skupiny 1970-1979' instead of "Hudební skupiny 1970-1979").

Actual Results: The job did not get done.
Expected Results: The category to be moved.
Reproducible: Always

The odd part is that if I submit the job without spaces, for example:

category.py move -from:Hudební_skupiny_1970-1979 -to:Hudební_skupiny_1970–1979

then the command line is parsed correctly, but the engine still gives the error "WARNING: Moving category page 'Kategorie:Hudební skupiny 1970-1979' requested, but the page doesn't exist." That is strange, since the page exists and I can move it locally. I would rather not do that, because the category contains over 1000 pages and I would prefer to run the job through Labs than to leave my PC running all night.
The problem with quotation marks appeared before (in May) when I tried to submit a replace.py task, but I gave up on it and ran the task locally.
The problem is that gridengine will perform an arbitrary, and sometimes variable, number of shell substitutions over the commandline, making the use of quotes or escapes (in any combination) on the command line problematic at the best of times. This problem is fundamental to gridengine.

Sometimes, double quoting or escaping internal quotes can solve the issue for a specific set of command line values, but this varies and is sometimes hard to predict (for instance, the presence of some shell metacharacter within the /quoted/ string causes gridengine to invoke an extra '/bin/sh -c' at the remote end, stripping one level of extra quoting).

The only reliable ways to circumvent that issue are to either (a) create a small shell script that contains the final invocation with adequate quoting, and invoke /that/ through jsub/qsub instead, or (b) pass arguments to the job in some other manner than the command line (if pywikibot can accept arguments from a file, for instance, this could be used instead).
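As an illustration of workaround (a), here is a minimal sketch of a wrapper, written in Python 3 rather than shell, that keeps the arguments inside the file so nothing on the jsub command line needs quoting. The script path and category names are taken from the report. The file itself, the `build_command` helper, and the `--run` switch are hypothetical and not part of pywikibot or jsub.

```python
#!/usr/bin/env python3
"""Hypothetical wrapper: submit this file through jsub instead of the raw command."""
import subprocess
import sys

# Path taken from the report; adjust if the checkout lives elsewhere.
CATEGORY_SCRIPT = "/shared/pywikipedia/core/scripts/category.py"


def build_command(old_name, new_name, interpreter="python"):
    """Return the argv list for a category move; each item is exactly one argument."""
    return [
        interpreter,
        CATEGORY_SCRIPT,
        "move",
        "-from:" + old_name,
        "-to:" + new_name,
    ]


if __name__ == "__main__":
    command = build_command("Hudební skupiny 1970-1979",
                            "Hudební skupiny 1970–1979")
    if "--run" in sys.argv[1:]:
        # A list with the default shell=False is handed to the program as-is,
        # so no shell re-parses the spaces or the non-ASCII characters.
        sys.exit(subprocess.call(command))
    # Dry run by default; ascii() keeps the output printable under any locale.
    print(ascii(command))
```

Run without arguments, the file only prints the argument list it would use. With `--run` it executes the move. Because the spaces live inside list items, the number of shell passes on the grid side no longer matters for them.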
Try using: jsub python /shared/pywikipedia/core/scripts/category.py move -from:Hudební_skupiny_1970-1979 -to:Hudební_skupiny_1970–1979
@Marc, is this bug logged against gridengine? If so, can you add the URL here?
(In reply to John Mark Vandenberg from comment #3)
> Try using:
>
> jsub python /shared/pywikipedia/core/scripts/category.py move -from:Hudební_skupiny_1970-1979 -to:Hudební_skupiny_1970–1979

This gives "WARNING: Moving category page 'Kategorie:Hudební skupiny 1970-1979' requested, but the page doesn't exist."

That is not true: if you copy that page name and paste it into cs.wiki, it takes you to an existing page (https://cs.wikipedia.org/wiki/Kategorie:Hudebn%C3%AD_skupiny_1970-1979).
(In reply to John Mark Vandenberg from comment #3)
> Try using:
>
> jsub python /shared/pywikipedia/core/scripts/category.py move -from:Hudební_skupiny_1970-1979 -to:Hudební_skupiny_1970–1979

With -debug and -verbose I get this .err file:

SITE VERSION: 1.25wmf2
MESSAGES: unknown (not logged in)
=== === === === === === === === === === === === === ===
Pywikibot rad0c47505aac26115b029acfa40908a3d5461c29
Python 2.7.3 (default, Feb 27 2014, 19:58:35) [GCC 4.6.3]
Found 1 wikipedia:cs processes running, including this one.
WARNING: Moving category page 'Kategorie:Hudební skupiny 1970-1979' requested, but the page doesn't exist.
Moving category talk page 'Diskuse ke kategorii:Hudební skupiny 1970-1979' requested, but the page doesn't exist.
Dropped throttle(s).
Waiting for 1 network thread(s) to finish. Press ctrl-c to abort
All threads finished.
OK, there are no quotes in that command line (which is just a more complete version of what was stated in the last paragraph of comment 0), so I assume that the reason for WONTFIXing in comment 2 is not relevant to this bug.
Running

python /shared/pywikipedia/core/scripts/category.py move -from:Hudební_skupiny_1970-1979 -to:Hudební_skupiny_1970–1979 -simulate -verbose -debug

produces this line in the .err log:

COMMAND: ['/shared/pywikipedia/core/scripts/category.py', 'move', '-family:wikipedia', '-lang:cs', '-from:Hudebn\xc3\xad_skupiny_1970-1979', '-to:Hudebn\xc3\xad_skupiny_1970\xe2\x80\x931979', '-simulate', '-debug', '-verbose']
Interesting! gridengine is also changing the unicode arguments, *and* _appears_ to be changing the output? Here is what I see when I run

$ python pwb.py category.py 'move' '-family:wikipedia' '-lang:cs' '-from:Hudebn\xc3\xad_skupiny_1970-1979' '-to:Hudebn\xc3\xad_skupiny_1970\xe2\x80\x931979' '-simulate' '-debug' '-verbose'
....
WARNING: Moving category page 'Kategorie:Hudebn\xc3\xad skupiny 1970-1979' requested, but the page doesn't exist.

which is not the same as the reported output from jsub:

WARNING: Moving category page 'Kategorie:Hudební skupiny 1970-1979' requested, but the page doesn't exist.
Ugh - ignore comment 9. I see the same log line from my workstation when I run

$ python pwb.py category.py move -from:Hudební_skupiny_1970-1979 -to:Hudební_skupiny_1970–1979 -simulate -verbose -debug
...
COMMAND: ['category.py', 'move', '-family:wikipedia', '-lang:cs', '-from:Hudebn\xc3\xad_skupiny_1970-1979', '-to:Hudebn\xc3\xad_skupiny_1970\xe2\x80\x931979', '-simulate', '-debug', '-verbose', '-log']
...

So I still have no idea why it doesn't work on gridengine, as that command works elsewhere.
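For reference, the escape sequences in the COMMAND line are simply the UTF-8 bytes of "í" and "–" as a Python 2 byte-string repr prints them, so the logged arguments are the intended ones. A short check, written for Python 3 (`py2_style_repr` is a hypothetical helper that mimics that log format, not a pywikibot function):

```python
def py2_style_repr(text):
    """Mimic how Python 2 shows a UTF-8 encoded argument in a list repr."""
    # repr(bytes) in Python 3 is b'...'; dropping the leading "b" gives the
    # same form that appears in the COMMAND log line.
    return repr(text.encode("utf-8"))[1:]


old_arg = "-from:Hudební_skupiny_1970-1979"
new_arg = "-to:Hudební_skupiny_1970–1979"

# "í" is two bytes in UTF-8 and the en dash is three.
assert "í".encode("utf-8") == b"\xc3\xad"
assert "–".encode("utf-8") == b"\xe2\x80\x93"

# The logged form matches the arguments as typed with real characters.
assert py2_style_repr(old_arg) == r"'-from:Hudebn\xc3\xad_skupiny_1970-1979'"
assert py2_style_repr(new_arg) == r"'-to:Hudebn\xc3\xad_skupiny_1970\xe2\x80\x931979'"

# Typing the escapes inside single quotes in a shell passes literal
# backslashes instead, which is a different argument.
typed_literally = r"-from:Hudebn\xc3\xad_skupiny_1970-1979"
assert typed_literally != old_arg
```

The last assertion is why pasting the escaped form into a shell produced a different page title in the warning.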
(In reply to Marc A. Pelletier from comment #2)
> The problem is that gridengine will perform an arbitrary, and sometimes variable, number of shell substitutions over the commandline, making the use of quotes or escapes (in any combination) on the command line problematic at the best of times. This problem is fundamental to gridengine.
>
> Sometimes, double quoting or escaping internal quotes can solve the issue for a specific set of command line values, but this varies and is sometimes hard to predict (for instance, the presence of some shell metacharacter within the /quoted/ string causes gridengine to invoke an extra '/bin/sh -c' at the remote end, stripping one level of extra quoting).
>
> The only reliable ways to circumvent that issue are to either (a) create a small shell script that contains the final invocation with adequate quoting, and invoke /that/ through jsub/qsub instead, or (b) pass arguments to the job in some other manner than the command line (if pywikibot can accept arguments from a file, for instance, this could be used instead).

I tried to submit the command through a script. I made an executable .sh file containing this line:

python /shared/pywikipedia/core/scripts/category.py move -from:Hudební_skupiny_1970-1979 -to:Hudební_skupiny_1970–1979 -simulate -verbose -debug

I submitted this script through jsub. The .err file after the script execution contained:

...
COMMAND: ['/shared/pywikipedia/core/scripts/category.py', 'move', '-from:Hudebn\xc3\xad_skupiny_1970-1979', '-to:Hudebn\xc3\xad_skupiny_1970\xe2\x80\x931979', '-simulate', '-verbose', '-debug']
DATE: 2014-10-19 08:35:52.070622 UTC
...
Moving category page 'Kategorie:Hudební skupiny 1970-1979' requested, but the page doesn't exist.
Moving category talk page 'Diskuse ke kategorii:Hudební skupiny 1970-1979' requested, but the page doesn't exist.

Any further ideas or suggestions?
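Since the COMMAND lines are identical locally and on the grid, one way to narrow this down is to compare the environment the job runs in. That assumes the difference lies in locale or encoding settings, which nothing in this report confirms. The sketch below is a diagnostic only, written for Python 3, whereas the job above runs Python 2.7.3, where sys.argv already holds raw bytes. `encoding_report` is a hypothetical helper.

```python
import locale
import os
import sys


def encoding_report(argv):
    """Collect settings that influence how non-ASCII arguments are decoded."""
    return {
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "preferred_encoding": locale.getpreferredencoding(False),
        "filesystem_encoding": sys.getfilesystemencoding(),
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        # Raw bytes of each argument, so any difference shows up unambiguously.
        "argv_bytes": [os.fsencode(arg) for arg in argv],
    }


if __name__ == "__main__":
    for key, value in encoding_report(sys.argv[1:]).items():
        # ascii() keeps the report printable even under an ASCII-only locale.
        print("%s: %s" % (key, ascii(value)))
```

Running the same file once locally and once through jsub, with the same -from:/-to: arguments, and comparing the two outputs line by line would show whether the grid job sees a different locale or different argument bytes.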