Last modified: 2014-02-17 19:40:50 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T60692, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 58692 - GWToolset Jobs are not properly picked-up by runJobsLoopService
Status: RESOLVED FIXED
Product: Wikimedia Labs
Classification: Unclassified
Component: Infrastructure
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://commons.wikimedia.beta.wmflabs...
Depends on:
Blocks:
Reported: 2013-12-19 16:26 UTC by dan
Modified: 2014-02-17 19:40 UTC (History)
6 users (show)

See Also:
Web browser: Google Chrome
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
rijks-10-items (17.31 KB, application/xml)
2013-12-19 22:53 UTC, dan

Description dan 2013-12-19 16:26:08 UTC
expected behavior
-----------------
when a gwtoolset batch job is created the runner should pick it up and run the gwtoolset* jobs.

actual behavior
---------------
• the MetadataJob goes into the queue.
• the MetadataJob is never picked up
• if another MetadataJob is added within approximately 5 minutes, that one is also added to the queue, but not picked up
• if another MetadataJob is added to the queue after approximately 5 minutes, the MetadataJobs waiting in the queue are picked up, run and their MediaFileJobs and CleanUp jobs are run. the new MetadataJob waits in the queue and is not picked up until yet another gwtoolset MetadataJob is created.

additional research by hashar
-----------------------------
the beta cluster shows gwtoolset* jobs pending:
 $ mwscript showJobs.php --wiki=commonswiki --group
 cirrusSearchUpdatePages: 0 queued; 7 claimed (0 active, 7 abandoned)
 gwtoolsetGWTFileBackendCleanupJob: 0 queued; 1 claimed (1 active, 0 abandoned)
 gwtoolsetUploadMediafileJob: 0 queued; 5 claimed (0 active, 5 abandoned)
 gwtoolsetUploadMetadataJob: 1 queued; 0 claimed (0 active, 0 abandoned)
 webVideoTranscode: 4 queued; 0 claimed (0 active, 0 abandoned)
 $

in this case a gwtoolsetUploadMetadataJob is queued; however, nextJobDB.php does not find it.

it looks like this is because nextJobDB.php calls JobQueueAggregator::singleton()->getAllReadyWikiQueues(), which does not list gwtoolset jobs.
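
To check quickly which gwtoolset job types are still waiting, the grouped showJobs.php output shown in the description can be parsed. The following is a minimal sketch in Python, assuming the output keeps exactly the line format pasted above; the helper names (parse_show_jobs, pending_types) are hypothetical and not part of MediaWiki.

```python
import re

# Sample output copied from the description above (prompt lines removed).
SAMPLE = """\
cirrusSearchUpdatePages: 0 queued; 7 claimed (0 active, 7 abandoned)
gwtoolsetGWTFileBackendCleanupJob: 0 queued; 1 claimed (1 active, 0 abandoned)
gwtoolsetUploadMediafileJob: 0 queued; 5 claimed (0 active, 5 abandoned)
gwtoolsetUploadMetadataJob: 1 queued; 0 claimed (0 active, 0 abandoned)
webVideoTranscode: 4 queued; 0 claimed (0 active, 0 abandoned)
"""

# Assumption: every line looks like
# "<type>: N queued; N claimed (N active, N abandoned)".
LINE_RE = re.compile(
    r"^\s*(?P<type>\w+): (?P<queued>\d+) queued; (?P<claimed>\d+) claimed "
    r"\((?P<active>\d+) active, (?P<abandoned>\d+) abandoned\)\s*$"
)


def parse_show_jobs(text):
    """Return {job_type: {"queued": n, "claimed": n, "active": n, "abandoned": n}}."""
    stats = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip shell prompts, blank lines, anything unrecognised
        counts = {k: int(v) for k, v in match.groupdict().items() if k != "type"}
        stats[match.group("type")] = counts
    return stats


def pending_types(stats, prefix="gwtoolset"):
    """Job types starting with `prefix` that still have queued jobs."""
    return sorted(
        job_type
        for job_type, counts in stats.items()
        if job_type.startswith(prefix) and counts["queued"] > 0
    )


stats = parse_show_jobs(SAMPLE)
print(pending_types(stats))  # ['gwtoolsetUploadMetadataJob']
```

For the sample above this reports gwtoolsetUploadMetadataJob as the only gwtoolset type with a queued job, which is the job nextJobDB.php fails to find.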
Comment 1 Gerrit Notification Bot 2013-12-19 16:34:54 UTC
Change 102675 had a related patch set uploaded by Dan-nl:
gwtoolset-runners

https://gerrit.wikimedia.org/r/102675
Comment 2 Gerrit Notification Bot 2013-12-19 18:46:05 UTC
Change 102675 abandoned by Dan-nl:
gwtoolset-runners

Reason:
per aaron’s confirmation, having the jobs listed in one runJobsLoopService statement is fine. we still need to resolve the issue of the job runner not picking up the gwtoolset jobs.

https://gerrit.wikimedia.org/r/102675
Comment 3 Antoine "hashar" Musso (WMF) 2013-12-19 21:25:54 UTC
Following a conversation with Aaron, that is apparently fixed by https://gerrit.wikimedia.org/r/#/c/102749/ Make executeReadyPeriodicTasks() notify the aggregator when jobs are released/recycled
Comment 4 Aaron Schulz 2013-12-19 21:35:40 UTC
(In reply to comment #3)
> Following a conversation with Aaron, that is apparently fixed by
> https://gerrit.wikimedia.org/r/#/c/102749/ Make executeReadyPeriodicTasks()
> notify the aggregator when jobs are released/recycled

I couldn't really reproduce problems like this report. That was just some related fix of something I noticed while looking at this.
Comment 5 dan 2013-12-19 22:53:33 UTC
Created attachment 14144 [details]
rijks-10-items

ssh-wikilabs
------------
1. ssh deployment-bastion.pmtpa.wmflabs
2. tail -f /data/project/logs/runJobs.log | grep 'gwtoolset'

gwtoolset form
--------------
1. go to http://commons.wikimedia.beta.wmflabs.org/w/index.php?title=Special:GWToolset&gwtoolset-form=metadata-detect
2. select mediawiki template artwork
3. place, Metadata Mappings/Dan-nl/Rijksmuseum.json, in the metadata mapping field.
4. use this attached xml file for the metadata file upload.
5. click the submit button
6. add a summary if you wish
7. click preview batch
8. click process batch
Comment 6 Gerrit Notification Bot 2013-12-20 16:02:17 UTC
Change 102955 had a related patch set uploaded by Dan-nl:
removing initial delay

https://gerrit.wikimedia.org/r/102955
Comment 7 Gerrit Notification Bot 2013-12-20 16:04:41 UTC
Change 102955 merged by jenkins-bot:
removing initial delay

https://gerrit.wikimedia.org/r/102955
Comment 8 dan 2013-12-20 16:51:44 UTC
• at Fri, 20 Dec 2013 10:31:18 GMT i set up a gwtoolset batch job. it never ran.

• at 2013-12-20 16:04:41 we uploaded a test patch to see if removing the
  initial MetadataJob delay would help.
  
• at 2013-12-20 16:09:38 added another gwtoolset batch job. that one kicked
  off the earlier batch job plus itself.
  
• at 2013-12-20 16:16:04 added another gwtoolset batch job that had a throttle
  of 3 so that it would be forced to create another MetadataJob with a delay.
  the initial MetadataJob ran and ran the first 3 MediaFileJobs.
  
  there is another MetadataJob in the queue that can now be seen with 
  mwscript showJobs.php --wiki=commonswiki --group, but it's never picked up.
  
  so the question now is why does the puppet runner script not pick up the
  delayed job?
Comment 9 Antoine "hashar" Musso (WMF) 2013-12-20 20:04:57 UTC
Could it be that JobQueueAggregator::singleton()->getAllReadyWikiQueues(), used by nextJobDB.php, does not return delayed jobs?
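
The hypothesis in this comment can be illustrated with a toy model. This is only a sketch in Python and not MediaWiki's JobQueue code: ToyAggregator, ToyQueue and run_once are hypothetical names, and the model assumes that the aggregator is notified on every push, that a runner only visits queues the aggregator lists as ready, and that a queue found empty is removed from that list. Under those assumptions a delayed job that becomes ready later is never announced unless the release step also notifies the aggregator.

```python
class ToyAggregator:
    """Tracks which job types are believed to have ready jobs."""

    def __init__(self):
        self.ready = set()

    def notify_ready(self, job_type):
        self.ready.add(job_type)

    def notify_empty(self, job_type):
        self.ready.discard(job_type)


class ToyQueue:
    """A queue with ready and delayed jobs (hypothetical, for illustration)."""

    def __init__(self, job_type, aggregator, notify_on_release):
        self.job_type = job_type
        self.aggregator = aggregator
        self.notify_on_release = notify_on_release
        self.ready = []
        self.delayed = []  # list of (release_time, job)

    def push(self, job, now, delay=0):
        if delay > 0:
            self.delayed.append((now + delay, job))
        else:
            self.ready.append(job)
        # Assumption: the aggregator hears about every push, delayed or not.
        self.aggregator.notify_ready(self.job_type)

    def release_due(self, now):
        """Move delayed jobs whose time has come into the ready list."""
        due = [job for release_time, job in self.delayed if release_time <= now]
        self.delayed = [(t, job) for t, job in self.delayed if t > now]
        self.ready.extend(due)
        if due and self.notify_on_release:
            self.aggregator.notify_ready(self.job_type)


def run_once(queue, aggregator, now):
    """One runner pass: only queues listed by the aggregator are visited."""
    queue.release_due(now)
    ran = []
    if queue.job_type in aggregator.ready:
        while queue.ready:
            ran.append(queue.ready.pop(0))
        aggregator.notify_empty(queue.job_type)
    return ran


def simulate(notify_on_release):
    aggregator = ToyAggregator()
    queue = ToyQueue("gwtoolsetUploadMetadataJob", aggregator, notify_on_release)
    log = []
    queue.push("metadata-1", now=0, delay=300)
    log.append(run_once(queue, aggregator, now=10))   # job still delayed
    log.append(run_once(queue, aggregator, now=400))  # job is ready by now
    queue.push("metadata-2", now=410, delay=300)
    log.append(run_once(queue, aggregator, now=420))  # new push wakes the runner
    return log


print(simulate(notify_on_release=False))  # [[], [], ['metadata-1']]
print(simulate(notify_on_release=True))   # [[], ['metadata-1'], []]
```

Without the notification on release, the first job only runs after a second job is pushed, which matches the behavior described in this report. With it, the job runs on the first pass after its delay expires.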
Comment 10 dan 2013-12-31 15:34:03 UTC
• https://gerrit.wikimedia.org/r/#/c/103524/ and possibly other commits
  seem to have resolved the job runner issue; i was able to successfully
  test a small dataset on beta.
• will wait for david haskiya to try a test dataset on production before
  closing this bug.
Comment 11 Andre Klapper 2014-02-17 19:30:13 UTC
(In reply to dan from comment #10)
> • will wait for david haskiya to try a test dataset on production before
>   closing this bug.

dan: Any way to follow up on this?
Comment 12 dan 2014-02-17 19:40:50 UTC
• david was able to run an initial upload on production, so i think we
  can consider this ticket closed.

  http://commons.wikimedia.org/wiki/Category:GWToolset_Batch_Upload
