Last modified: 2014-05-29 17:23:20 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T49269, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 47269 - Jobs idling in production, no way to kill them or relaunch them
Status: RESOLVED WONTFIX
Product: Analytics
Classification: Unclassified
Component: Wikimetrics
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2013-04-16 00:55 UTC by Dario Taraborelli
Modified: 2014-05-29 17:23 UTC (History)


Description Dario Taraborelli 2013-04-16 00:55:20 UTC
Kirsten reports that some jobs she started in production with very small cohorts (like test2) are idling after hours. There's currently no way to kill them or restart them from the UI as they are still marked as pending.

I can't check what's going wrong with these requests on stat1001, but a query like this one, when run on the dev instance, takes less than 10 seconds to complete:

http://127.0.0.1:4000/cohorts/test2/threshold?project=enwiki&t=154&n=1&group=REGISTRATION&refresh

See also: https://bugzilla.wikimedia.org/show_bug.cgi?id=47236
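
For anyone re-running the comparison, the dev-instance request quoted above can be broken into its parts with the Python standard library. This is only a sketch: the function name `describe_request` is hypothetical, and reading the path as a cohort name (`test2`) followed by a metric name (`threshold`) is an assumption based on the description, not on documented API behavior.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied verbatim from the bug description.
URL = ("http://127.0.0.1:4000/cohorts/test2/threshold"
       "?project=enwiki&t=154&n=1&group=REGISTRATION&refresh")


def describe_request(url):
    """Split a request URL into host, path segments, and query parameters.

    Hypothetical helper for illustration; it does not contact the server.
    """
    parts = urlsplit(url)
    # Non-empty path segments, e.g. ['cohorts', 'test2', 'threshold'].
    segments = [s for s in parts.path.split("/") if s]
    # keep_blank_values=True retains the valueless "refresh" flag,
    # which parse_qs would otherwise drop.
    query = parse_qs(parts.query, keep_blank_values=True)
    params = {key: values[0] for key, values in query.items()}
    return {"host": parts.netloc, "segments": segments, "params": params}


if __name__ == "__main__":
    info = describe_request(URL)
    print(info["host"])
    print(info["segments"])
    for key, value in info["params"].items():
        print(key, "=", repr(value))
```

Listing the parameters this way makes it easier to check that a request submitted to the production instance is identical to the one that completes quickly on the dev instance.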
Comment 1 Dario Taraborelli 2013-04-16 01:14:14 UTC
Possibly related to: https://github.com/rfaulkner/E3_analysis/issues/86
Comment 2 Dario Taraborelli 2013-04-16 17:08:07 UTC
Confirmed: the prod instance currently doesn't accept new requests; all new jobs (including very short ones) are stuck in the queue.
Comment 3 Dario Taraborelli 2013-04-16 17:43:34 UTC
FYI: Andrew restarted the server and new requests are now completing successfully, but we still need to find the cause of this behavior, which doesn't seem to affect the dev instance on stat1 or local instances.
Comment 4 Ori Livneh 2013-06-10 03:25:11 UTC
Lowering priority.
Comment 6 Andre Klapper 2014-05-29 17:23:20 UTC
[moving tickets as per bug 65903]
