Last modified: 2014-04-22 19:36:44 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T66095, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 64095 - "webservice stop" leaves blocking php-cgi processes behind
Status: RESOLVED DUPLICATE of bug 63878
Product: Wikimedia Labs
Classification: Unclassified
Component: tools
Version: unspecified
Hardware: All All
Importance: High blocker
Target Milestone: ---
Assigned To: Marc A. Pelletier
URL: http://tools.wmflabs.org/commonshelper/
Depends on:
Blocks:
Reported: 2014-04-18 16:57 UTC by Magnus Manske
Modified: 2014-04-22 19:36 UTC (History)

Description Magnus Manske 2014-04-18 16:57:27 UTC
The webservice for my "commonshelper" tool is running, but I can't load the web page(s). Examples:

http://tools.wmflabs.org/commonshelper/index.php (tool)
http://tools.wmflabs.org/commonshelper/index_test.php (simple test page)

The pages are just loading "forever".

* removed access.log as per https://bugzilla.wikimedia.org/show_bug.cgi?id=58931
* did webservice restart
* did webservice stop / webservice start

I got similar bug reports for multiple tools since yesterday, which were resolved by restarting the web service, but apparently not this one.
Comment 1 zhuyifei1999 2014-04-22 12:37:46 UTC
Related to http://lists.wikimedia.org/pipermail/labs-l/2014-April/002305.html ?
Comment 2 metatron 2014-04-22 14:00:13 UTC
I've seen this problem before. The lighttpd webservice stops, but old php-cgi processes remain. "webservice start" then starts /one/ lighttpd process, but can't start new php-cgi's. So plain html or py is served just fine, while php requests are "stuck".

This is the output from webgrid for commonshelper:

tools-webgrid-01: (13:51:40)
  608 tools.co  20   0 48668 2116 1312 S    0  0.0   0:00.03 lighttpd
11144 tools.co  20   0  281m  11m 7748 S    0  0.1   0:00.03 php-cgi
11146 tools.co  20   0  288m  11m 4764 S    0  0.1   2:04.32 php-cgi
11147 tools.co  20   0  288m  11m 4680 S    0  0.1   0:36.61 php-cgi
11148 tools.co  20   0  288m  11m 4760 S    0  0.1   1:29.41 php-cgi
11149 tools.co  20   0  288m  11m 4756 S    0  0.1   2:42.74 php-cgi 

tools-webgrid-02: (13:51:40)
19567 tools.co  20   0  281m  11m 7764 S    0  0.1   0:00.01 php-cgi
19575 tools.co  20   0  283m 9844 4320 S    0  0.1   0:35.07 php-cgi
19576 tools.co  20   0  283m 9836 4312 S    0  0.1   0:01.24 php-cgi
19577 tools.co  20   0  283m 9912 4272 S    0  0.1   0:34.99 php-cgi
19578 tools.co  20   0  283m 9796 4272 S    0  0.1   0:35.88 php-cgi
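
One way to spot such leftovers is to look for php-cgi processes that no longer have lighttpd as their parent. The sketch below assumes that workers orphaned by a killed lighttpd are re-parented to PID 1, which is typical on Linux but not confirmed for the webgrid hosts. `list_orphans` is a hypothetical helper name, not part of the Tools environment.

```shell
#!/bin/bash
# list_orphans NAME: read "PID PPID COMMAND" lines on stdin and print the
# PIDs of processes called NAME whose parent is PID 1.
# Hypothetical helper; assumes orphans are re-parented to PID 1.
list_orphans() {
    awk -v name="$1" '$2 == 1 && $3 == name { print $1 }'
}

# On a grid host, feed it the process table of the tool account
# (account name taken from this bug report):
#   ps -U tools.commonshelper -o pid=,ppid=,comm= | list_orphans php-cgi
```

Only the top-level orphan is printed; php-cgi children forked by it still have that process as their parent and are not listed.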


I figured out this workaround. Make this a script & execute:

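#!/bin/bash
# Stop the lighttpd job, then force-kill any php-cgi processes it left behind.
webservice stop
sleep 5
# Leftover php-cgi processes can sit on either web grid host.
ssh tools-webgrid-01 'pkill -9 -U tools.commonshelper php-cgi'
ssh tools-webgrid-02 'pkill -9 -U tools.commonshelper php-cgi'
sleep 5
webservice start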
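
The script above goes straight to SIGKILL. A gentler variant, sketched here as an assumption rather than a tested replacement, sends SIGTERM first and falls back to SIGKILL only for processes that are still alive after a grace period. `stop_pids` is a hypothetical helper that works on PIDs; on a grid host the PIDs could come from `pgrep -U tools.commonshelper php-cgi`.

```shell
#!/bin/bash
# stop_pids GRACE PID...: send SIGTERM, wait up to GRACE seconds, then
# send SIGKILL to whatever is still running. Hypothetical helper.
stop_pids() {
    local grace=$1; shift
    local pid waited=0 alive

    kill -TERM "$@" 2>/dev/null || true

    while [ "$waited" -lt "$grace" ]; do
        alive=0
        for pid in "$@"; do
            kill -0 "$pid" 2>/dev/null && alive=1
        done
        [ "$alive" -eq 0 ] && return 0
        sleep 1
        waited=$((waited + 1))
    done

    # Grace period is over: force-kill the survivors.
    for pid in "$@"; do
        kill -0 "$pid" 2>/dev/null && kill -KILL "$pid" 2>/dev/null
    done
    return 0
}

# Possible use on a grid host (not run here):
#   stop_pids 5 $(pgrep -U tools.commonshelper php-cgi)
```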
Comment 3 Tim Landscheidt 2014-04-22 14:38:47 UTC
metatron is correct; I recently had to purge some old processes (cf. [[wikitech:Nova Resource:Tools/SAL#April 10]]).

To fix Magnus' issue, I killed the blocking php-cgi processes; the tool should be working again.

The underlying problem is that "webservice stop" uses qdel, which by default sends SIGKILL. That kills the lighttpd process and its workers, but not the spawned php-cgi processes.

Testing shows that on SIGTERM lighttpd correctly ends its workers and the spawned php-cgi processes.
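
The difference can be reproduced without lighttpd. In this sketch a small shell "supervisor" stands in for lighttpd and a `sleep` stands in for php-cgi; both stand-ins and the `run_case` helper are assumptions made for illustration, not the actual webservice code. The supervisor stops its child when it receives SIGTERM, but SIGKILL cannot be trapped, so the child outlives it.

```shell
#!/bin/bash
# run_case SIGNAL: start a supervisor with one child, signal the
# supervisor, and report whether the child is still running afterwards.
run_case() {
    local sig=$1 dir sup child result
    dir=$(mktemp -d)

    # Supervisor: starts a child and, on SIGTERM, stops and reaps it
    # before exiting. It gets no chance to do so on SIGKILL.
    bash -c '
        sleep 300 &
        child=$!
        trap "kill $child; wait $child; exit 0" TERM
        echo "$child" > "$1/child.pid"
        wait
    ' supervisor "$dir" >/dev/null 2>&1 &
    sup=$!

    # Wait until the supervisor has installed its trap and written the PID.
    while [ ! -s "$dir/child.pid" ]; do sleep 0.1; done
    child=$(cat "$dir/child.pid")

    kill "-$sig" "$sup"
    wait "$sup" 2>/dev/null || true

    if kill -0 "$child" 2>/dev/null; then
        result="child survived"
        kill "$child"    # clean up the orphan
    else
        result="child stopped"
    fi
    rm -rf "$dir"
    echo "$sig: $result"
}

run_case TERM
run_case KILL
```

On a typical Linux system this should report that the child stopped in the TERM case and survived in the KILL case, which matches the behaviour described for lighttpd and php-cgi.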

I recently filed bug #61102 to use SIGTERM for the general case of jsub; the same logic applies to this bug as well.
Comment 4 Magnus Manske 2014-04-22 19:09:36 UTC
Thanks Tim, metatron, it works again!
Comment 5 Tim Landscheidt 2014-04-22 19:17:23 UTC
It works for now :-), but the general problem hasn't been solved yet.
Comment 6 Tim Landscheidt 2014-04-22 19:36:44 UTC
Ha!  I knew I had jotted down something about the problem earlier.

*** This bug has been marked as a duplicate of bug 63878 ***
