Last modified: 2014-09-27 02:01:56 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T71428, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 69428 - Jobrunner will fail to spawn jobs on HHVM
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: JobRunner
Version: unspecified
Hardware: All All
Importance: High blocker
Target Milestone: ---
Assigned To: Nobody - You can work on this!
: hhvm
Depends on:
Blocks:
Reported: 2014-08-12 09:34 UTC by Giuseppe Lavagetto
Modified: 2014-09-27 02:01 UTC (History)
CC List: 4 users



Attachments
simple script that reproduces the issue in production. (380 bytes, application/x-php)
2014-08-12 09:34 UTC, Giuseppe Lavagetto

Description Giuseppe Lavagetto 2014-08-12 09:34:56 UTC
Created attachment 16174: simple script that reproduces the issue in production.

When running on HHVM, the jobrunner service (configured to use fcgi I suppose) fails to spawn curl requests with the following errors:


[Tue Aug 12 09:25:36 2014] [hphp] [12782:7f0b86813700:0:033896] [] 
Warning: fork failed - Cannot allocate memory in /srv/deployment/jobrunner/jobrunner/redisJobRunnerService on line 933
[Tue Aug 12 09:25:36 2014] [hphp] [12782:7f0b86813700:0:033897] [] 
Notice: Undefined index: 1 in /srv/deployment/jobrunner/jobrunner/redisJobRunnerService on line 935
[Tue Aug 12 09:25:36 2014] [hphp] [12782:7f0b86813700:0:033898] [] 
Notice: Undefined index: 2 in /srv/deployment/jobrunner/jobrunner/redisJobRunnerService on line 936
[Tue Aug 12 09:25:36 2014] [hphp] [12782:7f0b86813700:0:033899] [] 
Notice: Undefined index: 0 in /srv/deployment/jobrunner/jobrunner/redisJobRunnerService on line 938
[Tue Aug 12 09:25:36 2014] [hphp] [12782:7f0b86813700:0:033900] [] 
Warning: Not a valid stream resource in /srv/deployment/jobrunner/jobrunner/redisJobRunnerService on line 938
2014-08-12T09:25:36+0000: Could not spawn process in loop 0: curl -XPOST -s -a 'http://127.0.0.1:9002/rpc/RunJobs.php?wiki=glkwiki&type=ChangeNotification&maxtime=60&maxmem=300M'

I tried various tweaks (such as raising the memory limit in both the jobrunner script and HHVM), but none of them worked around this.

This seems to be a general problem with HHVM as we have configured it, by the way: I wrote a small script that simply uses proc_open to spawn a curl request for the enwiki main page, and it produces the same error (see attachment).
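
For readers without access to the attachment, here is a minimal sketch of what a proc_open-based spawn with an explicit failure check might look like. It is not the attached script and not the jobrunner's code; the helper name `spawnCommand` and the shape of its return value are hypothetical. The "Undefined index" notices in the log above are consistent with the pipes array being read after proc_open has already failed, so the sketch checks the return value before touching the pipes.

```php
<?php
// Hypothetical helper for illustration only (not the attached repro script).
// Spawns a shell command via proc_open and returns its output, or reports
// failure without touching $pipes when the process could not be started.
function spawnCommand($cmd) {
    $spec = array(
        0 => array('pipe', 'r'), // child's stdin
        1 => array('pipe', 'w'), // child's stdout
        2 => array('pipe', 'w'), // child's stderr
    );
    $pipes = array();
    $proc = proc_open($cmd, $spec, $pipes);
    if (!is_resource($proc)) {
        // proc_open returned false (for example, fork failed);
        // $pipes has no usable entries, so do not index into it.
        return array('ok' => false, 'stdout' => '', 'stderr' => '', 'status' => -1);
    }
    fclose($pipes[0]); // nothing to send to the child
    // Reading stdout fully before stderr is fine for small outputs;
    // a child that writes a lot to stderr would need stream_select().
    $stdout = stream_get_contents($pipes[1]);
    $stderr = stream_get_contents($pipes[2]);
    fclose($pipes[1]);
    fclose($pipes[2]);
    $status = proc_close($proc); // exit code of the child
    return array('ok' => true, 'stdout' => $stdout, 'stderr' => $stderr, 'status' => $status);
}
```

A call such as `spawnCommand("curl -s 'http://example.org/'")` (placeholder URL) would then return `ok => false` instead of emitting notices when the fork cannot be performed.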
Comment 1 Giuseppe Lavagetto 2014-08-12 09:36:54 UTC
This happens with our packages, of course.
Comment 2 Aaron Schulz 2014-08-13 21:01:23 UTC
Not seeing this with:

sudo -u apache /usr/bin/php /srv/deployment/jobrunner/jobrunner/redisJobRunnerService --config-file=/etc/jobrunner/jobrunner.conf --verbose

Also, running some of the curl commands it does gives normal, expected, JSON replies.
Comment 3 Filippo Giunchedi 2014-08-15 08:59:00 UTC
22:40  <godog> btw from the issue above there we got the core dumped on mw1053:/tmp via the usual script
22:42  <godog> it looks like this too http://ganglia.wikimedia.org/latest/?r=day&cs=8%2F14%2F2014+5%3A41&ce=8%2F14%2F2014+21%3A13&c=Jobrunners+eqiad&h=mw1053.eqiad.wmnet&tab=m&vn=&hide-hf=false&mc=2&z=medium&metric_group=ALLGROUPS

If we don't get any specific clue from the core file about what was going on, we could try disabling some job types and see what that does.
