Last modified: 2014-08-06 23:57:06 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T55800, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 53800 - mediawiki & mediawiki/job cgroups creation should be moved to limit.sh
Status: REOPENED
Product: MediaWiki
Classification: Unclassified
Component: File management
Version: 1.22.0
Hardware: All All
Importance: Low normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2013-09-05 14:24 UTC by Faidon Liambotis
Modified: 2014-08-06 23:57 UTC (History)

Description Faidon Liambotis 2013-09-05 14:24:58 UTC
We currently confine image scaling jobs in cgroups. We have an upstart script in puppet (/modules/mediawiki/files/cgroup/mw-cgroup.conf) that basically does:

pre-start script
    mkdir -p /sys/fs/cgroup/memory/mediawiki
    mkdir -m 0777 /sys/fs/cgroup/memory/mediawiki/job
    echo "/usr/local/bin/cgroup-mediawiki-clean" > /sys/fs/cgroup/memory/release_agent
end script

When cgroup-bin gets reconfigured, e.g. during an upgrade, the cgroups go away (that looks like a bug of its own?) and the upstart job "mw-cgroup" is never re-run, since it is already in the "started" upstart state.

In the meantime, thumbnailing jobs fail since they can't create their own job cgroup as the parent hierarchy (mediawiki/job) doesn't exist.

Although we could do all kinds of upstart tricks (stop on cgconfig stop, for example), I can't see a reason why limit.sh can't check for the existence of mediawiki & mediawiki/job and, if they don't exist, create them itself.

This would nicely solve this and it'd be far more resilient.

Note that the above issue produced a complete thumbnail outage for the past hour or so and it is bound to happen again on the next cgroup-bin upgrade.
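
The check proposed above could be sketched roughly as follows. This is only an illustration of the idea, not the actual limit.sh: the function name ensure_mw_cgroups and the CGROUP_ROOT variable are hypothetical, and it assumes the calling user is allowed to create directories under the memory controller mount (which normally requires root).

```shell
#!/bin/sh
# Hypothetical helper, not taken from limit.sh.
# CGROUP_ROOT is an illustrative variable; the default is the path
# used by the upstart pre-start script quoted above.
CGROUP_ROOT="${CGROUP_ROOT:-/sys/fs/cgroup/memory}"

ensure_mw_cgroups() {
    parent="$CGROUP_ROOT/mediawiki"
    job="$parent/job"

    # Create the parent group only if it is missing.
    if [ ! -d "$parent" ]; then
        mkdir -p "$parent" || return 1
    fi

    # Create the job group only if it is missing, and make it
    # world-writable so unprivileged jobs can add their own children.
    if [ ! -d "$job" ]; then
        mkdir -p "$job" || return 1
        chmod 0777 "$job" || return 1
    fi
}
```

A script would call ensure_mw_cgroups once before creating its own per-job cgroup. The sketch does not register the release agent, which the upstart job also does.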
Comment 1 Jan Gerber 2013-09-06 06:37:48 UTC
one reason this is an upstart script is that it's run as root.
can you also restart it on the videoscalers? they are also out.
Comment 2 Gerrit Notification Bot 2013-09-06 06:50:23 UTC
Change 83067 had a related patch set uploaded by J:
restart mw-cgroup on cgconfig restart

https://gerrit.wikimedia.org/r/83067
Comment 3 Gerrit Notification Bot 2013-09-06 13:36:55 UTC
Change 83067 merged by Faidon Liambotis:
restart mw-cgroup on cgconfig restart

https://gerrit.wikimedia.org/r/83067
Comment 4 Faidon Liambotis 2013-09-07 00:03:49 UTC
Sure, I guess this works too.
Comment 5 Tim Starling 2013-09-10 01:58:39 UTC
Maybe it would be better to use cgconfig.conf for this? It's better to use the standard configuration system than to start a wheel war with it, right?
Comment 6 Jan Gerber 2013-09-10 09:35:15 UTC
cgconfig.conf is not flexible enough to accommodate the current setup.
if it's possible to rework the cgroup usage to fit within the options cgconfig.conf allows,
moving it to that would be an option.

cgconfig.conf can be used to limit the overall resources for a group,
but not to have per process limits within a group.
afaik it also does not provide an option to install a release agent.

given those limitations, having an upstart job that sets up the cgroups, as we do now, seems to be the best option. with the merged change, restarts are also no longer a problem.
not sure i would call that a wheel war - it's more that there was a bug in mw-cgroup.conf that got fixed.
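
to illustrate what "per process limits within a group" means here, a rough sketch follows. it is not the real limit.sh: run_limited and CGROUP_ROOT are hypothetical names, and it assumes the cgroup v1 memory controller with its memory.limit_in_bytes and tasks files, and that mediawiki/job already exists and is writable.

```shell
#!/bin/sh
# Hypothetical wrapper, not taken from limit.sh.
# CGROUP_ROOT is an illustrative variable pointing at the
# cgroup v1 memory controller mount.
CGROUP_ROOT="${CGROUP_ROOT:-/sys/fs/cgroup/memory}"

run_limited() {
    limit_bytes="$1"
    shift

    # One child cgroup per wrapper process, named after its PID.
    cg="$CGROUP_ROOT/mediawiki/job/$$"
    mkdir "$cg" || return 1

    # Apply the memory limit to this child group only.
    echo "$limit_bytes" > "$cg/memory.limit_in_bytes" || return 1

    # Move this shell into the group; the command started below
    # inherits the membership and therefore the limit.
    echo "$$" > "$cg/tasks" || return 1

    "$@"
}
```

a wrapper script would end with something like run_limited 419430400 "$@", so each job gets its own limit under the shared mediawiki/job parent. the byte value is only an example.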
