Last modified: 2014-03-14 10:38:50 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T53935, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 51935 - wm-bot to tools
Status: RESOLVED WONTFIX
Product: Wikimedia Labs
Classification: Unclassified
Component: General
Version: unspecified
Hardware: All All
Importance: Low normal
Target Milestone: ---
Assigned To: Peter Bena
Depends on: 51936 51937 51940 51943 51965 51966
Blocks:
Reported: 2013-07-24 08:44 UTC by Peter Bena
Modified: 2014-03-14 10:38 UTC (History)

Description Peter Bena 2013-07-24 08:44:34 UTC
I need to move it to tools :o
Comment 1 Peter Bena 2014-03-07 20:01:35 UTC
Summary of why wm-bot isn't going to be moved for now:

2 blocker bugs that aren't going to be fixed:

https://bugzilla.wikimedia.org/show_bug.cgi?id=51943 - as long as this isn't fixed, wm-bot gets randomly killed because the Linux kernel allocates a nonsensical amount of vmem for it (this happens for unknown reasons; I need to talk to some kernel gurus to understand why). Until this is fixed, wm-bot would be very unstable.

https://bugzilla.wikimedia.org/show_bug.cgi?id=51936 - no query relaying means that the very useful NetCat module wouldn't work. This would significantly reduce the bot's functionality, which would be very unfortunate. The bug is not going to be fixed, which means wm-bot would have only limited functionality within the tools project.


In addition, there are a number of complex changes that would have to be made in core solely to make it work within the tools project environment (these are rather useless patches that wouldn't need to exist in any other environment). Most of these require dozens of classes to be heavily rewritten and a lot of developer work, which would result only in a simple optimization for the tools project, so the end user of wm-bot wouldn't even see any difference (from my point of view, useless hard work that bears no fruit).

Wm-bot already runs on a separate instance, utilizes it very well, and reacts very badly if the instance is shared or running any other processes. The instance can be very small, since wm-bot naturally requires little CPU and operating memory (vmem is some nonsense calculated by the kernel to which SGEN on tools is bound, so even though wm-bot itself runs perfectly with 500 MB of RAM, it would die OOM even if it allocated 2 GB of RAM on an SGEN box).
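
To illustrate the gap between virtual memory and the RAM a process actually uses, here is a minimal Python sketch that compares the two figures Linux reports in /proc/<pid>/status. It assumes the "vmem" discussed here corresponds to the VmSize field and "operating memory" to VmRSS; the function names and the sample numbers are made up for illustration and are not measurements of wm-bot.

```python
import re


def parse_status_kb(status_text):
    """Extract VmSize and VmRSS (in kB) from /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        match = re.match(r"^(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def vmem_to_rss_ratio(status_text):
    """How many times larger the virtual size is than the resident set."""
    fields = parse_status_kb(status_text)
    return fields["VmSize"] / fields["VmRSS"]


# Hypothetical sample: 2 GB of virtual address space, 512 MB resident.
SAMPLE_STATUS = "Name:\texample\nVmSize:\t 2097152 kB\nVmRSS:\t  524288 kB\n"

if __name__ == "__main__":
    try:
        # Linux only: inspect the current process.
        with open("/proc/self/status") as handle:
            text = handle.read()
    except OSError:
        # Fall back to the hypothetical sample on other systems.
        text = SAMPLE_STATUS
    print(parse_status_kb(text))
```

A limit enforced on the first number can kill a process whose second number is still small, which is the situation described above.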

We are already using separate projects / instances on the Wikimedia project for what is, from my point of view, "useless nonsense": super-huge dumps of third-party wikis, empty instances called just "bob" that nobody knows the purpose of, and under-optimized bots or tools that consume 30000 times more resources than they would need if they were written properly. For this reason I see no reason why wm-bot couldn't have its own super-small instance where it happily lives, instead of being migrated to a complex grid such as Tool Labs, which is more than unsuitable for a bot like this.

On the other hand, I can see a number of reasons why it SHOULD run on a separate instance. One of them is simply that it would need fewer resources. As I already mentioned, wm-bot can happily live with minimal RAM; because of SGEN limitations, however, it would need to request at least 2 GB of VMEM to work, which is significantly more than it needs and a huge waste. Given the bot's architecture, being able to access the instance where it lives is very helpful (not possible on tools), as is being able to set up multiple separate filesystems for different components of the bot (for IO optimizations), which is not possible on labs either.

In a nutshell: running wm-bot on the tools grid is as easy as running an Oracle or PostgreSQL RDBMS on the tools grid.
Comment 2 Peter Bena 2014-03-07 20:06:28 UTC
Btw, that doesn't mean wm-bot is an RDBMS itself, but it consists of many separate components that interact with each other (such as a delayed IO writer, telnet listener, queues, caches, buffers and shared memory pools) that are not easy to spread across multiple servers.
Comment 3 Peter Bena 2014-03-14 09:42:01 UTC
In addition, I started using btrfs snapshots to clone the logs / databases using COW, for user backups as well as system backups, and to generate log tarballs (generating a log tarball may take several minutes, during which the bot is writing to these log files; this caused random issues with tar).

There is no btrfs on tools, and even if there were, this requires root.
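
The snapshot-then-tar approach can be sketched roughly as follows. This is a minimal Python illustration, not the actual wm-bot backup script: the function name and all paths are hypothetical, and the sketch only builds and prints the commands rather than running them, since `btrfs subvolume snapshot` and `btrfs subvolume delete` need root and a btrfs subvolume to work on.

```python
import shlex


def snapshot_backup_commands(subvolume, snapshot, tarball):
    """Build the commands for a consistent tarball of a live btrfs subvolume.

    All three arguments are caller-supplied paths; nothing is executed here.
    """
    return [
        # 1. Take a read-only COW snapshot; the bot keeps writing to the original.
        ["btrfs", "subvolume", "snapshot", "-r", subvolume, snapshot],
        # 2. Archive the frozen snapshot, so the files cannot change under tar.
        ["tar", "-czf", tarball, "-C", snapshot, "."],
        # 3. Drop the snapshot once the tarball has been written.
        ["btrfs", "subvolume", "delete", snapshot],
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    commands = snapshot_backup_commands(
        "/data/logs", "/data/.logs-snapshot", "/backup/logs.tar.gz"
    )
    for command in commands:
        print(shlex.join(command))
```

Because tar reads from the read-only snapshot instead of the live directory, the several minutes it takes no longer overlap with writes to the files being archived.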
Comment 4 Andre Klapper 2014-03-14 10:38:50 UTC
Petr: Thanks for your explanation here. Appreciated!
