Last modified: 2014-10-27 04:02:50 UTC
Email I sent to Yuvi:

tl;dr: grrrit-wm is broken right now: it's not relaying messages, and when we try to kill it with qdel, it goes into some error state and keeps connecting/quitting.

huh (PiRSquared) thinks he broke it when quieting it in another channel; it was down for a while before I came online. He asked if he could upgrade the irc node module it was using, so I said sure since he already had access... that didn't work, so he reverted it, but it's still not working.

I logged into gerrit-to-redis, and based on the log messages it appears to be working fine (I also restarted it just in case, and it was working fine after the restart).

So yeah... halp? <http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-dev/20141026.txt> is our log of trying to figure stuff out.

---

Second email:

To stop the spam, we moved the config.yaml to "config_moved-cuz-bug" so it would stop joining. It also finally dropped out of qstat. At this point I'm not going to touch it until you or someone else has time to debug it...
I reverted the config.yaml move and the bot is now connected to freenode and seems stable. The stream seems broken, though.
So there were two issues:

1) gerrit-to-redis was not running. From https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.gerrit-to-redis/SAL :

19:46 valhallasw: grrrit-wm still not reporting anything; checking redis stream
19:50 valhallasw: gerrit-to-redis reconnects to gerrit every two hours, but does not push to clients (last 'pushed to 1 clients' message was 26-10-2014 04:00:41,241)
19:51 valhallasw: and gerrit-to-redis does not seem to be running ATM - qstat is empty
19:53 valhallasw: job resubmitted; now running on continuous@tools-exec-10; SSH and redis connections are open
19:57 valhallasw: "2014-10-26 19:56:30,333 Pushed to 1 clients"

2) grrrit-wm could not join #wikimedia-growth, and does not handle that correctly. From https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lolrrit-wm/SAL :

20:02 valhallasw: 2014-10-26T20:01:00.911Z - error: prefix=hitchcock.freenode.net, server=hitchcock.freenode.net, command=err_channelisfull, rawCommand=471, commandType=error, args=[grrrit-wm1, #wikimedia-growth, Cannot join channel (+l) - channel is full, try again later]
20:02 valhallasw: but I'm not sure if thta's the actual cause...
20:11 valhallasw: kicked #wikimedia-growth from the config, let's see if that helps
20:15 valhallasw: judging from the log, that seems to have been the issue indeed

The following is in the log:

2014-10-26T02:51:40.222Z - info: joining channels 0=#mediawiki-i18n, 1=#mediawiki-parsoid, 2=#mediawiki-visualeditor, 3=#pywikibot, 4=#semantic-mediawiki, 5=#wikimedia-analytics, 6=#wikimedia-dev, 7=#wikimedia-fundraising, 8=#wikimedia-corefeatures, 9=#wikimedia-labs, 10=#wikimedia-mobile, 11=#wikimedia-operations, 12=#wikimedia-qa, 13=#wikidata, 14=#wikimedia-growth, 15=#wikimedia-multimedia, 16=#wikipedia-en-ambassadors, 17=#wmt, 18=#brickimedia, 19=#mediawiki-feed
2014-10-26T02:51:49.403Z - info: Joined channel #mediawiki-i18n
(...)
2014-10-26T02:51:58.412Z - error: prefix=wilhelm.freenode.net, server=wilhelm.freenode.net, command=err_channelisfull, rawCommand=471, commandType=error, args=[grrrit-wm, #wikimedia-growth, Cannot join channel (+l) - channel is full, try again later]
2014-10-26T02:51:59.411Z - info: Joined channel #wikimedia-multimedia
(...)
2014-10-26T02:52:03.426Z - info: Joined channel #mediawiki-feed
2014-10-26T02:53:52.211Z - info: joining channels 0=#mediawiki-i18n, 1=#mediawiki-parsoid, 2=#mediawiki-visualeditor, 3=#pywikibot, 4=#semantic-mediawiki, 5=#wikimedia-analytics, 6=#wikimedia-dev, 7=#wikimedia-fundraising, 8=#wikimedia-corefeatures, 9=#wikimedia-labs, 10=#wikimedia-mobile, 11=#wikimedia-operations, 12=#wikimedia-qa, 13=#wikidata, 14=#wikimedia-growth, 15=#wikimedia-multimedia, 16=#wikipedia-en-ambassadors, 17=#wmt, 18=#brickimedia, 19=#mediawiki-feed
(repeats)

After kicking #wikimedia-growth out:

2014-10-26T20:12:23.103Z - info: joining channels 0=#mediawiki-i18n, 1=#mediawiki-parsoid, 2=#mediawiki-visualeditor, 3=#pywikibot, 4=#semantic-mediawiki, 5=#wikimedia-analytics, 6=#wikimedia-dev, 7=#wikimedia-fundraising, 8=#wikimedia-corefeatures, 9=#wikimedia-labs, 10=#wikimedia-mobile, 11=#wikimedia-operations, 12=#wikimedia-qa, 13=#wikidata, 14=#wikimedia-multimedia, 15=#wikipedia-en-ambassadors, 16=#wmt, 17=#brickimedia, 18=#mediawiki-feed
2014-10-26T20:12:32.315Z - info: Joined channel #mediawiki-i18n
(..)
2014-10-26T20:12:45.365Z - info: Joined channel #mediawiki-feed
2014-10-26T20:12:45.366Z - info: Joined 19 channels. Starting relay
2014-10-26T20:12:51.784Z - info: Sent message from apps/android/wikipedia to #wikimedia-mobile

So, instead of retrying to join *all* channels when one fails, the bot should set up the relay for the channels it did join, and retry joining only the failed channels.
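To make the proposed behaviour concrete, here is a rough sketch of the bookkeeping. It is written in Python purely as an illustration; the bot itself runs on the irc node module, and JoinState and all of its method names are hypothetical, not taken from the grrrit-wm source. The assumption is that the IRC library reports each join success and each join error (such as 471, channel is full) per channel.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class JoinState:
    """Hypothetical tracker for per-channel join results (not grrrit-wm code)."""

    wanted: List[str]
    joined: Set[str] = field(default_factory=set)
    failed: Set[str] = field(default_factory=set)
    relay_started: bool = False

    def on_join(self, channel: str) -> None:
        """Record a successful join; this clears any earlier failure."""
        self.joined.add(channel)
        self.failed.discard(channel)
        self._maybe_start_relay()

    def on_join_error(self, channel: str) -> None:
        """Record a failed join, e.g. numeric 471 (channel is full)."""
        if channel not in self.joined:
            self.failed.add(channel)
        self._maybe_start_relay()

    def _maybe_start_relay(self) -> None:
        # Start once every wanted channel has an answer and at least one
        # join succeeded; a failed channel no longer blocks the others.
        resolved = self.joined | self.failed
        if self.joined and resolved >= set(self.wanted):
            self.relay_started = True

    def can_relay_to(self, channel: str) -> bool:
        """Messages go only to channels the bot is actually in."""
        return self.relay_started and channel in self.joined

    def channels_to_retry(self) -> List[str]:
        """Only these should be re-joined later, not the whole list."""
        return sorted(self.failed)


state = JoinState(wanted=["#wikimedia-dev", "#wikimedia-growth", "#wikimedia-multimedia"])
state.on_join("#wikimedia-dev")
state.on_join_error("#wikimedia-growth")
state.on_join("#wikimedia-multimedia")
print(state.relay_started, state.channels_to_retry())
```

With something along these lines, a full #wikimedia-growth would only end up in the retry list while the relay starts for everything else. How often to retry, and whether to give up after a number of attempts, is left open here.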
I can't get the patch in from tools and the patch uploader has memory issues, so here's the patch:

commit 419e36d490f14d623633e79cf54dc46bb15dbd94
Author: Merlijn van Deen <valhallasw@gmail.com>
Date:   Sun Oct 26 20:23:16 2014 +0000

    Bug 72523: Remove #wikimedia-growth

diff --git a/config.yaml b/config.yaml
index 01797e0..adb052d 100644
--- a/config.yaml
+++ b/config.yaml
@@ -126,10 +126,6 @@ channels:
         "mediawiki/extensions/ValueView",
         "mediawiki/extensions/Capiunto"
     }
-    "#wikimedia-growth": {
-        "mediawiki/extensions/GuidedTour",
-        "mediawiki/extensions/GettingStarted"
-    }
     "#wikimedia-multimedia": {
         "mediawiki/extensions/CommonsMetadata.*",
         "mediawiki/extensions/MultimediaViewer.*",
Thanks, Merlijn!
Change 168927 had a related patch set uploaded by Legoktm: Bug 72523: Remove #wikimedia-growth https://gerrit.wikimedia.org/r/168927
Change 168927 merged by jenkins-bot: Bug 72523: Remove #wikimedia-growth https://gerrit.wikimedia.org/r/168927
Leaving this open as the actual bot still needs fixing...