Last modified: 2013-01-12 22:24:52 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T40391, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 38391 - spam prevention on beta.wmflabs.org


Summary:	spam prevention on beta.wmflabs.org

Status:	RESOLVED FIXED

Product:	Wikimedia Labs
Classification:	Unclassified
Component:	deployment-prep (beta)
Version:	unspecified
Hardware:	All All

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2012-07-14 05:10 UTC by Jasper Deng
Modified:	2013-01-12 22:24 UTC (History)
CC List:	11 users (show)

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---

Description Jasper Deng 2012-07-14 05:10:11 UTC

The number of spam accounts on the beta.wmflabs.org wikis is staggering, partly because CheckUser is not available to root out their IP addresses as it is for the main WMF wikis.

If, for some reason, CheckUser cannot be enabled, account creation should either go through the ConfirmAccount extension or otherwise require approval from existing users, because this spam is too much to deal with.

Comment 1 Sam Reed (reedy) 2012-07-14 12:52:54 UTC

CheckUser cannot be enabled because it would disclose users' IP addresses to typically non-privileged users.

Comment 2 Jasper Deng 2012-07-14 18:47:30 UTC

A request page on the main Meta can be made for those who want accounts to request one; that'll definitely stop all the spammers.

Comment 3 Peter Bena 2012-07-16 07:51:31 UTC

Creating such a requirement would make the site almost unusable; we should rather think of some advanced extension that would effectively block these accounts. Although this is annoying, production has the same problem; the only difference is that there are far more people there who deal with it. If we built a system that effectively blocks spammers, we could eventually use it on production as well.

Having the RC feed is a first step; I will try to set up a relay to freenode today so that we can easily be notified about spammers.

On the other hand, shell access is currently restricted on beta. Most users who have it are either WMF employees or identified to the Foundation. So we could eventually enable CheckUser for some time, if you believe that this is the only reason why fighting spammers is harder on beta than on production, but this needs to be discussed with Ryan.

Comment 4 Jasper Deng 2012-07-17 01:18:32 UTC

Without CheckUser, there's little the stewards can do.

Comment 5 Ryan Lane 2012-07-17 01:37:54 UTC

The stewards asked for checkuser to be removed...

Comment 6 Jasper Deng 2012-07-17 01:40:06 UTC

My point is that it's not a lack of patrolling users that leads to out-of-control spam; it's the fact that we can't block the underlying IPs except by autoblock. It's also because extensions like AbuseFilter and Titleblacklist don't appear to be functional.

Comment 7 Ryan Lane 2012-07-17 01:46:48 UTC

Indeed. It's non-ideal.

For sure, if AbuseFilter and Titleblacklist aren't working, bugs should be entered for that.

Comment 8 Jasper Deng 2012-07-17 01:47:52 UTC

I don't really think it was the intention to have them work - after all, everything really is just copied from production wikis, right?

Comment 9 Sam Reed (reedy) 2012-07-17 01:48:56 UTC

(In reply to comment #8)
> I don't really think it was the intention to have them work - after all,
> everything really is just copied from production wikis, right?

What's the point of having a test environment if we don't test tools such as spam prevention etc?


Only CheckUser has been specifically disabled, and for a reason.

Comment 10 Jasper Deng 2012-07-17 01:54:29 UTC

https://bugzilla.wikimedia.org/show_bug.cgi?id=38433 filed.

Comment 11 MZMcBride 2012-07-17 01:55:33 UTC

I've changed this bug's summary to "Disable anonymous account creation on beta.wmflabs.org" from "Disable anonymous account creation". Please update the summary if I've misunderstood the request.

Comment 12 Ryan Lane 2012-07-17 01:57:21 UTC

I propose a reverse method of spam cleanup for the test/dev wikis (in addition to all of the normal vandal fighting tools):

Any article/revision not tagged "NOTSPAM" will be deleted/reverted within 24 hours of its creation by a bot.

Does this seem like a legitimate way to handle SPAM seeing as that these are test/dev wikis and the content doesn't hold much value?
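
A minimal sketch of the selection rule proposed above, assuming each revision carries a creation time and a set of tags. The `Revision` type, the `revisions_to_revert` helper, and the in-memory data model are hypothetical illustrations, not part of any existing bot or of MediaWiki; only the "NOTSPAM" tag name and the 24-hour window come from the proposal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Values taken from the proposal: a 24-hour window and a "NOTSPAM" tag.
GRACE_PERIOD = timedelta(hours=24)
NOTSPAM_TAG = "NOTSPAM"


@dataclass(frozen=True)
class Revision:
    """Hypothetical record for one page creation or edit."""
    title: str
    created: datetime
    tags: frozenset = frozenset()


def revisions_to_revert(revisions, now):
    """Return revisions that are at least 24 hours old and lack the NOTSPAM tag."""
    return [
        rev
        for rev in revisions
        if NOTSPAM_TAG not in rev.tags and now - rev.created >= GRACE_PERIOD
    ]
```

A real bot would still need to fetch recent changes from the wiki and perform the delete or revert; this sketch covers only the decision of which revisions qualify.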

Comment 13 MZMcBride 2012-07-17 02:11:04 UTC

(In reply to comment #12)
> I propose a reverse method of spam cleanup for the test/dev wikis (in addition
> to all of the normal vandal fighting tools):
> 
> Any article/revision not tagged "NOTSPAM" will be deleted/reverted within 24
> hours of its creation by a bot.
> 
> Does this seem like a legitimate way to handle SPAM seeing as that these are
> test/dev wikis and the content doesn't hold much value?

Seems reasonable to me, though I'd suggest __NOTSPAM__ so that you can track the pages easily using the built-in magic word tracking page_props table.

And rather than a bot, the wiki could just delete the pages itself. You could just put logic in the code that viewing the page when it's older than 24 hours deletes it. Extension:SelfDestruct or whatever.

Comment 14 Ryan Lane 2012-07-17 02:12:55 UTC

(In reply to comment #13)
> Seems reasonable to me, though I'd suggest __NOTSPAM__ so that you can track
> the pages easily using the built-in magic word tracking page_props table.
> 
> And rather than a bot, the wiki could just delete the pages itself. You could
> just put logic in the code that viewing the page when it's older than 24 hours
> deletes it. Extension:SelfDestruct or whatever.

Well, we don't want to change existing page content or how MediaWiki functions (since this is supposed to be used for testing MediaWiki as close to production as possible), so a bot is likely safer, and using tagging keeps the page content clean.

Comment 15 Jasper Deng 2012-07-17 03:05:39 UTC

I don't believe a bot would be productive; it wouldn't stop the mass creation of spam accounts that pollute the user lists (not all spam accounts match the regex I tried to use in my abuse filter and title blacklist entries).

I also don't see how fully turning off account creation by anonymous users would make the wiki harder to use, especially if the deployment wiki alone was also given the ConfirmAccount extension. Existing global sysops/stewards on the cluster would not have to be subject to any of this; we could also create a global account creator group if someone needs to test account creation.

Comment 16 Ryan Lane 2012-07-17 03:45:35 UTC

The concept of this project is that the wikis are configured as close to production as possible. If they aren't, it ruins testing.

I'm not sure I see the problem in having polluted user listings on a wiki that isn't actually used for content. I see the problem with spam in the content, as it could lead to users clicking on bad things, though. That's why I was suggesting a bot that reverts non-positively patrolled content.

Comment 17 Peter Bena 2012-07-17 06:26:19 UTC

This is a great idea. I have started work on this bot; it will be stored in git once Chad makes a repo for me.

Comment 18 Antoine "hashar" Musso (WMF) 2012-07-17 08:03:44 UTC

I have enabled the SORBS-based auto-blocker with Gerrit change #15768. That should automatically block any users listed in the SORBS list of open HTTP proxies.
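
For illustration, a DNS blacklist check of this kind generally works by reversing the octets of the client's IPv4 address, appending the blacklist zone, and treating a successful DNS resolution as "listed". The sketch below shows that general technique only; it is not the code from the Gerrit change, the function names are hypothetical, and the zone name `http.dnsbl.sorbs.net` is an assumption about which SORBS list is meant.

```python
import ipaddress
import socket

# Assumed zone for the SORBS list of open HTTP proxies.
SORBS_HTTP_ZONE = "http.dnsbl.sorbs.net"


def dnsbl_query_name(ip, zone=SORBS_HTTP_ZONE):
    """Build the DNS name to look up: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError for non-IPv4 input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone=SORBS_HTTP_ZONE):
    """Return True if the blacklist zone resolves a record for this address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No record found. Note that a failed lookup (for example, no network)
        # is also treated as "not listed" in this sketch.
        return False
    return True
```

For example, `dnsbl_query_name("192.0.2.1")` produces `1.2.0.192.http.dnsbl.sorbs.net`, which is the name the resolver would be asked for.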

Comment 19 Platonides 2012-07-17 22:04:47 UTC

I just completed the setup and enabled the captcha in wmflabs. That should solve this issue.

Comment 20 Antoine "hashar" Musso (WMF) 2012-07-18 05:57:33 UTC

Rephrased bug summary.
