Last modified: 2014-04-27 17:18:41 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T61977, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 59977 - Spambots is outdated and gives false positives


Summary:	Spambots is outdated and gives false positives

Status:	ASSIGNED

Product:	Tool Labs tools
Classification:	Unclassified
Component:	WMT bots
Version:	unspecified
Hardware:	All All

Importance:	Normal normal
Target Milestone:	---
Assigned To:	PiRSquared17

URL:
Whiteboard:
Keywords:

Depends on:	60858
Blocks:

Reported:	2014-01-12 19:22 UTC by John F. Lewis
Modified:	2014-04-27 17:18 UTC
CC List:	6 users

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---

Description John F. Lewis 2014-01-12 19:22:54 UTC

The spambot regexes are extremely out of date (more than a year old). A majority of the warnings are false positives, or the only reason for a 'spambots' match is literally two similarly named accounts.

Task list:
[ ] Check if the current regexes have any value anymore
[ ] Update it with new spambots or LTAs. AbuseFilters may help here.

Comment 1 PiRSquared17 2014-02-05 03:38:43 UTC

More ideas:
* AbuseLog
* Steal data from #cvn-sw-spam (LiWa and COIBot)
* Users/IPs creating other users' talk pages
* User-definable regexes, etc. (depends on bug 60858)
* Same page name or edit summary being used xwiki
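
As a rough illustration of the last idea in the list, here is a minimal sketch of flagging edit summaries that recur on several wikis. It assumes events arrive as (wiki, user, summary) tuples; that format and the name find_crosswiki_repeats are hypothetical and are not taken from the bot's actual code.

```python
from collections import defaultdict

def find_crosswiki_repeats(events, min_wikis=2):
    """Return {summary: sorted wiki names} for summaries seen on >= min_wikis wikis.

    events is an iterable of (wiki, user, summary) tuples (assumed format).
    """
    wikis_by_summary = defaultdict(set)
    for wiki, _user, summary in events:
        # Normalise lightly so trivial case/whitespace changes still match.
        key = summary.strip().lower()
        if key:
            wikis_by_summary[key].add(wiki)
    return {
        summary: sorted(wikis)
        for summary, wikis in wikis_by_summary.items()
        if len(wikis) >= min_wikis
    }
```

The same grouping would work for page names by swapping the summary field for the page title.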

Comment 2 PiRSquared17 2014-02-05 03:39:35 UTC

(In reply to comment #1)
> * Users/IPs creating other users' talk pages
Meant to say "Users/IPs creating other users' user pages"

Comment 3 Quentinv57 2014-02-06 05:45:24 UTC

(In reply to comment #1)
> More ideas:
> * AbuseLog
> * Steal data from #cvn-sw-spam (LiWa and COIBot)
> * Users/IPs creating other users' talk pages
> * User-definable regexes, etc. (depends on bug 60858)
> * Same page name or edit summary being used xwiki

We should think about an automatic way to set these regexes. Or at least semi-automatic.

It is indeed easy to know whether a user is a spambot, as it is written in the lock reason entry. So the bot could learn by itself whether a user is a spambot, based on the patterns used and how the account generally behaves.
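
One possible semi-automatic step, sketched under stated assumptions: score each candidate regex against accounts that have already been locked, using the lock reason as the label. The input format (username, lock_reason) and the keyword "spambot" are assumptions for illustration, not the bot's real data model.

```python
import re

def score_patterns(patterns, lock_log, keyword="spambot"):
    """Map each regex to (hits, misses, precision) over locked accounts.

    lock_log is an iterable of (username, lock_reason) pairs (assumed format).
    A hit is a matching username whose lock reason mentions keyword;
    a miss is a matching username locked for some other reason.
    """
    entries = list(lock_log)
    results = {}
    for pattern in patterns:
        regex = re.compile(pattern)
        hits = misses = 0
        for username, reason in entries:
            if regex.search(username):
                if keyword in reason.lower():
                    hits += 1
                else:
                    misses += 1
        total = hits + misses
        # precision is None when the pattern matched nothing at all
        results[pattern] = (hits, misses, hits / total if total else None)
    return results
```

Patterns with low precision, or with no matches at all, would be candidates for removal. Since only locked accounts are scored, this says nothing about matches on legitimate, unlocked users.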

Comment 4 PiRSquared17 2014-03-01 16:40:59 UTC

(In reply to Quentinv57 from comment #3)
> (In reply to comment #1)
> > More ideas:
> > * AbuseLog
> > * Steal data from #cvn-sw-spam (LiWa and COIBot)
> > * Users/IPs creating other users' talk pages
> > * User-definable regexes, etc. (depends on bug 60858)
> > * Same page name or edit summary being used xwiki
> 
> We should think about an automatic way to set these regexes. Or at least
> semi-automatic.
> 
> It is indeed easy to know whether a user is a spambot, as it is written in
> the lock reason entry. So the bot could learn by itself whether a user is a
> spambot, based on the patterns used and how the account generally behaves.

Adding 60858 as dependency since users should be able to add patterns manually as well.

Good ideas.
