Last modified: 2012-11-05 17:58:26 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the bug reports and their history, links might be broken. See T42672, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 40672 - Abuse filter: Increase 5% limit to allow filtering for very short posts
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: ArticleFeedbackv5
Version: unspecified
Hardware: All All
Importance: Highest normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2012-10-01 21:16 UTC by Fabrice Florin
Modified: 2012-11-05 17:58 UTC
CC List: 6 users

Description Fabrice Florin 2012-10-01 21:16:36 UTC
We would like to prevent users from posting feedback that is shorter than 5 characters.

Filter editor Sole Soul moved some code from filter 460 to filter 458 to make that happen:
http://en.wikipedia.org/wiki/Special:AbuseFilter/463

However, that filter very quickly hit the limit of matching more than 5% of actions and was automatically disabled (perhaps unfairly; see related Bug 37615).

To address this issue, Oliver and I tried filtering posts with different numbers of characters, but these actions still resulted in the filter being automatically disabled. 

We would like to find a solution to this issue, either by changing the Regex script, or by solving the related issue for Bug 37615:
https://bugzilla.wikimedia.org/show_bug.cgi?id=37615

You can read more about abuse filter for article feedback on our feature requirements page:
http://www.mediawiki.org/wiki/Article_feedback/Version_5/Feature_Requirements#Abuse.2FSpam_Filters
Comment 1 Andre Klapper 2012-10-22 17:53:33 UTC
(In reply to comment #0)
> We would like to find a solution to this issue, either by changing the Regex
> script, or by solving the related issue for Bug 37615

Fabrice: Who exactly would need to be in for agreement here?
Comment 2 Fabrice Florin 2012-10-22 21:29:13 UTC
Hi Andre, 

I believe we will need to modify the AbuseFilter extension to raise the cutoff for automatically disabling filters above 5% -- possibly up to 10%.

Right now, filter 458 only disallows posts with 2 characters or fewer, because it gets automatically disabled if we try 3, 4 or 5 characters. We really want to disallow posts with 5 characters or fewer ASAP, and ultimately even 10 characters or fewer. One way to accomplish that is to raise the threshold at which filters are disabled.

To quote extension creator Andrew Garrett: "The AbuseFilter has a special mechanism for new filters in which filters that match more than X% of the actions that they are compared against are disabled. It is presumed that any filter that matches more than X% of actions is out of control. The current value of X is 5. In order to determine whether a filter matching more than X% of actions is actually out of control or just unlucky, we need a decent sample size. So the minimum sample size is Y, the variable that we changed from 2 to 25."
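The rule in the quote above can be sketched in a few lines. This is only an illustration of the logic as quoted, not the AbuseFilter extension's actual code: the function name, the parameter names, and the exact comparison operators are assumptions.

```python
def should_emergency_disable(matches: int, actions_checked: int,
                             threshold: float = 0.05,
                             min_sample: int = 25) -> bool:
    """Return True if a filter looks 'out of control' under the quoted rule.

    threshold  -- X in the quote, as a fraction (0.05 == 5%)
    min_sample -- Y in the quote, the minimum number of actions checked
    Both names are hypothetical; they are not the extension's identifiers.
    """
    # Too few actions compared so far: the match rate is not yet meaningful.
    if actions_checked < min_sample:
        return False
    # Disable only when strictly more than X% of checked actions matched.
    return matches / actions_checked > threshold


# With the old minimum sample of 2, a single early match is enough to trip.
print(should_emergency_disable(1, 2, min_sample=2))   # True
# With a minimum sample of 25, the same early match is ignored.
print(should_emergency_disable(1, 2))                 # False
# 2 matches out of 25 actions is 8%, above the 5% cutoff.
print(should_emergency_disable(2, 25))                # True
```

Under these assumptions, a filter that blocks very short feedback trips the rule whenever more than 5% of submitted posts are that short, which matches the behaviour described for filter 458.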

The goal would be to have a higher cut-off for feedback than for edits -- so we don't disrupt the current cutoff used for edits, only increase the cutoff for feedback posts …

We are now waiting for Andrew Garrett and Matthias Mullie to offer a recommendation on that point, as well as assess the complexity of this proposed revision. 

If we're only talking about a couple hours of development, I think we should do it, so we don't have to keep resetting the filters manually. I suspect that we will need a higher limit anyway before we can deploy AFT5 to 100%.
Comment 3 Matthias Mullie 2012-10-23 13:27:08 UTC
I suggest making this configurable per "Filter group".

It makes sense to treat different kinds of "text" (e.g. articles vs feedback) differently.

I've pushed a couple of patches:

* https://gerrit.wikimedia.org/r/#/c/29570/ AbuseFilter change: make it possible for other extensions to define new emergency shutdown values per "filter group"
* https://gerrit.wikimedia.org/r/#/c/29569/ ArticleFeedback change: set different values for AFT than the current AbuseFilter defaults (which are more conservative)
* https://gerrit.wikimedia.org/r/#/c/29571/ Config change: update WMF config to use the above method

The emergency shutdown values for regular article submission would remain unchanged; the values for feedback would become:
- 10% rather than 5%
- sample size from 25 to 50

How does that sound?
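As a rough illustration of the per-group proposal above, the sketch below looks up emergency shutdown values by filter group and falls back to the defaults. The group names "default" and "feedback", the class, and the function names are hypothetical and do not reflect the configuration format used in the Gerrit patches; only the numbers (5%/25 and 10%/50) come from this comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmergencyLimits:
    threshold: float   # fraction of checked actions that may match
    min_sample: int    # actions to check before the threshold applies


# Hypothetical group names; values taken from the proposal in this comment.
LIMITS_BY_GROUP = {
    "default": EmergencyLimits(threshold=0.05, min_sample=25),
    "feedback": EmergencyLimits(threshold=0.10, min_sample=50),
}


def limits_for(group: str) -> EmergencyLimits:
    """Return the limits for a filter group, falling back to the defaults."""
    return LIMITS_BY_GROUP.get(group, LIMITS_BY_GROUP["default"])


def should_emergency_disable(group: str, matches: int,
                             actions_checked: int) -> bool:
    """Apply the group's own cutoff instead of one global value."""
    limits = limits_for(group)
    if actions_checked < limits.min_sample:
        return False
    return matches / actions_checked > limits.threshold


# 4 matches in 50 actions is 8%: over the edit cutoff, under the feedback one.
print(should_emergency_disable("default", 4, 50))    # True
print(should_emergency_disable("feedback", 4, 50))   # False
```

Keeping the defaults in their own entry means groups that define nothing keep the existing, more conservative behaviour.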
Comment 4 Fabrice Florin 2012-10-23 16:40:29 UTC
Thanks, Matthias, this sounds great to me!

Andrew, do these revisions work for you as well?

If so, could you please review them and/or propose edits?

Nicely done!
