Last modified: 2013-03-27 16:28:19 UTC
Implement new abuse filters to discourage more questionable comments:

* Repeating characters
Disallow posts where the same character appears 5 or more times in a row. See edit filter #135, which could be adapted for this feedback filter:
https://en.wikipedia.org/wiki/Special:AbuseFilter/135

* No punctuation
Disallow comments that have no commas, periods, colons, question marks or exclamation marks -- often a telltale sign of irrelevant contributions, according to Stack Overflow co-founder Jeff Atwood.

* Shouting
Give a warning when most of the comment is in ALL CAPS (90% of characters or more), which is usually a telltale sign of questionable feedback. This filter was implemented in April 2012 as filter #458, but disabled by King of Hearts -- and repurposed for short posts by Sole Shoe in August. I recommend we start a new feedback filter for this function, but implement it as a Warning (not Disallow) -- and only for posts that are at least 90% all-caps. FF

* More bad words
We should be able to filter out more offensive words than the small list we now disallow. I can do more research in the coming days and email a larger list to our developer, rather than posting these swear words here.

As a result, I hope we can filter more irrelevant posts than we do now (about 10% of total feedback is now filtered through this tool, and it may be possible to increase that number with a few more reliable filters).

Read more about abuse filters for article feedback on our feature requirements page:
http://www.mediawiki.org/wiki/Article_feedback/Version_5/Feature_Requirements#Abuse.2FSpam_Filters

See also the proposed filters in the 'Under consideration' section of this abuse filter spreadsheet:
https://docs.google.com/a/wikimedia.org/spreadsheet/ccc?key=0AiGAdIp7VYlbdDdKUm9naXhxOXVweWZ5YkU3Wk5lSlE#gid=0
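For illustration only, the three text heuristics proposed above (repeating characters, no punctuation, shouting) could be sketched roughly as follows. This is plain Python, not AbuseFilter rule syntax, and it is not the code of any existing filter. The function names, the action labels and the choice to count only letters for the caps ratio are assumptions; the run length of 5, the punctuation set and the 90% threshold come from the proposal.

```python
import re

# Values taken from the proposal above.
REPEAT_RUN = 5          # same character 5 or more times in a row
CAPS_RATIO = 0.9        # at least 90% upper case
PUNCTUATION = ",.:?!"   # commas, periods, colons, question/exclamation marks


def has_repeating_chars(text: str, run: int = REPEAT_RUN) -> bool:
    """True if any character is repeated `run` or more times in a row."""
    # One character followed by (run - 1) or more copies of itself.
    return re.search(r"(.)\1{%d,}" % (run - 1), text) is not None


def lacks_punctuation(text: str) -> bool:
    """True if the comment contains none of the listed punctuation marks."""
    return not any(ch in PUNCTUATION for ch in text)


def is_shouting(text: str, ratio: float = CAPS_RATIO) -> bool:
    """True if at least `ratio` of the letters are upper case.

    Assumption: only alphabetic characters are counted, so digits and
    spaces do not dilute the ratio.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    upper = sum(1 for ch in letters if ch.isupper())
    return upper / len(letters) >= ratio


def check_feedback(text: str) -> list:
    """Return (filter name, action) pairs for every heuristic that matches.

    The action labels mirror the proposal: the first two filters disallow,
    the shouting filter only warns.
    """
    hits = []
    if has_repeating_chars(text):
        hits.append(("repeating_characters", "disallow"))
    if lacks_punctuation(text):
        hits.append(("no_punctuation", "disallow"))
    if is_shouting(text):
        hits.append(("shouting", "warn"))
    return hits
```

A call such as `check_feedback("Good article, thanks.")` returns an empty list, while an all-caps comment without punctuation matches both the punctuation and shouting heuristics.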
Fabrice: Should this really be highest priority (urgent to fix within the next few days)? It sounds more like an enhancement (that could have high priority, though).
Hi Andrew, thanks for your note.

From the standpoint of Article Feedback, this feature is our highest priority and it attempts to solve a major problem, which is that a large number of comments are inappropriate and can be effectively filtered through this tool.

So I continue to view this ticket as 'major'. It is definitely more than an 'enhancement' from our perspective. But as a compromise, I have adjusted the importance to 'normal', to show that I'm not an unreasonable man. ;o)

I understand that you have different labeling standards for tracking other applications through Bugzilla. If our labeling system is a serious issue for you, we could consider moving to another bug tracking system, so we don't interfere with your ongoing processes. For example, we now use Trello for E2 project management, and could look into migrating out of Bugzilla into Trello over time, if it will make it easier for you. Please let me know if we should consider that option.
I don't see much difference in labeling standards here, actually. :) Improving/clarifying the semantics of "highest" vs "high" priority shouldn't be such a blocker that it forces you to migrate to a different system.
(In reply to comment #2)
> From the standpoint of Article Feedback, this feature is our highest priority
> and it attempts to solve a major problem, which is that a large number of
> comments are inappropriate and can be effectively filtered through this tool.

Bug 37579 and bug 42057 are also highest priority for AFT5. The question boils down to "how many highest priorities can you have at the same time" and "what does highest priority mean in comparison to high priority". Also see
http://lists.wikimedia.org/pipermail/wikitech-l/2012-November/064531.html

> So I continue to view this ticket as 'major'. It is definitely more than an
> 'enhancement' from our perspective. But as a compromise, I have adjusted the
> importance to 'normal', to show that I'm not an unreasonable man. ;o)

If it's "major" priority in the sense of "Major loss of function in an important area." feel free to set priority to "major". If it provides a new functionality that was not available before in your project, it's an "enhancement" by definition.
Is the AbuseFilter really the right way to go about this? Why can't AFTv5 check the spam blacklist rather than [[Special:AbuseFilter/502]]? Using the AbuseFilter seems ok as an interim solution, but AFTv5 should have its own built-in abuse prevention methods, rather than depending on another extension.
Legoktm: why do you feel AbuseFilter should only be an interim solution? AbuseFilter accepts different "categories", so AFT entries are separate from regular text.

Using AbuseFilter to filter spam has 2 distinct advantages over building something into AFT (not to mention the additional work to build it):
- AbuseFilter does not require WMF intervention to add/fix/deploy new rules
- Community is familiar with AbuseFilter already, so more people can contribute

In addition, AFT also checks $wgSpamRegex and SpamBlacklist already.
I think I was mainly concerned about it not having built-in abuse prevention, but I wasn't aware of it checking $wgSpamRegex and SpamBlacklist (I probably should have done some more reading). Your point about the community being more familiar with AF makes a lot of sense, and I now think using the AbuseFilter for this purpose is definitely an advantage.
*** Bug 43417 has been marked as a duplicate of this bug. ***
Repeating characters:
https://en.wikipedia.org/wiki/Special:AbuseFilter/473
Was originally created by rsterbin; re-enabled.

No punctuation or spaces:
https://en.wikipedia.org/wiki/Special:AbuseFilter/520
Added, but not yet enabled; will only enable after the commit that'll display the name of the filter that rejected the feedback (https://gerrit.wikimedia.org/r/#/c/32208/) is merged.

Shouting:
https://en.wikipedia.org/wiki/Special:AbuseFilter/521
Added & enabled.

More bad words:
https://en.wikipedia.org/wiki/Special:AbuseFilter/460
I have merged all "common vandalism" filters containing foul words into this original filter & renamed it to "foul words" (to provide clearer feedback to the user whose feedback is rejected - this commit has not yet been merged).
Please provide a list of additional foul words if you want to expand upon the existing list.

Short posts:
https://en.wikipedia.org/wiki/Special:AbuseFilter/458
I have re-enabled the filter now that the threshold has been upped.

Extremely long words:
https://en.wikipedia.org/wiki/Special:AbuseFilter/502
Just completing the list of currently active filters for AFT ;)

Email address:
https://en.wikipedia.org/wiki/Special:AbuseFilter/463
Just completing the list of currently active filters for AFT ;)
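As a rough illustration of why showing the name of the rejecting filter helps, the sketch below runs a few named checks in order and returns the name of the first one that matches, so the user can be told what to fix. It is plain Python, not the actual rules of filters 458, 502 or 463. The thresholds (`MIN_CHARS`, `MAX_WORD_CHARS`), the email pattern and all function names are hypothetical; only the filter names come from the list above.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical thresholds; the real filter values are not stated in this thread.
MIN_CHARS = 10
MAX_WORD_CHARS = 40

# Deliberately loose, illustrative pattern: something@something.tld
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def too_short(text: str) -> bool:
    """True if the trimmed comment is shorter than MIN_CHARS characters."""
    return len(text.strip()) < MIN_CHARS


def has_long_word(text: str) -> bool:
    """True if any whitespace-separated word exceeds MAX_WORD_CHARS."""
    return any(len(word) > MAX_WORD_CHARS for word in text.split())


def has_email(text: str) -> bool:
    """True if the comment appears to contain an email address."""
    return EMAIL_RE.search(text) is not None


# Ordered list of (display name, predicate); the names mirror the list above.
FILTERS: List[Tuple[str, Callable[[str], bool]]] = [
    ("Short posts", too_short),
    ("Extremely long words", has_long_word),
    ("Email address", has_email),
]


def first_rejecting_filter(text: str) -> Optional[str]:
    """Return the name of the first filter that matches, or None if none do."""
    for name, predicate in FILTERS:
        if predicate(text):
            return name
    return None
```

With these assumed thresholds, `first_rejecting_filter("ok")` returns `"Short posts"`, and a comment that passes every check returns `None`.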
Thank you, Matthias, much appreciated! Do you need help testing this on en-wiki? Perhaps Chris McMahon could help us, if he has time.