Last modified: 2014-08-26 14:34:01 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T71775, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 69775 - Spam blacklist disallows addition of blacklisted links which are already there
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: Spam Blacklist
Version: unspecified
Hardware: All All
Importance: Low normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2014-08-20 09:14 UTC by Dirk Beetstra
Modified: 2014-08-26 14:34 UTC (History)

Description Dirk Beetstra 2014-08-20 09:14:30 UTC
Pages which contain blacklisted links can be edited normally. One can also add an exact duplicate of the blacklisted link elsewhere in the same edit. For example, see https://en.wikipedia.org/w/index.php?title=Eric_Guthrie&diff=prev&oldid=621871444, where CyberBot II is tagging a page containing the blacklisted link http://cfl-scrapbook.no-ip.org/CFL-CanadianQB.php by re-adding the exact same link in a template at the top. Both the old and the newly added links are clickable (see https://en.wikipedia.org/w/index.php?title=Eric_Guthrie&oldid=621871444, click 'show' in the top template).

This is the expected behaviour of the spam-blacklist extension.
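
As a minimal sketch of this expected behaviour (not the extension's actual code), the check can be thought of as comparing the set of links before and after the edit and testing only the newly added ones. The function name `blocked_links` and the one-entry blacklist pattern below are hypothetical and exist only for illustration.

```python
import re

# Hypothetical one-entry blacklist; the real list is a long set of regex fragments.
BLACKLIST = [re.compile(r"\bcfl-scrapbook\.no-ip\.org\b")]


def blocked_links(links_before, links_after):
    """Return the newly added links that match the blacklist (sorted)."""
    # Links already on the page are not re-checked; duplicating one of them
    # does not change the set of links, so nothing counts as "added".
    added = set(links_after) - set(links_before)
    return sorted(url for url in added if any(p.search(url) for p in BLACKLIST))


existing = {"http://cfl-scrapbook.no-ip.org/CFL-CanadianQB.php"}

# Re-adding the exact same link: nothing new, so the edit is allowed.
print(blocked_links(existing, existing))

# Adding the link to a page that did not contain it: the edit is blocked.
print(blocked_links(set(), existing))
```

Under this model an edit is only rejected when the parsed set of links grows by a blacklisted URL, which is why an exact duplicate is normally accepted.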

However, when making the same kind of edit on 4 other pages, CyberBot II is repeatedly blocked; on those pages the expected behaviour does not occur. See https://en.wikipedia.org/w/index.php?title=Special%3ALog&type=spamblacklist&user=Cyberbot_II&page=&year=&month=-1&tagfilter=&hide_patrol_log=1&hide_review_log=1&hide_thanks_log=1

I could reproduce the behaviour: taking http://my.mail.ru/mail/sekhmet_oko/#page=/mail/sekhmet_oko/info?, adding it manually, and then trying to save the page also results in a block by the spam blacklist: see http://my.mail.ru/mail/sekhmet_oko/#page=/mail/sekhmet_oko/info?, last two items by Beetstra
Comment 1 Dirk Beetstra 2014-08-20 09:15:35 UTC
Copy of log CyberBot II (https://en.wikipedia.org/w/index.php?title=Eric_Guthrie&diff=prev&oldid=621871444)

11:30, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Elena Sheynina by attempting to add http://my.mail.ru.
 11:07, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Belfast Harbour by attempting to add http://belfast.ports-guides.com.
 11:05, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Chris Bell (politician) by attempting to add http://www.political.com.
 11:04, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Batik by attempting to add http://www.samuibatik.com.
 08:45, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Elena Sheynina by attempting to add http://my.mail.ru.
 08:23, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Belfast Harbour by attempting to add http://belfast.ports-guides.com.
 08:20, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Chris Bell (politician) by attempting to add http://www.political.com.
 08:19, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Batik by attempting to add http://www.samuibatik.com.
 05:54, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Elena Sheynina by attempting to add http://my.mail.ru.
 05:31, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Belfast Harbour by attempting to add http://belfast.ports-guides.com.
 05:29, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Chris Bell (politician) by attempting to add http://www.political.com.
 05:27, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Batik by attempting to add http://www.samuibatik.com.
 02:55, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Elena Sheynina by attempting to add http://my.mail.ru.
 02:32, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Belfast Harbour by attempting to add http://belfast.ports-guides.com.
 02:29, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Chris Bell (politician) by attempting to add http://www.political.com.
 02:28, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Batik by attempting to add http://www.samuibatik.com.
 23:07, 19 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Elena Sheynina by attempting to add http://my.mail.ru.
 22:22, 19 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Belfast Harbour by attempting to add http://belfast.ports-guides.com.
 22:18, 19 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Chris Bell (politician) by attempting to add http://www.political.com.
 22:16, 19 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Batik by attempting to add http://www.samuibatik.com.
 18:45, 19 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Elena Sheynina by attempting to add http://my.mail.ru.
 18:12, 19 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Belfast Harbour by attempting to add http://belfast.ports-guides.com.
 18:08, 19 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Chris Bell (politician) by attempting to add http://www.political.com.
 18:06, 19 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Batik by attempting to add http://www.samuibatik.com.
 14:20, 19 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Elena Sheynina by attempting to add http://my.mail.ru.
 13:37, 19 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Belfast Harbour by attempting to add http://belfast.ports-guides.com.
 13:33, 19 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Chris Bell (politician) by attempting to add http://www.political.com.
 13:31, 19 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Batik by attempting to add http://www.samuibatik.com.
 10:47, 19 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Elena Sheynina by attempting to add http://my.mail.ru.
 10:23, 19 August 2014 Cyberbot II (talk | contribs | block) caused a spam blacklist hit on Belfast Harbour by attempting to add http://belfast.ports-guides.com.
Comment 2 Dirk Beetstra 2014-08-20 09:16:39 UTC
Copy of relevant items from Beetstra's log (https://en.wikipedia.org/w/index.php?title=Special%3ALog&type=spamblacklist&user=Beetstra&page=&year=&month=-1&tagfilter=&hide_patrol_log=1&hide_review_log=1&hide_thanks_log=1 - link copied wrongly above!!):

 09:16, 19 August 2014 Beetstra (talk | contribs | block) caused a spam blacklist hit on Elena Sheynina by attempting to add http://my.mail.ru.
 09:15, 19 August 2014 Beetstra (talk | contribs | block) caused a spam blacklist hit on Elena Sheynina by attempting to add http://my.mail.ru.
Comment 3 Dirk Beetstra 2014-08-20 09:18:50 UTC
Hmm, making many wrong links here - for below message, the log is at https://en.wikipedia.org/w/index.php?title=Special%3ALog&type=spamblacklist&user=Cyberbot_II&page=&year=&month=-1&tagfilter=&hide_patrol_log=1&hide_review_log=1&hide_thanks_log=1

(In reply to Dirk Beetstra from comment #1)
> Copy of log CyberBot II
> (https://en.wikipedia.org/w/index.
> php?title=Eric_Guthrie&diff=prev&oldid=621871444)
> 
> 11:30, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam
> blacklist hit on Elena Sheynina by attempting to add http://my.mail.ru.
>  11:07, 20 August 2014 Cyberbot II (talk | contribs | block) caused a spam
> blacklist hit on Belfast Harbour by attempting to add
.......
Comment 4 Jackmcbarn 2014-08-22 17:00:59 UTC
I'm almost positive that the problem lies with Cyberbot II and not the blacklist. My best guess is that the URLs contain some sort of special character that makes parsing of the bare version end sooner, so the bare version doesn't match the existing version.
Comment 5 Dirk Beetstra 2014-08-24 04:38:25 UTC
(In reply to Jackmcbarn from comment #4)
> I'm almost positive that the problem lies with Cyberbot II and not the
> blacklist. My best guess is that the URLs contain some sort of special
> character that makes parsing of the bare version end sooner, so the bare
> version doesn't match the existing version.

@Jackmcbarn: have you read the items in Comment 3? Please try to copy and paste the blacklisted link (the one starting with http://my.mail.ru in [[Elena Sheynina]]) into a new empty section and save the page. The exact link that is already there is then blocked. Reverting this to NEW, as what I did was confirm that the bot's blocked edits are a problem that others (that is, me) can also reproduce, and that likely also blocks legitimate edits elsewhere.
Comment 6 Jackmcbarn 2014-08-24 04:40:53 UTC
I just did it, and it wasn't blocked. See https://en.wikipedia.org/w/index.php?title=Elena_Sheynina&diff=622558252&oldid=621640815
Comment 7 Jackmcbarn 2014-08-24 04:42:16 UTC
I think the problem you're seeing is because the link ends with a question mark, and if used as a bare link, the question mark isn't considered part of it, which does indeed make it a different link and correctly disallowed.
Comment 8 Dirk Beetstra 2014-08-24 04:50:17 UTC
So what blocked MY edits there? I copy-pasted the link as well (I tried another route, and was able to save as well). Something in the parsing seems to be strange.
Comment 9 Jackmcbarn 2014-08-24 15:34:39 UTC
The original link in the page ended with a question mark. When you added the link, it was bare, so the question mark wasn't picked up as part of it. See https://en.wikipedia.org/w/index.php?oldid=622612947 for an example. The second link there was basically what was already in the article, and the first one was the one you tried to add. Note that the question mark isn't part of it. To accomplish what you're trying to do, you'd need to add the link the way the third one there does. Unless there's anything I'm still missing, this is RESOLVED WORKSFORME.
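
The effect described here can be sketched as follows. This is an approximation for illustration, not the parser's real code: the function name `bare_link_target` and the exact set of stripped characters are assumptions.

```python
# Assumed set of trailing characters that are not treated as part of a bare link.
TRAILING = ",;.:!?"


def bare_link_target(url_text):
    """Return the URL a bare link would resolve to once trailing punctuation is dropped."""
    trailing = TRAILING
    # Assumption: a closing parenthesis is also dropped unless the URL itself
    # contains an opening one.
    if "(" not in url_text:
        trailing += ")"
    return url_text.rstrip(trailing)


in_article = "http://my.mail.ru/mail/sekhmet_oko/#page=/mail/sekhmet_oko/info?"
as_bare_link = bare_link_target(in_article)

print(as_bare_link)                # same URL without the final "?"
print(as_bare_link == in_article)  # False: the bare link counts as a different, new link
```

Because the two strings differ, the bare version is seen as a newly added link and is checked against the blacklist.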
Comment 10 Dirk Beetstra 2014-08-26 06:42:00 UTC
No, it is not resolved.  It gets weirder and weirder.  I tried to add:

'{{Blacklisted-links|1=
*http://my.mail.ru/mail/sekhmet_oko/#page=/mail/sekhmet_oko/info?
*:''Triggered by <code>\bmy\.mail\.ru\b</code> somewhere''|bot=Cyberbot II|invisible=false}}'

(the template that the bot is supposed to leave) - that does NOT work

Also adding 

'*http://my.mail.ru/mail/sekhmet_oko/#page=/mail/sekhmet_oko/info?'

does NOT work

However adding:

'*[http://my.mail.ru/mail/sekhmet_oko/#page=/mail/sekhmet_oko/info? link]'

does work ([https://en.wikipedia.org/w/index.php?title=Elena_Sheynina&diff=622848973&oldid=622558855 diff]).

There is a difference in how the links are parsed, and in which are blocked and which are not. It may still be a problem with the link itself, but this difference should not exist.
Comment 11 Dirk Beetstra 2014-08-26 07:40:10 UTC
I see that is what you also show in your sandbox edit.
Comment 12 Jackmcbarn 2014-08-26 14:34:01 UTC
That is the exact same thing. Bare links can't end with a question mark. If you want a link to end in a question mark, you have to wrap it in square brackets. That's not a bug, though, so I'm not sure what you're saying the problem is.
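
To illustrate why the bare and bracketed attempts behave differently, here is a rough sketch that extracts links from wikitext in both forms and compares them with what was already on the page. The regular expressions, the helper names `extract_links` and `added_links`, and the assumption that the article originally held the link in bracketed form are all illustrative; the real parser is considerably more involved.

```python
import re

# Simplified patterns: "[url label]" and a bare URL up to whitespace.
BRACKETED = re.compile(r"\[(https?://[^\s\]]+)[^\]]*\]")
BARE = re.compile(r"https?://[^\s\]<>]+")


def extract_links(wikitext):
    """Collect link targets; bare links lose assumed trailing punctuation."""
    links = set(BRACKETED.findall(wikitext))
    remainder = BRACKETED.sub(" ", wikitext)  # avoid counting bracketed URLs twice
    for match in BARE.findall(remainder):
        links.add(match.rstrip(",;.:!?"))
    return links


def added_links(old_text, new_text):
    """Links present after the edit that were not present before it."""
    return extract_links(new_text) - extract_links(old_text)


URL = "http://my.mail.ru/mail/sekhmet_oko/#page=/mail/sekhmet_oko/info?"
old = "[" + URL + " profile]"  # assumed original form in the article

print(added_links(old, old + "\n*" + URL))             # one new link, without the "?"
print(added_links(old, old + "\n*[" + URL + " link]"))  # empty set: nothing new
```

In this model only the bare form introduces a new URL, which would then be tested against `\bmy\.mail\.ru\b` and rejected, while the bracketed form reproduces the existing link exactly.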
