Last modified: 2013-09-27 11:31:26 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T34159, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 32159 - urls should be decoded before regexp matching
Status: PATCH_TO_REVIEW
Product: MediaWiki extensions
Classification: Unclassified
Component: Spam Blacklist
Version: unspecified
Hardware: All All
Importance: Normal major with 1 vote
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://meta.wikimedia.org/wiki/Talk:S...
Depends on:
Blocks:
Reported: 2011-11-02 22:44 UTC by seth
Modified: 2013-09-27 11:31 UTC (History)
CC List: 3 users


Comment 1 seth 2011-11-05 11:36:16 UTC
The sbl extension searches for
  /https?:\/\/+[a-z0-9_\-.]*(\bexample\.com\b)

That means sbl entries always match starting at the domain part of a URL. That is fine in itself, because Google links like the one mentioned above also contain full URLs. The problem is that those URLs are percent-encoded (see [[w:en:Percent-encoding]]) and the sbl extension does no decoding. So
  ...?url=http%3A%2F%2Fwww.example.com
is not recognized as
  ...?url=http://www.example.com

Possible solutions are either
1. letting the regexp pattern start not with
  /https?:\/\/+[a-z0-9_\-.]*(/
  but with
  /https?(?i::|%3a)(?i:\/|%2f){2,}[a-z0-9_\-.]*(/
  or
2. decoding URLs before doing the regexp matching.

(The second option is better because it is more general.)
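
A minimal Python sketch of the two options, for illustration only. The real extension is not written in Python, and the helper names, the `redirect.invalid` host, and the single-pass decoding are all assumptions made for this example. It uses the prefix patterns quoted above with `\bexample\.com\b` as the blacklist entry.

```python
import re
from urllib.parse import unquote

# Blacklist entry as quoted in the comment above.
ENTRY = r"\bexample\.com\b"

# Current prefix: only a literal "http://" or "https://" can start a match.
PLAIN = re.compile(r"https?:\/\/+[a-z0-9_\-.]*(" + ENTRY + ")")

# Option 1: prefix that also accepts percent-encoded ":" and "/".
ENCODED_AWARE = re.compile(
    r"https?(?i::|%3a)(?i:\/|%2f){2,}[a-z0-9_\-.]*(" + ENTRY + ")"
)


def matches_raw(url: str) -> bool:
    """Current behaviour: match the URL exactly as submitted."""
    return PLAIN.search(url) is not None


def matches_option1(url: str) -> bool:
    """Option 1: extended prefix, still no decoding."""
    return ENCODED_AWARE.search(url) is not None


def matches_option2(url: str) -> bool:
    """Option 2: percent-decode once, then apply the unchanged pattern."""
    return PLAIN.search(unquote(url)) is not None


# Hypothetical redirect links wrapping a blacklisted target.
wrapped = "http://redirect.invalid/?url=http%3A%2F%2Fwww.example.com"
# Same target, but with one letter of the domain percent-encoded (%61 is "a").
obfuscated = "http://redirect.invalid/?url=http%3A%2F%2Fwww.ex%61mple.com"

for link in (wrapped, obfuscated):
    print(matches_raw(link), matches_option1(link), matches_option2(link))
```

In this sketch option 1 only covers an encoded scheme separator, while option 2 also catches encoded characters inside the domain, which is one way to read "more general". Decoding a single time would still miss doubly encoded URLs.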
Comment 2 anubhav 2013-04-07 06:44:52 UTC
Review the patch here

https://gerrit.wikimedia.org/r/57904
Comment 3 anubhav 2013-04-07 20:24:46 UTC
Review the patch here

https://gerrit.wikimedia.org/r/#/c/57935/
Comment 4 Andre Klapper 2013-04-07 21:09:24 UTC
anubhav: Mentioning the bug number in the commit message is highly welcome.
