Last modified: 2013-05-02 18:10:38 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T48837, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 46837 - [[Wikipedia]] auto-linking in Bugzilla does not work for non-ascii characters
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Bugzilla
Version: wmf-deployment
Hardware: All All
Importance: Low minor
Target Milestone: ---
Assigned To: Bartosz Dziewoński
Keywords: easy
Depends on:
Blocks:
Reported: 2013-04-03 08:29 UTC by Andre Klapper
Modified: 2013-05-02 18:10 UTC
CC List: 4 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---



Comment 1 Andre Klapper 2013-04-18 14:17:09 UTC
The current regex
  \[\[([a-zA-Z0-9_ ,./'()!#\*\$%:\x80-\xff-]+)\]\]
does not match
  [[vi:Wikipedia:Thảo luận#Patroller và autopatroller]]
because ả and ậ are not included in the allowed character ranges of the regex.

Changing the regex to something less strict, like
  \[\[([a-zA-Z0-9_ ,./'()!#\*\$%:\u0080-\u024f\u1e00-\u1eff-]+)\]\]
would in theory fix that, although it would still not support line breaks.

This works when I test the regex in Firefox with JavaScript, but it does not work in Bugzilla 4.2 (Perl).
Perl does not seem to interpret \u correctly: I also tried
   \u0080-\u00ff
instead of current
   \x80-\xff
which should produce the same successful linking, but it doesn't. I also think we don't want to allow arbitrary characters here (though I'm not aware of any potential security implications).
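The comparison described in this comment can be reproduced with a minimal JavaScript sketch. The variable names are illustrative only, and the test title is written with \u escapes so that the code points involved are explicit.

```javascript
// Character class currently used for [[...]] auto-linking (up to U+00FF only).
const currentRegex = /\[\[([a-zA-Z0-9_ ,.\/'()!#\*\$%:\x80-\xff-]+)\]\]/;

// Proposed class: additionally allows U+0100-U+024F and U+1E00-U+1EFF.
const proposedRegex = /\[\[([a-zA-Z0-9_ ,.\/'()!#\*\$%:\u0080-\u024f\u1e00-\u1eff-]+)\]\]/;

// The link from the report; U+1EA3 and U+1EAD are the two letters
// that fall outside the \x80-\xff range.
const link = "[[vi:Wikipedia:Th\u1ea3o lu\u1eadn#Patroller v\u00e0 autopatroller]]";

const currentMatches = currentRegex.test(link);
const proposedMatches = proposedRegex.test(link);

console.log({ currentMatches, proposedMatches });
```

In JavaScript the current regex rejects the link and the proposed one accepts it. Perl writes code points above 0xFF as \x{1e00} rather than \u1e00, which may explain why the \u ranges had no effect in Bugzilla; that is an assumption and was not verified against the Bugzilla 4.2 installation.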
Comment 2 Andre Klapper 2013-04-18 19:47:54 UTC
https://gerrit.wikimedia.org/r/#/c/54503/
Comment 3 Andre Klapper 2013-04-19 11:05:58 UTC
Referring to MatmaRex's patch:
As the automatic linking in Bugzilla has a &go=Go parameter, creating something like [[vi:Wikipedia:Thảo luận&action=whatever]] could be confusing at a minimum, and I have no idea whether it could be used in a harmful way. Might it make sense to also exclude the ampersand?
Comment 4 Bartosz Dziewoński 2013-04-20 13:38:57 UTC
If anything, the search string should be URL-encoded; titles like [[C&C]] are entirely valid and should be allowed. I'll update my patch to do that, too.
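A rough sketch of what URL-encoding the search string means, written in JavaScript for illustration. The function name `buildSearchUrl` and the base URL are hypothetical; only the &go=Go parameter comes from this report, and the actual patch is not reproduced here.

```javascript
// Hypothetical base URL, used only for illustration.
const SEARCH_BASE = "https://en.wikipedia.org/w/index.php?search=";

// Percent-encode the title so that characters such as "&" and "=" remain
// part of the search value instead of starting a new query parameter.
function buildSearchUrl(title) {
  return SEARCH_BASE + encodeURIComponent(title) + "&go=Go";
}

console.log(buildSearchUrl("C&C"));
console.log(buildSearchUrl("vi:Wikipedia:Th\u1ea3o lu\u1eadn&action=whatever"));
```

With this approach a title like [[C&C]] stays linkable, and an embedded "&action=whatever" ends up inside the search value rather than acting as a separate parameter.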
Comment 5 Bartosz Dziewoński 2013-04-20 13:49:30 UTC
Done in latest patchset (the patch is Ie3dbf0a6, by the way).
Comment 6 Gerrit Notification Bot 2013-04-24 19:12:28 UTC
https://gerrit.wikimedia.org/r/54503 (Gerrit Change Ie3dbf0a68e94db15b1daacea25f443ca2392be96) | change APPROVED and MERGED [by Dzahn]
Comment 7 Bartosz Dziewoński 2013-04-24 19:24:16 UTC
Deployed by mutante, yay.
Comment 8 Gerrit Notification Bot 2013-04-24 19:49:51 UTC
Related URL: https://gerrit.wikimedia.org/r/60701 (Gerrit Change I1eadfecacc5abb7a3c12b0ee5e2ffdfffe1abd81)
Comment 9 Bartosz Dziewoński 2013-04-24 20:30:23 UTC
This seems to break links to sections, and I1eadfeca is intended to fix that: compare [[Cat]], [[Cat#Anatomy]], [[pl:Kot domowy]], [[pl:Kot domowy#Anatomia]], [[pl:Kot domowy#Wędrówki]].
Comment 10 Bartosz Dziewoński 2013-04-24 20:43:39 UTC
Alright, false alarm. Only some links break (such as the ones in bug 45979 comment 0), and apparently only on Opera, so I'm not motivated enough to look into that :). Closing.
Comment 11 Gerrit Notification Bot 2013-04-24 20:43:53 UTC
https://gerrit.wikimedia.org/r/60701 (Gerrit Change I1eadfecacc5abb7a3c12b0ee5e2ffdfffe1abd81) | change ABANDONED [by Matmarex]
