Last modified: 2013-08-22 18:18:10 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T55146, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 53146 - Parsoid: Percent-encode % in URLs
Status: RESOLVED FIXED
Product: Parsoid
Classification: Unclassified
Component: General
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Gabriel Wicke
Keywords: easy
Depends on:
Blocks:
Reported: 2013-08-21 08:56 UTC by Kelson [Emmanuel Engelhart]
Modified: 2013-08-22 18:18 UTC (History)

Description Kelson [Emmanuel Engelhart] 2013-08-21 08:56:11 UTC
On this page:
http://parsoid.wmflabs.org/ko/%ED%95%9C%EC%96%91%EB%8C%80%ED%95%99%EA%B5%90_%EC%B4%9D%ED%95%99%EC%83%9D%ED%9A%8C

You have a link with this href attribute:
href="./한양대학교_총학생회#소리없는_99%의_명예혁명"

As you can see, this is not URL-encoded. The '%' sign is a reserved character and *must* be encoded, IMO.
Comment 1 Gabriel Wicke 2013-08-21 16:40:56 UTC
http://tools.ietf.org/html/rfc3986#section-2.4 agrees with you. I believe we currently only percent-encode % to %25 when followed by hex chars.
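To illustrate the difference, here is a minimal sketch. Both function names are hypothetical and this is not Parsoid's actual code; it only contrasts the behaviour described in this comment (escaping % only when hex digits follow) with escaping every literal %, using the fragment from the report.

```javascript
// Hypothetical helpers for illustration only; not Parsoid's implementation.

// Behaviour as described above: escape '%' only when two hex digits follow it.
function encodePercentBeforeHex(s) {
    return s.replace(/%(?=[0-9A-Fa-f]{2})/g, '%25');
}

// RFC 3986 section 2.4: a '%' used as data must always be written as '%25'.
function encodeEveryPercent(s) {
    return s.replace(/%/g, '%25');
}

const fragment = '소리없는_99%의_명예혁명';
console.log(encodePercentBeforeHex(fragment)); // the '%' stays bare, since '의' is not a hex digit
console.log(encodeEveryPercent(fragment));     // the '%' is written as '%25'
```

The second variant assumes its input is fully decoded text, so every % in it is data; applying it to an already-encoded string would double-encode.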
Comment 2 Kelson [Emmanuel Engelhart] 2013-08-21 17:13:37 UTC
We fixed a really old, but similar, bug in Kiwix three weeks ago in HK... but whereas C++ doesn't have built-in escape/unescape functions, JavaScript does (encodeURIComponent()/decodeURIComponent())... So I was a little bit surprised to catch one like this!
Comment 3 Gabriel Wicke 2013-08-21 17:15:55 UTC
We only use those selectively, as the JS version also encodes chars that don't need to be encoded when using UTF8:

encodeURIComponent('ü')
'%C3%BC'
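One way to apply encodeURIComponent() selectively can be sketched as follows. The helper name is hypothetical and this is an assumption about the general approach, not the code that was merged: it runs the built-in only over ASCII runs, so non-ASCII characters such as 'ü' pass through unchanged while '%' still becomes '%25'.

```javascript
// Hypothetical helper for illustration only; not Parsoid's implementation.
// Applies encodeURIComponent() to runs of ASCII characters and leaves
// non-ASCII characters as they are.
function encodeAsciiOnly(s) {
    return s.replace(/[\x00-\x7F]+/g, (run) => encodeURIComponent(run));
}

console.log(encodeAsciiOnly('ü'));                      // non-ASCII is left as-is
console.log(encodeAsciiOnly('소리없는_99%의_명예혁명')); // only the '%' is escaped
```

Note that this sketch still escapes every ASCII character that encodeURIComponent() escapes (for example '/' and ':'), which may be broader than a real link serializer wants.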
Comment 4 Gerrit Notification Bot 2013-08-22 00:00:46 UTC
Change 80318 had a related patch set uploaded by GWicke:
Bug 53146: Percent-encode fragment identifiers too

https://gerrit.wikimedia.org/r/80318
Comment 5 Gerrit Notification Bot 2013-08-22 03:43:51 UTC
Change 80318 merged by jenkins-bot:
Bug 53146: Percent-encode fragment identifiers too

https://gerrit.wikimedia.org/r/80318
Comment 6 Gabriel Wicke 2013-08-22 18:18:10 UTC
The fix is now deployed.
