Last modified: 2014-10-21 20:44:39 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the bug reports and their history, links might be broken. See T73547, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 71547 - Links with non-ASCII characters do not work
Status: RESOLVED FIXED
Product: OCG
Classification: Unclassified
Component: PDF renderer
Version: unspecified
Hardware: All All
Importance: High major
Target Milestone: ---
Assigned To: C. Scott Ananian
Keywords: i18n
Duplicates: 71589
Reported: 2014-10-02 07:42 UTC by Michael M.
Modified: 2014-10-21 20:44 UTC (History)
Description Michael M. 2014-10-02 07:42:32 UTC
To reproduce, go to https://de.wikipedia.org/wiki/Bundeswettbewerb_Mathematik and download it as a PDF file (using the new rdf2latex writer). Open the created PDF file (I only have Adobe Reader 10.1.2 to test with) and hover over the links. Links containing only ASCII characters work as expected (e.g. "Mathematikwettbewerb" links to "https://de.wikipedia.org/wiki/Mathematikwettbewerb"), but links with non-ASCII characters do not: e.g. "Stifterverband für die Deutsche Wissenschaft" links to "file:///E|/þÿ" (this link seems to be relative to the PDF file). I also tested a random article from el.wikipedia, and all of its links were broken.
Comment 1 Michael M. 2014-10-02 07:49:30 UTC
Hm, this seems to be an issue with Adobe Reader: the first online PDF-to-HTML converter I could find handled the links correctly. Anyway, Adobe Reader should be important enough to make the PDF files compatible with it.
Comment 2 Andre Klapper 2014-10-02 08:51:37 UTC
I can confirm the problem with evince/poppler on Linux. When clicking such a link I get:

Error when getting information for file '/var/tmp/��': No such file or directory
Comment 3 Andre Klapper 2014-10-03 10:02:34 UTC
*** Bug 71589 has been marked as a duplicate of this bug. ***
Comment 4 Michael M. 2014-10-10 07:26:41 UTC
That þÿ at the start of the link seems to be a Byte Order Mark encoded as UTF-16 BE, but interpreted as ISO/IEC 8859-1.
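
The observation above can be checked with a few lines of Python. This is only an illustrative sketch: the variable names and the sample title are chosen for the example, and it does not reproduce what the PDF writer actually emitted. It encodes a string as UTF-16 BE with a leading Byte Order Mark and then decodes the bytes as ISO/IEC 8859-1, the way a viewer misreading the string would.

```python
import codecs

# Sample link text taken from the bug description (illustrative only).
title = "Stifterverband für die Deutsche Wissenschaft"

# UTF-16 BE bytes preceded by the big-endian Byte Order Mark (0xFE 0xFF).
utf16_with_bom = codecs.BOM_UTF16_BE + title.encode("utf-16-be")

# A reader that wrongly treats those bytes as ISO/IEC 8859-1 sees
# 0xFE as "þ" and 0xFF as "ÿ" at the start of the string.
misread = utf16_with_bom.decode("iso-8859-1")
bom_as_latin1 = misread[:2]

print(repr(bom_as_latin1))
```

The first two characters come out as "þÿ", matching the start of the broken link target. The rest of the misread string contains a NUL character before every ASCII letter, which may be why viewers show nothing useful after the first two characters.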
Comment 5 Gerrit Notification Bot 2014-10-10 08:05:04 UTC
Change 165983 had a related patch set uploaded by Cscott:
PDF can't handle UTF-8 URLs.

https://gerrit.wikimedia.org/r/165983
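
For illustration only: one common way to keep a link target ASCII-only is to percent-encode the UTF-8 bytes of its non-ASCII characters. Whether change 165983 does exactly this is not stated in this report; the helper name `ascii_only_url` and the example URL below are assumptions made for the sketch.

```python
from urllib.parse import quote

def ascii_only_url(url: str) -> str:
    """Percent-encode non-ASCII characters (as UTF-8) in a URL.

    Hypothetical helper for illustration, not the code from the patch.
    Reserved URL characters and existing "%" escapes are left untouched.
    """
    return quote(url, safe=":/?#[]@!$&'()*+,;=%")

# Assumed URL for the article mentioned in the description.
example = "https://de.wikipedia.org/wiki/Stifterverband_für_die_Deutsche_Wissenschaft"
encoded = ascii_only_url(example)
print(encoded)
```

The resulting string contains only ASCII characters ("ü" becomes "%C3%BC"), so it can be written into a link annotation without needing a Unicode string encoding.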
Comment 6 Gerrit Notification Bot 2014-10-21 16:20:09 UTC
Change 165983 merged by jenkins-bot:
PDF can't handle UTF-8 URLs.

https://gerrit.wikimedia.org/r/165983
Comment 7 C. Scott Ananian 2014-10-21 20:44:39 UTC
Fix merged and deployed.
