Last modified: 2014-10-28 20:46:51 UTC

Wikimedia has migrated from Bugzilla to Phabricator, where bug reports are now handled.
This static website is read-only and kept for historical purposes. Logging in is not possible, and apart from bug reports and their history, links may be broken. See T72894, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 70894 - Category links that show up as modified and have a "./" in the <link> href serialize to Badtitletext
Status: RESOLVED FIXED
Product: Parsoid
Classification: Unclassified
Component: serializer
Version: unspecified
Hardware: All All
Importance: Highest major
Target Milestone: ---
Assigned To: Parsoid Team
Depends on:
Blocks: 70897
Reported: 2014-09-16 18:16 UTC by Roan Kattouw
Modified: 2014-10-28 20:46 UTC
CC List: 4 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Description Roan Kattouw 2014-09-16 18:16:54 UTC
$ echo '<link rel="mw:PageProp/Category" href="./Category:Toxine_bactérienne"/><link rel="mw:PageProp/Category" href="./Category:Toxine_bact%C3%A9rienne"/>' | node tests/parse.js --html2wt --apiURL=http://fr.wikipedia.org/w/api.php

[[MediaWiki:Badtitletext]]
[[MediaWiki:Badtitletext]]

Serialization works correctly only when the href matches data-parsoid, and that in turn holds only if the client hasn't URL-encoded the href. This is why VE is introducing corruption like https://fr.wikipedia.org/w/index.php?title=Exotoxine&diff=prev&oldid=107508831 , but only in Firefox: Firefox URL-encodes é in hrefs whereas Chrome doesn't.
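
To illustrate the encoding mismatch described above, here is a minimal sketch of comparing two hrefs modulo percent-encoding. The helper names decodeHref and sameLinkTarget are hypothetical and are not part of Parsoid; this only shows why a plain string comparison treats the Firefox and Chrome forms of the same href as different.

```javascript
// Hypothetical helpers for illustration only; not Parsoid's actual code.

// Decode percent-escapes, falling back to the raw string if the
// escape sequence is malformed (decodeURIComponent throws URIError).
function decodeHref(href) {
  try {
    return decodeURIComponent(href);
  } catch (e) {
    return href;
  }
}

// Two hrefs name the same target if they are equal after decoding.
function sameLinkTarget(a, b) {
  return decodeHref(a) === decodeHref(b);
}

// \u00e9 is the precomposed character é.
const chromeHref = './Category:Toxine_bact\u00e9rienne';
const firefoxHref = './Category:Toxine_bact%C3%A9rienne';

console.log(chromeHref === firefoxHref);              // false
console.log(sameLinkTarget(chromeHref, firefoxHref)); // true
```

Decoding both sides before comparing is one possible approach; the fix that was actually merged is in the Gerrit changes linked in the comments.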
Comment 1 ssastry 2014-09-16 18:21:20 UTC
A quick look at wts.LinkHandler.js reveals that modified links go through wikilink content escaping which is where this gets tripped up. And modification detection is based on data-parsoid inspection and comparing with href, etc.

So, something is broken in state.env.isValidLinkTarget(linkTarget) function (used in escapeWikiLinkContentString).
Comment 2 Roan Kattouw 2014-09-16 18:32:03 UTC
(In reply to ssastry from comment #1)
> A quick look at wts.LinkHandler.js reveals that modified links go through
> wikilink content escaping which is where this gets tripped up. And
> modification detection is based on data-parsoid inspection and comparing
> with href, etc.
> 
> So, something is broken in state.env.isValidLinkTarget(linkTarget) function
> (used in escapeWikiLinkContentString).

This is also broken for links with special characters whose hrefs then get URL-encoded. This leads to the links being normalized to underscore form.

$ echo '<a href="../Le_Maillon_faible_%28jeu_t%C3%A9l%C3%A9vis%C3%A9%29" rel="mw:WikiLink" data-parsoid="{&quot;stx&quot;:&quot;piped&quot;,&quot;a&quot;:{&quot;href&quot;:&quot;../Le_Maillon_faible_(jeu_télévisé)&quot;},&quot;sa&quot;:{&quot;href&quot;:&quot;Le Maillon faible (jeu télévisé)&quot;},&quot;dsr&quot;:[133,184,35,2]}" title="Le Maillon faible (jeu télévisé)">Maillon faible</a>' | node tests/parse.js --html2wt --prefix frwiki

[[Le Maillon_faible_(jeu_télévisé)|Maillon faible]]

$ echo '<a href="../Le_Maillon_faible_(jeu_télévisé)" rel="mw:WikiLink" data-parsoid="{&quot;stx&quot;:&quot;piped&quot;,&quot;a&quot;:{&quot;href&quot;:&quot;../Le_Maillon_faible_(jeu_télévisé)&quot;},&quot;sa&quot;:{&quot;href&quot;:&quot;Le Maillon faible (jeu télévisé)&quot;},&quot;dsr&quot;:[133,184,35,2]}" title="Le Maillon faible (jeu télévisé)">Maillon faible</a>' | node tests/parse.js --html2wt --prefix frwiki

[[Le Maillon faible (jeu télévisé)|Maillon faible]]
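
The two runs above differ only in whether the href is percent-encoded. The following sketch shows one way modification detection could tolerate that difference, using the "a" (recorded attribute) and "sa" (source attribute) entries visible in the data-parsoid above. The function names and the fallback normalization are assumptions made for illustration and are not Parsoid's implementation.

```javascript
// Hypothetical sketch; not Parsoid's actual serializer logic.

function safeDecode(s) {
  try {
    return decodeURIComponent(s);
  } catch (e) {
    return s; // malformed escape: keep the raw string
  }
}

// href: the href attribute as received from the client.
// dataParsoidJson: the JSON string stored in the data-parsoid attribute.
function linkTargetForSerialization(href, dataParsoidJson) {
  const dp = JSON.parse(dataParsoidJson);
  const recorded = dp.a && dp.a.href;
  const source = dp.sa && dp.sa.href;
  if (recorded !== undefined && source !== undefined &&
      safeDecode(href) === safeDecode(recorded)) {
    // Unmodified apart from encoding: reuse the original wikitext target.
    return source;
  }
  // Modified: derive a target from the href itself (assumed normalization:
  // strip leading ./ or ../, decode, turn underscores into spaces).
  return safeDecode(href).replace(/^(\.\.?\/)+/, '').replace(/_/g, ' ');
}

const dp = JSON.stringify({
  stx: 'piped',
  a: { href: '../Le_Maillon_faible_(jeu_t\u00e9l\u00e9vis\u00e9)' },
  sa: { href: 'Le Maillon faible (jeu t\u00e9l\u00e9vis\u00e9)' }
});

// Percent-encoded href, as Firefox would send it.
const encoded = '../Le_Maillon_faible_%28jeu_t%C3%A9l%C3%A9vis%C3%A9%29';
console.log(linkTargetForSerialization(encoded, dp));
// Le Maillon faible (jeu télévisé)
```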
Comment 3 Roan Kattouw 2014-09-16 18:38:18 UTC
I thought it was weird the first space didn't get converted to an underscore there, but that seems to be happening in general:

$ echo '<a href="./Le_Maillon_faible_(jeu_télévisé)" rel="mw:WikiLink">Maillon faible</a>' | node tests/parse.js --html2wt --prefix frwiki

[[Le Maillon_faible_(jeu_télévisé)|Maillon faible]]

This happens without the ./ prefix too.
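
For reference, output in which only the first underscore becomes a space is exactly what JavaScript produces when String.prototype.replace is given a string pattern rather than a global regular expression. Whether that was the actual cause in Parsoid is not confirmed by this report; the sketch below only reproduces the symptom.

```javascript
// Illustration of the symptom only; not taken from Parsoid's source.
// \u00e9 is the precomposed character é.
const target = 'Le_Maillon_faible_(jeu_t\u00e9l\u00e9vis\u00e9)';

// A string pattern replaces only the first match.
const firstOnly = target.replace('_', ' ');

// A global regular expression replaces every match.
const allUnderscores = target.replace(/_/g, ' ');

console.log(firstOnly);      // Le Maillon_faible_(jeu_télévisé)
console.log(allUnderscores); // Le Maillon faible (jeu télévisé)
```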
Comment 4 Gerrit Notification Bot 2014-09-16 19:14:18 UTC
Change 160795 had a related patch set uploaded by Subramanya Sastry:
(Bug 70894) Fix bugs serializing modified wikilinks

https://gerrit.wikimedia.org/r/160795
Comment 5 Gerrit Notification Bot 2014-09-17 19:11:18 UTC
Change 160795 merged by jenkins-bot:
(Bug 70894) Fix bugs serializing modified wikilinks

https://gerrit.wikimedia.org/r/160795
Comment 6 Gerrit Notification Bot 2014-09-18 00:11:34 UTC
Change 161141 had a related patch set uploaded by Subramanya Sastry:
(Bug 70894) Fix regressions introduced by 6e302233 (found in RT-testing)

https://gerrit.wikimedia.org/r/161141
Comment 7 Gerrit Notification Bot 2014-09-19 19:40:56 UTC
Change 161141 merged by jenkins-bot:
(Bug 70894) Fix regressions introduced by 6e302233 (found in RT-testing)

https://gerrit.wikimedia.org/r/161141
Comment 8 Gerrit Notification Bot 2014-09-26 22:12:27 UTC
Change 163292 had a related patch set uploaded by Subramanya Sastry:
New parser tests for lang/category/wiki links (wt2wt and html2wt modes)

https://gerrit.wikimedia.org/r/163292
Comment 9 Gerrit Notification Bot 2014-10-28 20:46:51 UTC
Change 163292 merged by jenkins-bot:
New parser tests for lang/category/wiki links (wt2wt and html2wt modes)

https://gerrit.wikimedia.org/r/163292
