Last modified: 2014-09-11 23:30:08 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T71876, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 69876 - Parsoid: Content leaks out of {{nihongo}} template (link inside link)
Status: NEW
Product: Parsoid
Classification: Unclassified
Component: General
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Parsoid Team
Depends on:
Blocks:
Reported: 2014-08-21 21:27 UTC by Roan Kattouw
Modified: 2014-09-11 23:30 UTC (History)

Description Roan Kattouw 2014-08-21 21:27:50 UTC
At http://parsoid-lb.eqiad.wikimedia.org/enwiki/Japan_at_the_2012_Summer_Olympics?oldid=600210359 I observed content leaking out of a template.

The wikitext for the reference in question is:

<ref name=JAAF20120611>[http://www.jaaf.or.jp/fan/news/2012/20120611.html {{Nihongo||ロンドンオリンピック トラック・フィールド種目・競歩種目の日本代表選手|The Japanese national team of Track & Field and Race Walk at the 2012 Olympic Games}}] {{ja icon}}. Japan Association of Athletics Federations (2012-06-11). Retrieved 19 June 2012.</ref>

The resulting Parsoid HTML (cleaned up and indented for readability) is:

<li about="#cite_note-JAAF20120611-7">
  <span rel="mw:referencedBy" data-parsoid="{}">...</span>
  <a rel="mw:ExtLink" href="...">
    <i about="#mwt199" typeof="mw:Transclusion" data-mw="...">
      The Japanese national team of Track &amp; Field and Race Walk at the 2012 Olympic Games
    </i>
    <span style="font-weight: normal" about="#mwt199">
      <span typeof="mw:Entity"> </span>
      (
      <span class="t_nihongo_kanji" lang="ja">
        ロンドンオリンピック トラック・フィールド種目・競歩種目の日本代表選手
      </span>
      <sup class="t_nihongo_help noprint"></sup>
    </span>
  </a>
  <a rel="mw:WikiLink" href="..." data-parsoid="misnested">
    <span class="t_nihongo_icon" style="..." data-parsoid="misnested">
      ?
    </span>
  </a>
  )
  <link rel="mw:PageProp/Category" href="..." data-parsoid="misnested">
  <span class="languageicon" style="..." about="#mwt137" typeof="mw:Transclusion" data-mw="...">
    (Japanese)
  </span>
  <link rel="mw:PageProp/Category" href="..." about="#mwt137">
  . Japan Association of Athletics Federations (2012-06-11). Retrieved 19 June 2012.
</li>

Note that all of the elements that have data-parsoid=misnested on them were generated by the template, but were not part of the template's about group or marked in any other way as having been template-generated. This means that if the user edits the reference (or a bug in VE causes a whitespace edit, see bug 69861), you'll get a dirty diff that expands the second half of the template, like https://en.wikipedia.org/w/index.php?curid=31216768&diff=621842944&oldid=600210359 .
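A quick way to list the leaked elements is to scan the HTML for nodes flagged as misnested that carry no about attribute. This is only a sketch: it assumes the cleaned-up form shown above, where data-parsoid is the literal string "misnested" (in real Parsoid output data-parsoid is a JSON blob), and LeakFinder and SAMPLE are made-up names for illustration.

```python
from html.parser import HTMLParser


class LeakFinder(HTMLParser):
    """Collect start tags flagged as misnested that have no `about` attribute."""

    def __init__(self):
        super().__init__()
        self.leaked = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Assumption: the flag appears as a plain substring, as in the
        # simplified HTML quoted in this report.
        flagged = "misnested" in (attrs.get("data-parsoid") or "")
        if flagged and "about" not in attrs:
            self.leaked.append((tag, attrs.get("rel") or attrs.get("class")))


# Abbreviated copy of the HTML from the description.
SAMPLE = """
<li about="#cite_note-JAAF20120611-7">
  <a rel="mw:ExtLink" href="...">
    <i about="#mwt199" typeof="mw:Transclusion" data-mw="...">text</i>
    <span style="font-weight: normal" about="#mwt199">(</span>
  </a>
  <a rel="mw:WikiLink" href="..." data-parsoid="misnested">
    <span class="t_nihongo_icon" style="..." data-parsoid="misnested">?</span>
  </a>
  )
  <link rel="mw:PageProp/Category" href="..." data-parsoid="misnested">
</li>
"""

finder = LeakFinder()
finder.feed(SAMPLE)
for tag, label in finder.leaked:
    print(tag, label)
```

On the abbreviated sample this reports the wikilink, the icon span, and the first category link, which are the three nodes that fall outside the #mwt199 about group.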

The core of the problem seems to be that {{nihongo}} was used inside of a link, and the template outputs a link itself (a linked, superscripted question mark linking to a help page), so Parsoid was asked to put a link inside of a link and tried to clean that up. Unfortunately it looks like this misnesting cleanup loses template information.
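HTML does not allow an a element inside another a element, which is why the inner link has to be moved somewhere. The following sketch flags that situation in a naive expansion of the markup. It uses Python's html.parser, which reports tags as written and does not apply a browser's tree-fixup rules; NestedLinkDetector is a hypothetical helper, not part of Parsoid.

```python
from html.parser import HTMLParser


class NestedLinkDetector(HTMLParser):
    """Record every <a> that is opened while another <a> is still open."""

    def __init__(self):
        super().__init__()
        self.open_links = []  # href of each currently open <a>
        self.nested = []      # (outer_href, inner_href) pairs

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if self.open_links:
            self.nested.append((self.open_links[-1], href))
        self.open_links.append(href)

    def handle_endtag(self, tag):
        if tag == "a" and self.open_links:
            self.open_links.pop()


detector = NestedLinkDetector()
detector.feed(
    '<a href="http://example.com">legal stuff '
    '<a href="./Help:help">help</a></a>'
)
print(detector.nested)
```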
Comment 1 ssastry 2014-08-21 21:40:42 UTC
The following simplified example reproduces this issue:

[subbu@earth lib] echo '[http://example.com {{echo|legal stuff [[Help:help|help]]}}]' | node parse | node parse --html2wt
[http://example.com {{echo|legal stuff [[Help:help|help]]}}][[Help:help|help]]
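The corruption in the round trip above is a pure suffix: the original wikitext survives intact and the inner link is appended after it. A small check like the following could flag that pattern when comparing input and round-tripped wikitext. The function name leaked_suffix is made up for this sketch, and the two strings are copied from the transcript.

```python
from typing import Optional


def leaked_suffix(original: str, round_tripped: str) -> Optional[str]:
    """Return the text appended after `original`, "" if nothing was appended,
    or None if the round-tripped text does not start with the original."""
    if not round_tripped.startswith(original):
        return None
    return round_tripped[len(original):]


original = "[http://example.com {{echo|legal stuff [[Help:help|help]]}}]"
round_tripped = (
    "[http://example.com {{echo|legal stuff [[Help:help|help]]}}]"
    "[[Help:help|help]]"
)
print(leaked_suffix(original, round_tripped))
```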

We should try to expand the scope of template-affected content to include the outer ext-link, which would prevent this kind of corruption on serialization.
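To illustrate the idea of widening the scope, here is a toy model and not Parsoid's actual code. It assumes a simplified tree where each node only has a name, a parent, and children, and it treats the nearest enclosing link as the new encapsulation root whenever the template-generated subtree itself contains a link. Node, contains_link, and encapsulation_root are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def contains_link(node: Node) -> bool:
    """True if the node or any descendant is an <a> element."""
    return node.name == "a" or any(contains_link(c) for c in node.children)


def encapsulation_root(template_node: Node) -> Node:
    """Widen the template scope to the nearest enclosing link, but only
    when the template output itself contains a link."""
    if not contains_link(template_node):
        return template_node
    ancestor = template_node.parent
    while ancestor is not None:
        if ancestor.name == "a":
            return ancestor
        ancestor = ancestor.parent
    return template_node


# <li><a ext-link><span template output><a wikilink/></span></a></li>
li = Node("li")
ext_link = li.add(Node("a"))
tpl = ext_link.add(Node("span"))
tpl.add(Node("a"))

print(encapsulation_root(tpl) is ext_link)
```

With the outer link as the root, the link and everything in it would be treated as one template-affected unit and serialized from the original wikitext, so the misnesting cleanup would no longer produce nodes outside the marked range.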
