Last modified: 2013-12-04 00:31:25 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T56454, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 54454 - Broken wikitext with bold inside and outside a label wrongly interpreted as correct by Parsoid
Status: RESOLVED FIXED
Product: Parsoid
Classification: Unclassified
Component: General
Version: unspecified
Hardware: All All
Importance: Normal major
Target Milestone: ---
Assigned To: ssastry
 
Reported: 2013-09-22 20:02 UTC by Justin Michrina
Modified: 2013-12-04 00:31 UTC (History)

Description Justin Michrina 2013-09-22 20:02:36 UTC
Example:
Line from "International_Air_Transport_Association_airport_code"

DFW for Dallas–Fort Worth, DTW for Detroit–Wayne County, RDU for Raleigh–Durham, MSP for Minneapolis–St. Paul and LBA for Leeds Bradford (Airport).

Line appears correctly formatted in editor [caps represent bold]:
dfw for Dallas-Fort Worth, dtw for DetroiT-Wayne county, rdu for Raleigh-DUrham, msp for Minneapolis-St. Paul and lba for Leeds Bradford (Airport).

Line appears incorrect on page [caps represent bold]:
dfw for DALLAS-FORT WORTH, dtw for dEtROIT-wAYNE COUNTY, rdu for RALEIGH-DURHAM, msp for mINNEAPOLIS-sT. pAUL and lba for LEEDS BRADFORD (AIRPORT).

Note that the correct version appears in the editor even after it's saved.

Possibly related to bug 53208, but I didn't have time to verify.
Comment 1 James Forrester 2013-09-25 02:20:28 UTC
I have fixed the particular problem with this edit: https://en.wikipedia.org/w/index.php?title=International_Air_Transport_Association_airport_code&diff=574406042&oldid=574202912

I believe this is caused by a bug in Parsoid, which wrongly interprets broken wikitext as valid when it should be treated as invalid. Specifically, it renders:

  '''[[Foo|F'''oo '''B'''ar]]

… as …

  <b><a href=Foo>F</a></b><a href=Foo>oo </a><b><a href=Foo>B</a></b><a href=Foo>ar</a>

… whereas it should result in broken HTML, per the PHP parser (post-Tidy), as:

  <b><a href=Foo>F<b>oo </b>B<b>ar</b></a>
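
To make the difference concrete, here is a minimal Python sketch of how the link label alone behaves if every ''' is treated as a simple bold toggle. This is a toy model for illustration, not MediaWiki's actual quote-handling code; the function names `split_bold_runs` and `to_html` are hypothetical, and the sketch ignores '' italics, ''''' and the unclosed outer ''' that wraps the whole link.

```python
def split_bold_runs(label: str) -> list[tuple[str, bool]]:
    """Split text on ''' markers, flipping the bold flag at each marker.

    Toy model only: real wikitext quote handling also covers '' and '''''.
    """
    runs = []
    bold = False
    for piece in label.split("'''"):
        if piece:
            runs.append((piece, bold))
        bold = not bold  # every ''' toggles the bold state
    return runs


def to_html(runs: list[tuple[str, bool]]) -> str:
    """Render the runs, wrapping bold ones in <b>...</b>."""
    return "".join(f"<b>{text}</b>" if bold else text for text, bold in runs)


# The label part of '''[[Foo|F'''oo '''B'''ar]]
label = "F'''oo '''B'''ar"
print(split_bold_runs(label))
print(to_html(split_bold_runs(label)))
```

Under this model the label's bold runs are "oo " and "ar", which lines up with the inner `<b>` elements in the PHP parser output above, whereas the output attributed to Parsoid bolds "F" and "B" instead.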

The broken wikitext was added in https://en.wikipedia.org/w/index.php?title=International_Air_Transport_Association_airport_code&diff=568156811&oldid=568156261 which isn't tagged as a VisualEditor edit but possibly could have been (and a secondary bug means it isn't tagged) - will investigate separately.
Comment 2 ssastry 2013-10-02 02:26:33 UTC
Tidy should result in non-broken HTML (unless there is a bug in Tidy). I just checked, and Tidy fixes the PHP parser's broken HTML and generates: <p><b><a href="/wiki/Foo" title="Foo" class="mw-redirect">F<b>oo</b> B<b>ar</b></a></b></p> (This can be verified at https://en.wikipedia.org/wiki/User:Ssastry/sandbox)

As for Parsoid, yes, this kind of broken wikitext was not being handled properly so far, but that is set to change with https://gerrit.wikimedia.org/r/#/c/83216/ which is awaiting review.

That patch generates the following HTML for the snippet, which is similar to what Tidy generates:

<p data-parsoid='{"dsr":[0,27,0,0]}'><b data-parsoid='{"autoInsertedEnd":1,"dsr":[0,27,3,0]}'><a rel="mw:WikiLink" href="./Foo" data-parsoid='{"stx":"piped","a":{"href":"./Foo"},"sa":{"href":"Foo"},"dsr":[3,27,6,2]}'>F<b data-parsoid='{"dsr":[10,19,3,3]}'>oo </b>B<b data-parsoid='{"autoInsertedEnd":1,"dsr":[20,25,3,0]}'>ar</b></a></b></p>
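
The `dsr` arrays in that output can be checked against the source snippet. The sketch below assumes `dsr` means [start offset, end offset, opening-markup width, closing-markup width] into the wikitext, which is how the values above appear to line up; the helper name `source_parts` and the dictionary labels are hypothetical, and the JSON strings are copied from the `data-parsoid` attributes shown.

```python
import json

# The wikitext snippet from comment 1 (27 characters long).
WIKITEXT = "'''[[Foo|F'''oo '''B'''ar]]"


def source_parts(wikitext: str, dsr: list[int]) -> tuple[str, str, str]:
    """Return (opening markup, content, closing markup) for one dsr entry.

    Assumes dsr = [start, end, open_width, close_width].
    """
    start, end, open_width, close_width = dsr
    return (
        wikitext[start:start + open_width],
        wikitext[start + open_width:end - close_width],
        wikitext[end - close_width:end],
    )


# data-parsoid values copied from the generated HTML above.
DATA_PARSOID = {
    "outer <b>": '{"autoInsertedEnd":1,"dsr":[0,27,3,0]}',
    "<a>": '{"stx":"piped","a":{"href":"./Foo"},"sa":{"href":"Foo"},"dsr":[3,27,6,2]}',
    "first inner <b>": '{"dsr":[10,19,3,3]}',
    "second inner <b>": '{"autoInsertedEnd":1,"dsr":[20,25,3,0]}',
}

for name, raw in DATA_PARSOID.items():
    info = json.loads(raw)
    print(name, source_parts(WIKITEXT, info["dsr"]), "autoInsertedEnd" in info)
```

Both elements flagged with `autoInsertedEnd` have a closing width of 0, which is consistent with their end tags having no ''' in the source.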
Comment 3 Gabriel Wicke 2013-12-04 00:31:25 UTC
This has long been merged, so closing as fixed. Please reopen if there is still an issue.
