Last modified: 2014-07-31 10:35:25 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T70800, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 68800 - Parsoid list with newlines roundtrip issue: HTML "<ul><li>asd\nsdf</li></ul>" → Wikitext "* asd\nsdf" → HTML "<ul><li>asd</li></ul><p>sdf</p>"
Status: NEW
Product: Parsoid
Classification: Unclassified
Component: General
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Parsoid Team
Reported: 2014-07-29 13:02 UTC by Bartosz Dziewoński
Modified: 2014-07-31 10:35 UTC (History)


Description Bartosz Dziewoński 2014-07-29 13:02:21 UTC
Take this HTML:

<ul><li>asd
sdf</li></ul>

Parse to wikitext:

* asd
sdf

Parse back to HTML:

<ul><li>asd</li></ul>
<p>sdf</p>

I'm not sure what should happen here, but definitely not this.
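
The round trip above can be reproduced with a toy model. This is only a sketch and not Parsoid's actual code: `html_list_to_wikitext` and `wikitext_to_html` are hypothetical helpers, and the parser assumes the simplest possible line-based rule (a line starting with `*` is a list item, any other non-empty line is a paragraph).

```python
import re


def html_list_to_wikitext(html: str) -> str:
    """Naively serialize <ul><li>...</li></ul> to '* ...' lines (toy model)."""
    items = re.findall(r"<li>(.*?)</li>", html, flags=re.DOTALL)
    return "\n".join("* " + item for item in items)


def wikitext_to_html(wikitext: str) -> str:
    """Toy line-based parser: '*' lines are list items, other lines are paragraphs."""
    out = []
    in_list = False
    for line in wikitext.split("\n"):
        if line.startswith("*"):
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append("<li>" + line[1:].strip() + "</li>")
        else:
            # Any non-list line ends the list, which is where the item gets split.
            if in_list:
                out.append("</ul>")
                in_list = False
            if line.strip():
                out.append("<p>" + line.strip() + "</p>")
    if in_list:
        out.append("</ul>")
    return "".join(out)


original = "<ul><li>asd\nsdf</li></ul>"
wikitext = html_list_to_wikitext(original)
round_tripped = wikitext_to_html(wikitext)
print(repr(wikitext))       # '* asd\nsdf'
print(round_tripped)        # <ul><li>asd</li></ul><p>sdf</p>
```

The serializer copies the newline into the wikitext unchanged, and because list items in wikitext are line-based, the second line is no longer part of the item when it is parsed back.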

It's rather easy to run into this in VisualEditor – take a paragraph with newlines and convert it to a list item. I ran into it making this edit: https://en.wikipedia.org/w/index.php?title=Polish_nationality_law&diff=prev&oldid=618961884 (I manually replaced the newlines with spaces before saving).
Comment 1 ssastry 2014-07-31 10:21:48 UTC
Yes, this is a known issue. Parsoid currently cannot take arbitrary HTML and convert it to wikitext in a way that preserves rendering on the html -> wt -> html path. We've discussed this issue more generally in the past and will address it, including fallback mechanisms where some forms of HTML will have to be serialized as HTML tags rather than native wikitext. I thought we had a tracking bug or a related set of bugs for this, but I cannot find it right now.

We should identify any other related breakages that arise from within VE (which doesn't necessarily generate arbitrary HTML) and fix them together in Parsoid. That fix would be simpler than supporting generic HTML -> wt conversion that survives an html2html transformation.
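
A minimal sketch of the kind of fallback mentioned above, assuming the simplest possible trigger (any list item containing a newline). `serialize_list` is a hypothetical helper, not a Parsoid function; it relies only on the fact that wikitext accepts literal `<ul>` and `<li>` tags.

```python
from typing import List


def serialize_list(items: List[str]) -> str:
    """Serialize list items to wikitext, falling back to HTML tags when needed."""
    if any("\n" in item for item in items):
        # Native '*' syntax is line-based and cannot hold a newline inside an
        # item, so emit literal HTML list tags instead (assumed fallback rule).
        return "<ul>" + "".join("<li>" + item + "</li>" for item in items) + "</ul>"
    return "\n".join("* " + item for item in items)


print(serialize_list(["asd", "sdf"]))   # native wikitext, one '*' line per item
print(serialize_list(["asd\nsdf"]))     # HTML-tag fallback, newline kept in the item
```

The trade-off is that the fallback output is less readable as wikitext, which is why it would only apply to content that native syntax cannot represent.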
Comment 2 Bartosz Dziewoński 2014-07-31 10:27:23 UTC
In cases like this, it would probably be reasonable to just convert newlines to spaces at some point (either in VisualEditor or in Parsoid).

Perhaps VisualEditor would be a better place to implement this from the user's perspective, but Parsoid doing what it does now would still be weird :) – maybe we should just fix this in both places?
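
The newline-to-space idea could look roughly like this. It is a sketch under the assumption that the list item holds plain text only; `collapse_newlines` and `list_item_to_wikitext` are hypothetical names and do not correspond to any existing VisualEditor or Parsoid API.

```python
import re


def collapse_newlines(text: str) -> str:
    """Replace each run of whitespace that contains a newline with one space."""
    return re.sub(r"\s*\n\s*", " ", text).strip()


def list_item_to_wikitext(text: str) -> str:
    """Serialize one list item on a single '*' line (plain-text items only)."""
    return "* " + collapse_newlines(text)


print(list_item_to_wikitext("asd\nsdf"))   # * asd sdf
```

This changes the source whitespace but not the rendering, since a single newline inside a paragraph or list item renders as a space anyway.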
Comment 3 ssastry 2014-07-31 10:35:25 UTC
This will need a Parsoid fix, since other Parsoid users might still give it HTML that won't be preserved in the html -> wt -> html transformation.

VE can decide independently whether to fix this on its side.
