Last modified: 2014-07-25 16:29:17 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and links other than those displaying bug reports and their history might be broken. See T60059, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 58059 - Adaptive pre serialization
Status: NEW
Product: Parsoid
Classification: Unclassified
Component: serializer
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Parsoid Team
Depends on:
Blocks:
Reported: 2013-12-05 22:56 UTC by Gabriel Wicke
Modified: 2014-07-25 16:29 UTC (History)

Description Gabriel Wicke 2013-12-05 22:56:13 UTC
In HTML, pre can contain nested elements and formatting. This is also true in VE. In wikitext, formatting in pre is only supported when using the indent-pre serialization. The downsides of the indent-pre serialization are that

1) HTML tags in text need to be entity-escaped (ugly wikitext), and 
2) trailing newlines require workarounds with <nowiki/> or <br>.

The best serialization strategy thus depends on the content of the pre. We should serialize modified or new pre elements to

1) html-syntax pre if the text content contains html tags but no elements, and
2) indent pre syntax if the content contains elements. Trailing newlines in HTML "<pre>foo\n\n</pre>" need to be protected with a trailing <nowiki/> as in " foo\n \n <nowiki/>".
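The two rules above can be sketched roughly as follows. This is an illustrative string-level sketch in Python, not Parsoid's serializer (which is JavaScript and works on a DOM). The names `choose_pre_syntax`, `to_indent_pre` and `TAG_LIKE` are hypothetical, the tag detection is a simple regex heuristic, and the fallback to indent-pre for plain text without tags or elements is an assumption that the description does not state.

```python
import re

# Heuristic (assumption): text "contains html tags" if it has something
# that looks like <name ...> or </name>.
TAG_LIKE = re.compile(r"</?[A-Za-z][^<>]*>")


def choose_pre_syntax(text_content: str, has_child_elements: bool) -> str:
    """Return "html" or "indent" for a new or modified pre element."""
    if has_child_elements:
        # Rule 2: formatting inside pre is only supported by indent-pre.
        return "indent"
    if TAG_LIKE.search(text_content):
        # Rule 1: avoid entity-escaping tag-like text in the wikitext.
        return "html"
    # Not covered by the description; indent-pre is assumed as the default.
    return "indent"


def to_indent_pre(content: str) -> str:
    """Indent every line by one space; protect trailing newlines.

    `content` is assumed to be the already-serialized wikitext of the
    pre's children.
    """
    out = "\n".join(" " + line for line in content.split("\n"))
    if content.endswith("\n"):
        # The last indented line would otherwise be empty and get lost.
        out += "<nowiki/>"
    return out
```

With these definitions, `to_indent_pre("foo\n\n")` yields `" foo\n \n <nowiki/>"`, matching the example in rule 2.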
Comment 1 Gabriel Wicke 2013-12-05 22:59:41 UTC
Another reason to pick html-syntax pres is if the text-only content contains wikitext syntax that would otherwise be <nowiki>-escaped.
Comment 2 Gabriel Wicke 2013-12-05 23:07:21 UTC
We should also strip a trailing <nowiki/> token when handling an indent pre in the pre handler, so that it does not show up in HTML.
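As a rough illustration of that stripping step, here is a string-level sketch in Python. It is not Parsoid's pre handler, which operates on tokens; the function name `from_indent_pre` is hypothetical, and the input is assumed to be the raw lines of a single indent-pre block.

```python
NOWIKI = "<nowiki/>"


def from_indent_pre(wikitext: str) -> str:
    """Recover pre content from indent-pre wikitext (sketch only)."""
    # Drop the single leading space that marks each indent-pre line.
    lines = [line[1:] if line.startswith(" ") else line
             for line in wikitext.split("\n")]
    content = "\n".join(lines)
    # Strip a trailing <nowiki/> so that it does not show up in the HTML.
    if content.endswith(NOWIKI):
        content = content[:-len(NOWIKI)]
    return content
```

For example, `from_indent_pre(" foo\n \n <nowiki/>")` gives back `"foo\n\n"`, the content of `<pre>foo\n\n</pre>`.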
Comment 3 ssastry 2013-12-05 23:36:42 UTC
We currently lose trailing newlines. Test case below to add to parser tests.
[subbu@earth tests] echo "<pre>foo\n\n\n</pre>" |node parse --html2wt
 foo
 
[subbu@earth tests]
Comment 4 Gabriel Wicke 2013-12-05 23:37:13 UTC
Also: echo -e '<pre>foo\n\n\n\n\n\n</pre>' | node parse --wt2wt
Comment 5 Gabriel Wicke 2013-12-06 19:39:21 UTC
(In reply to comment #4)
> Also: echo -e '<pre>foo\n\n\n\n\n\n</pre>' | node parse --wt2wt

This is actually bug 50906.
Comment 6 ssastry 2014-07-25 16:29:17 UTC
This is part of the set of bugs for improving serialization support for arbitrary HTML. We should get to this, but after we get through the current round of bugs for rendering. Hence marking this as enhancement.
