Last modified: 2013-12-03 23:50:28 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from bug reports and their history, links might be broken. See T53004, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 51004 - Exclude outer whitespace from headings and list items
Status: NEW
Product: Parsoid
Classification: Unclassified
Component: DOM
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Gabriel Wicke
Reported: 2013-07-09 03:12 UTC by Gabriel Wicke
Modified: 2013-12-03 23:50 UTC (History)

Description Gabriel Wicke 2013-07-09 03:12:29 UTC
We currently include purely syntactic whitespace in the DOM, which makes life for VE and other clients harder than necessary. Instead, we should abstract purely syntactic whitespace and match the PHP parser's output.

Test cases:
== Foo == 
should parse to <h2>Foo</h2> instead of <h2> Foo </h2>

* foo
should parse to <ul><li>foo</li></ul> instead of <ul><li> foo</li></ul>
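
The two test cases above can be expressed as a small, self-contained check. This is only a sketch of the expected output, not Parsoid's tokenizer: `render_line` and both regular expressions are hypothetical names introduced here, and they handle just a single heading line or a single `*` list item.

```python
import re

# Hypothetical, simplified patterns; real wikitext parsing is far more involved.
HEADING_RE = re.compile(r"^(={1,6})(.*?)\1\s*$")
LIST_ITEM_RE = re.compile(r"^\*(.*)$")


def render_line(line: str) -> str:
    """Render one wikitext line, dropping purely syntactic outer whitespace."""
    m = HEADING_RE.match(line)
    if m:
        level = len(m.group(1))
        # Whitespace between the '=' markers and the text is syntax, not content.
        return f"<h{level}>{m.group(2).strip()}</h{level}>"
    m = LIST_ITEM_RE.match(line)
    if m:
        # The space after the bullet is syntax as well.
        return f"<ul><li>{m.group(1).strip()}</li></ul>"
    return line


print(render_line("== Foo == "))
print(render_line("* foo"))
```

The trailing space after the closing `==` in the first input is deliberate, since it appears in the test case as reported.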
Comment 1 ssastry 2013-07-20 14:21:11 UTC
Isn't this a more generic problem that is not limited to lists and headings?  It seems we should trim whitespace from all first/last child text nodes of all non-pre elements. Otherwise, it doesn't really benefit VE, for example, since they would still have to maintain whitespace information and restore it on save.

This normalization would then mean that only selser can reserialize content without introducing dirty diffs. If we want the regular serializer to preserve whitespace, we would have to record details of the normalized whitespace in data-parsoid.
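
A rough sketch of the idea in this comment, under stated assumptions: it trims the first and last child text nodes of every non-pre element and records what was removed so that a serializer could put it back. It uses Python's `xml.etree.ElementTree` rather than Parsoid's DOM, and the attribute name `data-ws-sketch` is made up for illustration; it is not the actual data-parsoid format.

```python
import json
import xml.etree.ElementTree as ET

WS_ATTR = "data-ws-sketch"  # hypothetical attribute, stands in for data-parsoid


def trim_outer_whitespace(el: ET.Element) -> None:
    """Trim leading/trailing whitespace of non-pre elements and record it."""
    if el.tag == "pre":
        return  # whitespace inside <pre> is significant, leave the subtree alone
    children = list(el)
    for child in children:
        trim_outer_whitespace(child)
    lead = trail = ""
    if el.text:  # first child text node
        stripped = el.text.lstrip()
        lead = el.text[: len(el.text) - len(stripped)]
        el.text = stripped
    if children:  # last child text node is the tail of the last child element
        last = children[-1]
        if last.tail:
            stripped = last.tail.rstrip()
            trail = last.tail[len(stripped):]
            last.tail = stripped
    elif el.text:  # no child elements: the only text node is also the last one
        stripped = el.text.rstrip()
        trail = el.text[len(stripped):]
        el.text = stripped
    if lead or trail:
        el.set(WS_ATTR, json.dumps([lead, trail]))


def restore_outer_whitespace(el: ET.Element) -> None:
    """Undo trim_outer_whitespace using the recorded whitespace."""
    for child in el:
        restore_outer_whitespace(child)
    recorded = el.attrib.pop(WS_ATTR, None)
    if recorded is None:
        return
    lead, trail = json.loads(recorded)
    el.text = lead + (el.text or "")
    if len(el):
        el[-1].tail = (el[-1].tail or "") + trail
    else:
        el.text += trail


source = "<ul><li> foo <b>bar</b> </li><pre> x </pre></ul>"
root = ET.fromstring(source)
trim_outer_whitespace(root)
restore_outer_whitespace(root)
print(ET.tostring(root, encoding="unicode") == source)
```

Without the recorded attribute, the trimmed tree alone would not be enough to reproduce the original whitespace, which is the dirty-diff concern raised above.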
Comment 2 Gabriel Wicke 2013-12-03 23:50:28 UTC
https://gerrit.wikimedia.org/r/#/c/96790/ did some related work in the serializer, but did not change the DOM representation yet.
