Last modified: 2013-12-05 23:30:22 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T57566, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 55566 - migrateTrailingNLs pass may be unnecessary: Investigate removal
migrateTrailingNLs pass may be unnecessary: Investigate removal
Status: RESOLVED INVALID
Product: Parsoid
Classification: Unclassified
DOM (Other open bugs)
unspecified
All All
: Normal normal
: ---
Assigned To: ssastry
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-10-10 16:14 UTC by ssastry
Modified: 2013-12-05 23:30 UTC (History)
2 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description ssastry 2013-10-10 16:14:22 UTC
migrateTrailingNLs DOM pass was needed at some point to deal with some bugs / inadequacies in our later dom passes (encap/dsr/..). I think those conditions are no longer present now. Worth experimenting with getting rid of that pass altogether and testing.
Comment 1 ssastry 2013-10-30 15:18:58 UTC
I verified that this is necessary to make sure DSR boundaries for nodes does not swallow trailing newlines in certain cases where the parser transformations have not done a clean job of closing tags (or when closing tags might be missing). 

This especially affects selser output with a bunch of bug fixes and selser improvements and simplifications coming up in a new patch.
Comment 2 Gabriel Wicke 2013-12-05 23:28:12 UTC
[15:25] <gwicke> I have a vague idea of what you might mean in https://bugzilla.wikimedia.org/show_bug.cgi?id=55566#c1, but if you have a more concrete example then that might be helpful
[15:26] <subbu> gwicke, simple way to do it is turn it off in mediawiki.DOMPostProcessor.js and look at parser test results
[15:27] <subbu> basically .. <li>foo\n<li> vs. <li>foo</li>\n  ... we simplified selser based on assuming the latter.
[15:27] <subbu> similar for other nodes as well
Comment 3 ssastry 2013-12-05 23:30:22 UTC
commit 8fa522eb2652dac8b091240037ff076db1088eec
Author: Subramanya Sastry <ssastry@wikimedia.org>
Date:   Tue Oct 29 13:28:12 2013 -0500

    Bunch of improvements and tweaks to selser/wt2wt sep handling
    
    .... <snip> ....
    
    WikitextSerializer (selser)
    ---------------------------
    * Stripping leading/trailing separators on original source of
      a selsered node is unnecessary (and actually buggy). Our DSR
      computation is tight, that is, DSR ranges for a DOM node does
      not surround leading or trailing newlines. So, the original
      source will not have any leading or trailing newlines from
      separators outside the node itself. Newline separators found
      in the DOM subtree apply to parent--firstchild and
      lastchild--parent pairs, and they ought to continue to be
      present when selser outputs the original source.
    
      Since this is being done on a node with valid DSR that has
      not been modified, not even the wrapper node could have been
      modified -- the entire subtree is completely unedited.
      So, HTML --> wikitext edits of wrappers will not go through
      this code route either.
    
      This simplifies the selser code path some more and also
      gets a whole bunch of new selser tests passing.

   .... <snip> ....

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links