Last modified: 2014-08-22 14:41:41 UTC
We currently have a regexp-based newline serialization hack that converts non-IEW whitespace (whitespace that is not inter-element whitespace) to a single space. The regexp no longer works well with the XML serializer, because > is no longer entity-escaped. We should normalize this on the DOM instead, where IEW vs. non-IEW information is readily available.
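A rough sketch of what DOM-based normalization could look like. This is not the Parsoid code: it uses Python's xml.dom.minidom for illustration, the function names are hypothetical, and it assumes that any whitespace-only text node counts as IEW, which is a simplification. Because the walk operates on text nodes rather than on serialized markup, an unescaped > in text content cannot confuse it.

```python
import re
from xml.dom import minidom, Node

# One or more whitespace characters, including newlines.
WS_RUN = re.compile(r"\s+")


def is_iew(node):
    """Assumption for this sketch: a whitespace-only text node is IEW."""
    return node.nodeType == Node.TEXT_NODE and node.data.strip() == ""


def normalize_whitespace(node):
    """Collapse whitespace runs to one space in non-IEW text nodes.

    IEW text nodes are left untouched; elements are walked recursively.
    """
    for child in node.childNodes:
        if child.nodeType == Node.TEXT_NODE:
            if not is_iew(child):
                child.data = WS_RUN.sub(" ", child.data)
        elif child.nodeType == Node.ELEMENT_NODE:
            normalize_whitespace(child)


doc = minidom.parseString("<div>\n<p>a\n  b</p>\n<p>x</p>\n</div>")
normalize_whitespace(doc.documentElement)
```

Here the newline and indentation inside the first paragraph's text collapse to a single space, while the newline-only text nodes between the elements are kept as they were.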
See also bug 63195, which is a more general issue that may also include IEW normalization.
The newline normalization was turned off in commit fc153752d4a8cf3c865f02d0303c8bd1529b3162
Change 155639 had a related patch set uploaded by Cscott: Don't strip newlines within text content. https://gerrit.wikimedia.org/r/155639
Change 155639 merged by Cscott: Improve whitespace normalization for parser tests. https://gerrit.wikimedia.org/r/155639