Last modified: 2014-02-12 23:40:02 UTC
Try with leading space: 1<a b>3
Confirmed as a syntactic diff. Lower priority for now.
No, this is not a syntactic diff. The first leading space triggers <pre>, so any whitespace that follows is displayed.
Yep, you are right. An extra space inside a pre renders differently, so it changes the semantics.
Current status:
echo -e ' 1<a\n b>3' | nodejs parse --wt2wt
*********** ERROR: cs/s mismatch for node: PRE s: 1; cs: 0 ************
1<a b>3
Notice the extra space before b>3.
This bug has been fixed for a subset of snippets.
[subbu@earth tests] echo ' 1<c\n b>3' | node parse --wt2wt
1<c b>3
[subbu@earth tests] echo ' 1<a\n b>3' | node parse --wt2wt
WARNING: DSR inconsistency: cs/s mismatch for node: PRE s: 1; cs: 0
1<a b>3
In the first case, 'c' is not a valid HTML tag, so it is immediately converted to text and handled properly. But 'a' is a valid HTML tag and is not converted to text until it hits the sanitizer, which is too late for the pre-handler, since the pre-handler runs before the sanitizer.
Moving the pre-handler after the sanitizer, along with some minor tweaks to the sanitizer code (Util.newlinesToNlTks(token-to-text)), fixes this. Not committing yet since this needs to be thought through, tested, and verified. But I am recording this while the experiment is fresh in my mind, in case we cannot get around to it right away.
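To make the ordering issue concrete, here is a toy model of the two stage orders. It is only a sketch: the token shapes, the ALLOWED_TAGS list, and the function names sanitize and countIndentedLines are all hypothetical and are not Parsoid's actual code. It assumes a pre-handler that inspects text tokens only, and a sanitizer that turns disallowed tags back into text.

```javascript
// Toy model only: token shapes and function names are hypothetical,
// not Parsoid's actual data structures or API.
const ALLOWED_TAGS = new Set(['b', 'i', 'span']); // assumed whitelist; 'a' is absent

// Sanitizer stage: a tag that is not allowed becomes a plain text token.
function sanitize(tokens) {
  return tokens.map((t) =>
    t.type === 'tag' && !ALLOWED_TAGS.has(t.name)
      ? { type: 'text', value: t.src }
      : t
  );
}

// Pre-handler stage (simplified): count lines that start with a space,
// looking only at text tokens. Tag tokens are opaque to this stage.
function countIndentedLines(tokens) {
  const visible = tokens
    .map((t) => (t.type === 'text' ? t.value : ''))
    .join('');
  return visible.split('\n').filter((line) => line.startsWith(' ')).length;
}

// Tokens for the input ' 1<a\n b>3', with '<a\n b>' tokenized as a tag.
const tokens = [
  { type: 'text', value: ' 1' },
  { type: 'tag', name: 'a', src: '<a\n b>' },
  { type: 'text', value: '3' },
];

// Pre-handler before sanitizer: the newline and space inside the tag are hidden.
const preFirst = countIndentedLines(tokens);
// Sanitizer before pre-handler: the tag is already text, so both lines are seen.
const sanitizeFirst = countIndentedLines(sanitize(tokens));

console.log(preFirst, sanitizeFirst);
```

In this model, running the pre-handler first sees one indented line, while running it after the sanitizer sees two, which is the kind of disagreement the proposed reordering is meant to remove.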
Lowering priority as this is mostly working, and not a very common thing.