Last modified: 2014-09-30 16:05:36 UTC
In the following wikitext, a Parsoid round-trip results in the closing blockquote tag being lost:

: <blockquote>
foo
: </blockquote>

I haven't seen anything like this happen prior to the resolution of bug #64901, so I wonder if it may be a regression.
No, it is not a regression. It doesn't roundtrip in edit mode. See output (I've removed a serializer warning to remove clutter). But, selser will preserve it in most cases except probably where the line containing the end blocktag is edited.

----------
[subbu@earth lib] echo ":<blockquote>\nfoo\n:</blockquote>" | node parse --wt2wt --rtTestMode true
:<blockquote>
foo
:</blockquote>
[subbu@earth lib] echo ":<blockquote>\nfoo\n:</blockquote>" | node parse --wt2wt
:<blockquote>
foo
:
----------

In this case, the wikitext is badly nested: the opening and closing blockquote tags are treated as nested in separate dl-dd lists, so the treebuilder closes the opening tag automatically within the first list and strips the closing tag from the second list. Parsoid recovers this information about fixups from the DOM and records it as fixup information. In normal wt2wt mode, that information is not used, since the DOM could have been edited in the meantime (to fix the errors, for example). So, the behavior is as expected.
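The fixup behavior described above can be illustrated with a toy model. This is only a sketch, not Parsoid's or the HTML treebuilder's actual code: `build_scope`, `build`, and the event tuples are hypothetical names, and the sketch assumes each list item is an isolated scope in which tags must match.

```python
def build_scope(tokens):
    """Match tags inside a single scope (for example, one list item).

    Returns a flat list of (kind, value, fixup) events. fixup is True
    for anything the builder had to invent or discard.
    """
    events, open_tags = [], []
    for tok in tokens:
        if tok.startswith("</"):
            name = tok[2:-1]
            if open_tags and open_tags[-1] == name:
                open_tags.pop()
                events.append(("end", name, False))
            else:
                # No matching open tag in this scope: strip the tag,
                # but remember its source text.
                events.append(("stripped", tok, True))
        elif tok.startswith("<"):
            name = tok[1:-1]
            open_tags.append(name)
            events.append(("start", name, False))
        else:
            events.append(("text", tok, False))
    # Anything still open when the scope ends is closed automatically.
    while open_tags:
        events.append(("end", open_tags.pop(), True))
    return events


def build(scopes):
    """Build each scope independently; tags never match across scopes."""
    return [build_scope(tokens) for tokens in scopes]


# ":<blockquote>", "foo", ":</blockquote>" modeled as three separate scopes.
result = build([["<blockquote>"], ["foo"], ["</blockquote>"]])
for scope in result:
    print(scope)
```

In this model the first scope ends with an automatically inserted end tag and the third scope contains only a stripped tag, which mirrors the two fixups described in the comment. Putting all three tokens in one scope yields a normally matched pair with no fixups.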
(In reply to ssastry from comment #1)
> (I've removed a serializer warning to remove clutter). But, selser will
> preserve it in most cases except probably where the line containing the end
> blocktag is edited.

In that case, I'm reopening this as an issue with selser, because it isn't handling this correctly. Here's the edit where this was actually a problem:

https://en.wikipedia.org/w/index.php?title=Talk:Neil_deGrasse_Tyson&diff=627662324&oldid=627660338

The blockquote that was removed was 1000 lines away from the actual changes.
$ echo ":<blockquote>\na\n:</blockquote>\nb" | node parse --rtTestMode true
...
<dl data-parsoid='{"dsr":[0,13,0,0]}'><dd data-parsoid='{"dsr":[0,13,1,0]}'><blockquote data-parsoid='{"stx":"html","autoInsertedEnd":true,"dsr":[1,13,12,0]}'></blockquote></dd></dl>
<p data-parsoid='{"dsr":[14,15,0,0]}'>a</p>
<dl data-parsoid='{"dsr":[16,17,0,0]}'><dd data-parsoid='{"dsr":[16,17,1,0]}'></dd></dl><meta typeof="mw:Placeholder/StrippedTag" data-parsoid='{"src":"</blockquote>","name":"BLOCKQUOTE","dsr":[17,30,null,null]}'/>
<p data-parsoid='{"dsr":[31,32,0,0]}'>b</p>
...

So, the dsr info from the stripped tag is lost. Perhaps we should wrap these stripped tags in a placeholder span so that the span carries the dsr information in edit mode. In an editor, it will show up as an empty span (I'm not sure what the editing implications of this solution are). But for the editing issue that needs resolution here, this should be a straightforward fix in the markTreeBuilderFixups pass.

I am marking this low priority since this is not a big issue as far as I can tell. Feel free to bump up the priority if there are other scenarios where this might be a problem.
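To make the placeholder-span idea concrete, here is a minimal sketch of why a node that carries dsr lets a serializer reuse the original source for that range. It is a deliberately simplified model and not selser's real algorithm: `selective_serialize`, the `sep` and `fresh` fields, and the assumption that source is reused only for ranges covered by an unmodified node are all hypothetical. The dsr offsets are the ones from the DOM dump above.

```python
def selective_serialize(src, nodes):
    """Reuse src[start:end] for unmodified nodes; otherwise emit the
    node's freshly serialized text. 'sep' is the separator emitted
    after the node (an assumption of this toy model)."""
    out = []
    for node in nodes:
        if node.get("modified"):
            out.append(node["fresh"])
        else:
            start, end = node["dsr"]
            out.append(src[start:end])
        out.append(node.get("sep", ""))
    return "".join(out)


src = ":<blockquote>\na\n:</blockquote>\nb"

# No node covers offsets 17..30, so the stripped tag's source is unreachable.
without_placeholder = [
    {"dsr": (0, 13), "sep": "\n"},   # first <dl>
    {"dsr": (14, 15), "sep": "\n"},  # <p>a</p>
    {"dsr": (16, 17), "sep": "\n"},  # second <dl>
    {"dsr": (31, 32)},               # <p>b</p>
]

# A hypothetical placeholder node carries the dsr of the stripped tag.
with_placeholder = [
    {"dsr": (0, 13), "sep": "\n"},
    {"dsr": (14, 15), "sep": "\n"},
    {"dsr": (16, 17)},
    {"dsr": (17, 30), "sep": "\n"},  # placeholder for </blockquote>
    {"dsr": (31, 32)},
]

lossy = selective_serialize(src, without_placeholder)
preserved = selective_serialize(src, with_placeholder)
print(repr(lossy))
print(repr(preserved))
```

Under these assumptions the first node list drops the closing tag, while the second reproduces the source exactly, because the placeholder gives the serializer a range to copy from.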