Last modified: 2014-09-30 16:05:36 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T73465, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 71465 - DSR information from stripped tag is lost when it is a direct child of <body>
Status: REOPENED
Product: Parsoid
Classification: Unclassified
Component: serializer
Version: unspecified
Hardware: All All
Importance: Low normal
Target Milestone: ---
Assigned To: Parsoid Team
Depends on:
Blocks:
Reported: 2014-09-30 14:22 UTC by Jackmcbarn
Modified: 2014-09-30 16:05 UTC (History)

Description Jackmcbarn 2014-09-30 14:22:36 UTC
In the following wikitext, a Parsoid round-trip results in the closing blockquote tag being lost:

: <blockquote>
foo
: </blockquote>

I haven't seen anything like this happen prior to the resolution of bug #64901, so I wonder if it may be a regression.
Comment 1 ssastry 2014-09-30 15:40:14 UTC
No, it is not a regression. It doesn't roundtrip in edit mode. See output (I've removed a serializer warning to remove clutter). But, selser will preserve it in most cases except probably where the line containing the end blocktag is edited.

----------
[subbu@earth lib] echo ":<blockquote>\nfoo\n:</blockquote>" | node parse --wt2wt --rtTestMode true
:<blockquote>
foo
:</blockquote>
[subbu@earth lib] echo ":<blockquote>\nfoo\n:</blockquote>" | node parse --wt2wt
:<blockquote>
foo
:
----------

In this case, the wikitext is badly nested: the opening and closing blockquote tags are treated as belonging to two separate dl-dd lists. The treebuilder automatically closes the opening tag within the first list and strips the closing tag from the second list. Parsoid detects these fixups from the DOM and records them as fixup information. In normal wt2wt mode, that information is used, because the DOM could have been edited in the meantime (for example, to fix the errors).
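
To make the two fixups concrete, here is a minimal toy model in Python. It is not the HTML5 tree builder and not Parsoid code: it simply assumes that every line starting with ":" is its own list-item scope and that tags cannot span scopes. The function name `toy_fixups` and the event labels are illustrative only (the labels borrow the `autoInsertedEnd` and "stripped" terms used in this report).

```python
import re

# Matches simple tags like <blockquote> or </blockquote>; no attributes.
TAG = re.compile(r"<(/?)(\w+)>")

def toy_fixups(wikitext):
    """Toy model (an assumption, not Parsoid's algorithm): each ':' line
    is a separate list-item scope, and a tag cannot cross scope borders."""
    events = []
    body_stack = []  # tags opened outside of list items
    for line in wikitext.split("\n"):
        in_list_item = line.startswith(":")
        # A list item starts with a fresh stack: it cannot see outer tags.
        stack = [] if in_list_item else body_stack
        for closing, name in TAG.findall(line):
            if not closing:
                stack.append(name)
            elif stack and stack[-1] == name:
                stack.pop()
            else:
                # No matching open tag in this scope: the close is dropped.
                events.append(("stripped", name))
        if in_list_item:
            # Leaving the list item: anything still open is auto-closed.
            events.extend(("autoInsertedEnd", n) for n in reversed(stack))
    return events

print(toy_fixups(":<blockquote>\nfoo\n:</blockquote>"))
# [('autoInsertedEnd', 'blockquote'), ('stripped', 'blockquote')]
```

Under these assumptions the well-nested input `<blockquote>\nfoo\n</blockquote>` produces no events, while the input from this report produces one auto-inserted end tag and one stripped end tag.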

So, the behavior is as expected.
Comment 2 Jackmcbarn 2014-09-30 15:45:30 UTC
(In reply to ssastry from comment #1)
> (I've removed a serializer warning to remove clutter). But, selser will
> preserve it in most cases except probably where the line containing the end
> blocktag is edited.

In that case, reopening as an issue with selser, because it's not handling this right then. Here's the edit where this was actually a problem:
https://en.wikipedia.org/w/index.php?title=Talk:Neil_deGrasse_Tyson&diff=627662324&oldid=627660338

The blockquote that was removed was 1000 lines away from the actual changes.
Comment 3 ssastry 2014-09-30 16:05:36 UTC
$ echo ":<blockquote>\na\n:</blockquote>\nb" | node parse --rtTestMode true
...
<dl data-parsoid='{"dsr":[0,13,0,0]}'><dd data-parsoid='{"dsr":[0,13,1,0]}'><blockquote data-parsoid='{"stx":"html","autoInsertedEnd":true,"dsr":[1,13,12,0]}'></blockquote></dd></dl>
<p data-parsoid='{"dsr":[14,15,0,0]}'>a</p>
<dl data-parsoid='{"dsr":[16,17,0,0]}'><dd data-parsoid='{"dsr":[16,17,1,0]}'></dd></dl><meta typeof="mw:Placeholder/StrippedTag" data-parsoid='{"src":"&lt;/blockquote>","name":"BLOCKQUOTE","dsr":[17,30,null,null]}'/>
<p data-parsoid='{"dsr":[31,32,0,0]}'>b</p>
...

So, the dsr info from the stripped tag is lost. Perhaps we should wrap these stripped tags in a placeholder span so that it carries the dsr information in edit mode. In an editor, it will show up as an empty span, and I am not sure what the editing implications of this solution are. But for the editing issue that needs resolution here, this should be a straightforward fix in the markTreeBuilderFixups pass.
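
To illustrate why the dsr values matter, the following Python sketch reads the first two dsr entries as start and end offsets into the original wikitext, which is consistent with the output above. The helper names (`dsr_slice`, `reuse_or_serialize`) and the `modified` flag are hypothetical and are not Parsoid APIs. The second helper only sketches the selser idea discussed in this report: copy the original source for an unmodified node with usable offsets, otherwise fall back to regular serialization.

```python
# The wikitext piped into the parser in comment 3, with real newlines.
SOURCE = ":<blockquote>\na\n:</blockquote>\nb"

def dsr_slice(src, dsr):
    """Return the source text covered by dsr[0]..dsr[1] (end exclusive)."""
    start, end = dsr[0], dsr[1]
    return src[start:end]

def reuse_or_serialize(src, node, serialize):
    """Selser-style choice (a sketch): reuse the original source when the
    node is unmodified and has usable offsets, else serialize it anew."""
    dsr = node.get("dsr")
    usable = bool(dsr) and dsr[0] is not None and dsr[1] is not None
    if usable and not node.get("modified", False):
        return dsr_slice(src, dsr)
    return serialize(node)

print(dsr_slice(SOURCE, [1, 13, 12, 0]))         # <blockquote>
print(dsr_slice(SOURCE, [17, 30, None, None]))   # </blockquote>

# With dsr available, the stripped tag's source can be copied back verbatim.
stripped = {"name": "BLOCKQUOTE", "dsr": [17, 30, None, None]}
print(reuse_or_serialize(SOURCE, stripped, lambda n: ""))  # </blockquote>

# Without dsr, only the fallback serializer is left, and the tag is dropped.
print(repr(reuse_or_serialize(SOURCE, {"name": "BLOCKQUOTE"}, lambda n: "")))  # ''
```

The last two calls show the difference a surviving dsr makes: the same stripped tag is either restored from the source or silently emitted as nothing.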

I am marking this low priority since this is not a big issue as far as I can tell. Feel free to bump up the priority if there are other scenarios where this might be a problem.
