Last modified: 2014-05-23 22:23:40 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T56946, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 54946 - Unhandled <pre> tokenizing scenarios in tokenizer
Status: RESOLVED FIXED
Product: Parsoid
Classification: Unclassified
Component: tokenizer
Version: unspecified
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Gabriel Wicke
 
Reported: 2013-10-03 23:06 UTC by ssastry
Modified: 2014-05-23 22:23 UTC

Description ssastry 2013-10-03 23:06:42 UTC
See the test case below. For some reason, the <pre> inside the <blockquote> together with the p-tag before the blockquote (all of those conditions are necessary to reproduce the bug) causes the content after the blockquote not to be wrapped in p-tags. See the output below. This is probably an edge case in the paragraph-wrapping code.

Based on bug report here:
https://en.wikipedia.org/w/index.php?title=Wikipedia:VisualEditor/Feedback&oldid=575631753#VE_removing_paragraph_gaps

[subbu@earth lib] cat /tmp/x
a

<blockquote><pre>
b
</pre></blockquote>

c

d
[subbu@earth lib] node parse < /tmp/x
<body data-parsoid='{"dsr":[0,49,0,0]}'><p data-parsoid='{"dsr":[0,1,0,0]}'>a</p>

<blockquote data-parsoid='{"stx":"html","dsr":[3,42,12,13]}'><pre data-parsoid='{"stx":"html","autoInsertedEnd":true,"strippedNL":"\n","dsr":[15,29,5,0]}'>

b
&lt;/pre&gt;</pre></blockquote>

c

d
</body>
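
The dsr values in the output above can be checked against the input by hand. The sketch below is a minimal illustration, assuming each dsr array reads as [start offset, end offset, opening-tag width, closing-tag width] in the wikitext source; that reading is an assumption, not something stated in this report. The helper name source_slices is hypothetical and is not part of Parsoid.

```python
# Contents of /tmp/x from the report, assuming the file ends with a newline
# (this matches the body dsr end offset of 49).
SOURCE = "a\n\n<blockquote><pre>\nb\n</pre></blockquote>\n\nc\n\nd\n"


def source_slices(source, dsr):
    """Split the source range named by a dsr array into (open, inner, close).

    Assumes dsr = [start, end, open_width, close_width]; this layout is an
    interpretation of the data-parsoid output, not taken from the report.
    """
    start, end, open_width, close_width = dsr
    return (
        source[start:start + open_width],
        source[start + open_width:end - close_width],
        source[end - close_width:end],
    )


# dsr values copied from the output above.
blockquote_parts = source_slices(SOURCE, [3, 42, 12, 13])
pre_parts = source_slices(SOURCE, [15, 29, 5, 0])
```

Under that assumption, the blockquote range splits cleanly into its opening tag, its content, and its closing tag, while the pre range has a closing width of 0 and its inner slice ends with the literal text </pre>. That is consistent with the end tag having been consumed as content and the real end tag being auto-inserted.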
Comment 1 ssastry 2013-10-03 23:10:45 UTC
Changing that to "<blockquote><pre> b </pre></blockquote>" does not trigger the bug. So a p-tag before the blockquote, plus an HTML <pre> inside the blockquote with its content starting on a new line, puts the p-wrapper in a state where p-tags are not added.
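
The missing p-wrapping can be detected mechanically when comparing variants of the test case. The following is a rough sketch using Python's standard html.parser, not Parsoid code; the class and function names are hypothetical, and the sample strings are abbreviated versions of the parser output with the data-parsoid attributes dropped.

```python
from html.parser import HTMLParser

# Elements that never get an end tag, so they must not stay on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class BodyTextFinder(HTMLParser):
    """Collect non-whitespace text that sits directly under <body>."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open elements
        self.unwrapped = []  # text chunks found directly inside <body>

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop up to and including the matching open tag, if there is one.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if self.stack == ["body"] and data.strip():
            self.unwrapped.append(data.strip())


def unwrapped_body_text(html):
    finder = BodyTextFinder()
    finder.feed(html)
    finder.close()
    return finder.unwrapped


# Abbreviated output as reported, and the output one would expect instead.
broken = ("<body><p>a</p>\n\n<blockquote><pre>\n\nb\n&lt;/pre&gt;</pre>"
          "</blockquote>\n\nc\n\nd\n</body>")
expected = ("<body><p>a</p>\n\n<blockquote><pre>\nb\n</pre></blockquote>"
            "\n\n<p>c</p>\n\n<p>d</p>\n</body>")
```

For the reported output, unwrapped_body_text(broken) returns one chunk containing the bare "c" and "d" lines, while unwrapped_body_text(expected) returns an empty list.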
Comment 2 ssastry 2013-10-04 16:34:59 UTC
This is actually a tokenizer bug. The closing </pre> is not being recognized as an end-tag when an HTML <pre> follows another literal HTML tag on the same line.

Relevant snippet of output for: "a\n\n<span><pre>\nb\n</pre></span>"
...
<span data-parsoid='{"stx":"html","dsr":[3,28,5,6]}'><pre data-parsoid='{"stx":"html","autoInsertedEnd":true,"strippedNL":"\n","dsr":[8,22,5,0]}'>

b
&lt;/pre&gt;</pre></span>
...
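
The symptom in this snippet, an entity-escaped </pre> sitting directly in front of the auto-inserted real end tag, is easy to search for in parser output. Below is a heuristic sketch only; the function name is hypothetical, and a match is not proof of the bug, since a page can legitimately contain an escaped </pre> as text.

```python
import re

# An escaped "</pre>" followed (optionally after whitespace) by a real </pre>.
SWALLOWED_PRE_END = re.compile(r"&lt;/pre&gt;\s*</pre>", re.IGNORECASE)


def has_swallowed_pre_end(html):
    """Return True if the output looks like a </pre> was tokenized as text."""
    return SWALLOWED_PRE_END.search(html) is not None
```

A check like this could be run over the output for each variant of the test case to see which tag combinations trigger the tokenizer problem.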
Comment 3 Gerrit Notification Bot 2013-10-04 22:00:24 UTC
Change 87632 had a related patch set uploaded by Subramanya Sastry:
(Bug 54946) Fix unhandle <pre> tokenizing scenarios

https://gerrit.wikimedia.org/r/87632
Comment 4 Gerrit Notification Bot 2013-10-29 00:42:13 UTC
Change 92469 had a related patch set uploaded by GWicke:
WIP Bug 54946: Alternative solution for <pre> tokenization

https://gerrit.wikimedia.org/r/92469
Comment 5 Gerrit Notification Bot 2013-11-01 00:34:40 UTC
Change 92469 merged by jenkins-bot:
Bug 54946: Alternative solution for <pre> tokenization

https://gerrit.wikimedia.org/r/92469
Comment 6 Gerrit Notification Bot 2014-05-23 22:23:40 UTC
Change 87632 abandoned by Subramanya Sastry:
(Bug 54946) Fixed unhandled <pre> tokenizing scenarios

Reason:
Old and rusty and I am not going to look at this as I originally thought.

https://gerrit.wikimedia.org/r/87632
