Last modified: 2013-12-05 19:31:06 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T60043, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 58043 - PHP eats vertical pipes inside tags, Parsoid doesn't
Status: NEW
Product: Parsoid
Classification: Unclassified
Component: General
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Gabriel Wicke
Reported: 2013-12-05 18:17 UTC by C. Scott Ananian
Modified: 2013-12-05 19:31 UTC (History)
Description C. Scott Ananian 2013-12-05 18:17:37 UTC
Parsoid:

echo "{{گفتاورد بزرگ}}" | tests/parse.js --prefix=fawiki

emits:

<body data-parsoid='{"dsr":[0,17,0,0]}'><p about="#mwt1" typeof="mw:Transclusion" data-mw='{"parts":[{"template":{"target":{"wt":"گفتاورد بزرگ","href":"./الگو:گفتاورد_بزرگ"},"params":{},"i":0}}]}' data-parsoid='{"dsr":[0,16,null,null],"pi":[[]]}'>&lt;blockquote|>
</p>
</body>

The &lt;blockquote|> business is totally bogus.  It's not present in the output of the PHP parser as far as I can tell.

This is an example from https://fa.wikipedia.org/wiki/%D8%A2%D8%B1%D8%A7%D9%85%DA%AF%D8%A7%D9%87_%DA%A9%D9%88%D8%B1%D9%88%D8%B4_%D8%A8%D8%B2%D8%B1%DA%AF#cite_ref-22
and you can see the bogus 'blockquote' stuff in

http://parsoid-lb.eqiad.wikimedia.org/fawiki/%D8%A2%D8%B1%D8%A7%D9%85%DA%AF%D8%A7%D9%87_%DA%A9%D9%88%D8%B1%D9%88%D8%B4_%D8%A8%D8%B2%D8%B1%DA%AF?oldid=11247123#cite_ref-22-0
Comment 1 C. Scott Ananian 2013-12-05 18:39:02 UTC
$ echo '<blockquote|>a</blockquote>' | php maintenance/parse.php 
<blockquote>a</blockquote>
$ echo '<blockquote|>a</blockquote>' | tests/parse.js
<body data-parsoid='{"dsr":[0,28,0,0]}'>&lt;blockquote|>a
</body>
Comment 2 Gabriel Wicke 2013-12-05 19:31:06 UTC
We could try to make our generic_newline_attribute production slightly more tolerant of broken wikitext like this. If we can achieve this with limited effort and without breaking the parsing of other content then it would be a good thing to do.

It would also not hurt to fix up the wikitext, possibly with help from our end (bug 46705).
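
The tolerance suggested above can be illustrated outside the tokenizer. The following is only a sketch and not Parsoid code: `stripStrayPipes` is a hypothetical helper, and the assumption that stray pipes in the attribute area of a tag can simply be dropped is inferred from the PHP output in Comment 1, not from the PHP parser's source. It leaves pipes inside quoted attribute values, templates and plain text alone.

```javascript
// Hypothetical helper (not part of Parsoid): removes stray "|" characters
// from the attribute area of an opening HTML-like tag, so that
// "<blockquote|>" is treated like "<blockquote>".
function stripStrayPipes(wikitext) {
  // Opening tag: "<", a tag name, an attribute area without "<" or ">", then ">".
  const openTag = /<([a-zA-Z][a-zA-Z0-9]*)([^<>]*)>/g;
  return wikitext.replace(openTag, (match, name, attrs) => {
    // Tags without any pipe are returned unchanged.
    if (!attrs.includes('|')) {
      return match;
    }
    // Quoted values are matched first and kept verbatim;
    // any pipe outside of quotes is dropped.
    const cleaned = attrs.replace(/"[^"]*"|'[^']*'|\|/g, (piece) =>
      piece === '|' ? '' : piece
    );
    return '<' + name + cleaned + '>';
  });
}

console.log(stripStrayPipes('<blockquote|>a</blockquote>'));
```

Run on the input from Comment 1, this yields the same markup that maintenance/parse.php printed there. A real fix would more likely live in the grammar production itself, and would need checking against table syntax, where a pipe after a tag can be significant.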
