Last modified: 2013-10-10 12:54:14 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T57526, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 55526 - $parser->recursiveTagParse($input) does not escape & consistently.
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: 1.19.7
Hardware: All Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
 
Reported: 2013-10-09 21:21 UTC by Philipp Spitzer
Modified: 2013-10-10 12:54 UTC (History)
0 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
simple tag extension demonstrating the issue. (978 bytes, application/x-httpd-php)
2013-10-09 21:21 UTC, Philipp Spitzer

Description Philipp Spitzer 2013-10-09 21:21:38 UTC
Created attachment 13461 [details]
simple tag extension demonstrating the issue.

When using $parser->recursiveTagParse($input), the & character is sometimes escaped and sometimes not:

$input = 'a & b < a | b http://example.com?a&b';
$output = $parser->recursiveTagParse($input);

Here, $output gets
'a & b &lt; a | b <a rel="nofollow" class="external free" href="http://example.com?a&amp;b">http://example.com?a&amp;b</a>'

The first & is not escaped, while the & in the URL is. < and > are escaped. The source documentation of Parser::recursiveTagParse does not mention the expected behavior. However, it should either escape all & characters or none of them.

[As I have to parse the output of recursiveTagParse into a DOMDocument in my extension, this behavior makes my life real hard as I cannot just escape all & before parsing]

I attached a mini tag extension that can be used to analyze the problem further. Note that in a tag extension, all unescaped & in the returned string are converted to &amp; by MediaWiki at a later stage, so correct HTML is produced even if someone returns the output of $parser->recursiveTagParse($some_wikitext) directly. This "hides" the bug but doesn't solve the underlying problem.
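
For anyone hitting the same problem, one possible workaround is to normalize the string before handing it to DOMDocument: escape only those & characters that do not already begin a character reference. The following is a minimal sketch, not part of MediaWiki; the function names escapeBareAmpersands and loadFragment are hypothetical, and the sketch assumes that any existing "&name;" or "&#123;" sequence in the output is an intentional entity.

```php
<?php
// Hypothetical helper (not a MediaWiki API): escape only those "&"
// characters that do not already start a character reference.
function escapeBareAmpersands(string $html): string
{
    // Negative lookahead: leave "&name;", "&#123;" and "&#x1F;" untouched.
    return preg_replace(
        '/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)/',
        '&amp;',
        $html
    );
}

// Hypothetical helper: wrap the normalized fragment in a root element and
// load it as XML. Requires PHP's DOM extension.
function loadFragment(string $html): DOMDocument
{
    $doc = new DOMDocument();
    $doc->loadXML('<root>' . escapeBareAmpersands($html) . '</root>');
    return $doc;
}
```

This is a heuristic: it cannot tell a deliberately escaped entity from literal text that merely looks like one. Named entities other than the five predefined in XML (for example &nbsp;) would also still need handling before loadXML accepts the fragment.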


Thanks for developing MediaWiki by the way. It's great!
Philipp
Comment 1 Philipp Spitzer 2013-10-09 21:26:17 UTC
(The comment function of Bugzilla destroyed the output line. Please take a look at the attached file - it's really short, I promise.)
