Last modified: 2012-09-01 16:29:38 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T41700, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 39700 - A File: link to a non-existing image can inject html
A File: link to a non-existing image can inject html
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Parser (Other open bugs)
1.19.1
All All
: Unprioritized major (vote)
: 1.19.x release
Assigned To: Nobody - You can work on this!
:
: 39762 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-08-27 20:54 UTC by Chris Steipp
Modified: 2012-09-01 16:29 UTC (History)
7 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
run htmlspecialchars exactly once on $html in makeBrokenImageLinkObj (1.42 KB, patch)
2012-08-28 22:38 UTC, Chris Steipp
Details
Chris's patch plus some cleanup and tests (3.59 KB, patch)
2012-08-29 04:01 UTC, Tim Starling
Details
Backport to 1.19, which didn't call wfCgiToArray (3.38 KB, patch)
2012-08-29 21:57 UTC, Chris Steipp
Details
Backport to 1.18 (3.33 KB, patch)
2012-08-29 22:13 UTC, Chris Steipp
Details

Description Chris Steipp 2012-08-27 20:54:38 UTC
When a [[File:]] tag to an non-existent image is rendered, the comment is written into the dom without html escaping.

Reported by Writ Keeper:


Hi all,

I'm not sure if this is the right place to report this, but I figured
better safe than sorry.  I marked a page for deletion today, called '
Atelje "Meander" ', that had an embedded Youtube video in it (not
through File Upload or Commons), and I didn't think that this was
supposed to be possible.  It did this by putting an iframe tag in the
argument to a nonexistent file (File:Nikola Novaković na kontrabasu
Miše Blama), so that it read [[File:Nikola Novaković na kontrabasu
Miše Blama|<iframe src=(url) ...</iframe>]].  The page has been
deleted, but I reproduced the behavior (using an <img> tag instead of
an iframe) on my test account's sandbox, located here:
http://en.wikipedia.org/wiki/User:WK-test/sandbox .

Thanks,
Writ Keeper
Comment 1 Tim Starling 2012-08-27 22:30:30 UTC
I've confirmed that the bug is present at least as far back as 1.17. Sanitizer::stripAllTags() is documented as returning unsafe HTML since MW 1.10, due to calling self::decodeCharReferences(). It is called by Parser::stripAltText(), and the result is passed through to Linker::makeBrokenImageLinkObj() without further processing. Before r82843, the main case would not have been vulnerable, but the linkKnown() case and the <!--ERROR--> case would have been vulnerable before that time.
Comment 2 Chris Steipp 2012-08-28 01:27:55 UTC
So looking at the best fix for this, the comments for Linker::makeBrokenImageLinkObj say that they expect the $html string to be htmlescaped. That's the closest place to the output where that expectation seems to be documented, so probably should escape $fp['title'] just before passing it in makeImageLink?

The comments in Parser::stripAltText seems (to me) to indicate the author thought Sanitizer::stripAllTags would strip out html tags, so it would be good to find some way of making that more obvious.
Comment 3 Brion Vibber 2012-08-28 03:28:34 UTC
Sanitizer::stripAllTags does strip out HTML tags -- but it returns *PLAINTEXT* which may include the characters "<" and ">", which might be around things like "script" or "iframe".

Every time you treat plaintext as HTML, you create an HTML injection vector... Escape your output, kids. :)
Comment 4 Tim Starling 2012-08-28 07:03:13 UTC
(In reply to comment #2)
> So looking at the best fix for this, the comments for
> Linker::makeBrokenImageLinkObj say that they expect the $html string to be
> htmlescaped. That's the closest place to the output where that expectation
> seems to be documented, so probably should escape $fp['title'] just before
> passing it in makeImageLink?

Something like that. My main concern is to avoid reopening bug 27679 and to avoid breaking the test cases fixed in r68491. Most link texts are wikitext, and in fact we have a parser test enforcing that interpretation:

[[File:Denys Savchenko ''Pentecoste''.jpg]]

Goes to:

<p><a href="/index.php?title=Special:Upload&amp;wpDestFile=Denys_Savchenko_%27%27Pentecoste%27%27.jpg" class="new" title="File:Denys Savchenko &#39;&#39;Pentecoste&#39;&#39;.jpg">File:Denys Savchenko <i>Pentecoste</i>.jpg</a>

Note the actual italics in the link text. So it may not be as simple as just adding more escaping.

(In reply to comment #3)
> Every time you treat plaintext as HTML, you create an HTML injection vector...
> Escape your output, kids. :)

Thanks for those words of wisdom, old man.
Comment 5 Chris Steipp 2012-08-28 22:37:45 UTC
As Tim mentioned, it looks like r82843 (which closed bug 27679) may have contributed to the problem. It prevented running htmlspecialchars twice when $html == '' (Line 884), but doesn't run htmlspecialchars on the $html variable if it is passed into the function (and htmlspecialchars wasn't run in 2 other cases as well). I made one patch to run htmlspecialchars exactly once on $html for each of the code paths.

I also tried running tooltext in Parser::stripAltText through Sanitizer::removeHTMLtags, since the comment for makeBrokenImageLinkObj says that $html should already be escaped, and I'm not entirely sure if there was a reason for it. Both fixes produced the same output for all of my test input.

Both fixes also now allows a link like [[File:AB2.gif | A<B ]] to render as <a href="/wiki/index.php?title=Special:Upload&amp;wpDestFile=AB2.gif" class="new" title="File:AB2.gif">A&lt;B</a> instead of <a href="/wiki/Wikipedia:Upload?wpDestFile=AB2.gif" class="new" title="File:AB2.gif">A</a>, which seemed to me like a bug.

Is there a preference for running htmlspecialchars vs. removeHTMLtags?
Comment 6 Chris Steipp 2012-08-28 22:38:54 UTC
Created attachment 11025 [details]
run htmlspecialchars exactly once on $html in makeBrokenImageLinkObj
Comment 7 Tim Starling 2012-08-29 00:44:55 UTC
(In reply to comment #5)
> As Tim mentioned, it looks like r82843 (which closed bug 27679) may have
> contributed to the problem. It prevented running htmlspecialchars twice when
> $html == '' (Line 884), but doesn't run htmlspecialchars on the $html variable
> if it is passed into the function (and htmlspecialchars wasn't run in 2 other
> cases as well). I made one patch to run htmlspecialchars exactly once on $html
> for each of the code paths.

Thanks for that. It appears to work, but I still want to look at it more closely.

> I also tried running tooltext in Parser::stripAltText through
> Sanitizer::removeHTMLtags, since the comment for makeBrokenImageLinkObj says
> that $html should already be escaped, and I'm not entirely sure if there was a
> reason for it. Both fixes produced the same output for all of my test input.

The comment on makeBrokenImageLinkObj() was added by someone who didn't understand the code. The only reason for that comment is that escaping happened to be missing at the time the comment was added, since the escaping was removed shortly before by another author to fix a double-escaping bug. It doesn't say anything about what callers expect or whether escaping is a good idea.

> Is there a preference for running htmlspecialchars vs. removeHTMLtags?

Sanitizer::removeHTMLtags() is a parser helper that removes certain HTML tags and leaves others for further processing by subsequent parts of the parser. It's not usually called for other purposes.

Sanitizer::stripAllTags() converts HTML (or HTML-like half-transformed wikitext) to plain text for use in an attribute. It's assumed that whatever constructs the HTML will escape the attribute correctly. Linker::makeBrokenLinkObj() is an unusual case since text which was originally intended for the alt attribute is redirected into non-attribute HTML.

There's no point in running Sanitizer::removeHTMLtags() on the input to Parser::stripAltText(), since it has already been run. And it would be incorrect to run it on the output of Parser::stripAltText(), since that would prevent users from including special characters in alt tags, regardless of what escaping method they attempted to use:

> print Sanitizer::stripAllTags('&#60;script/&#62;');
<script/>
Comment 8 Tim Starling 2012-08-29 04:01:10 UTC
Created attachment 11032 [details]
Chris's patch plus some cleanup and tests

Chris's patch looked fine. I fixed the documentation regarding $html and the variable name. I also removed $prefix and $trail, since all callers have set them to empty strings since the Age of Legends, and leaving them in was unnecessarily confusing. I also added a parser test.

In case anyone cares: link trails were originally implemented in the link functions. Now we have link trail handling in LinkHolderArray::makeHolder() and link prefix handling in Parser::replaceInternalLinks2(), but Linker::splitTrail() was left where it was, and many Linker functions still take prefix/trail arguments.
Comment 9 Tim Starling 2012-08-29 09:27:08 UTC
*** Bug 39762 has been marked as a duplicate of this bug. ***
Comment 10 Tim Starling 2012-08-29 10:28:34 UTC
Niklas blamed r61816 (MW 1.16), which seems plausible. At the time, only the linkKnown() case would have been vulnerable, so it would have only affected wikis with $wgEnableUploads = $wgUploadNavigationUrl = false. I've confirmed the vulnerability in 1.16 with those conditions.

r82843 (MW 1.18) would have extended the vulnerability to wikis with upload enabled. The commit message indicates that the author was misled by the vulnerability from r61816 into thinking that there was no need to escape $text, and his test cases didn't fully explore that assumption.

The error case was probably never reachable.

MW 1.18 is still within our 12 month support window, so we will need releases of both 1.18 and 1.19.
Comment 11 Chris Steipp 2012-08-29 21:57:40 UTC
Created attachment 11038 [details]
Backport to 1.19, which didn't call wfCgiToArray
Comment 12 Chris Steipp 2012-08-29 22:13:13 UTC
Created attachment 11039 [details]
Backport to 1.18

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links