Last modified: 2014-05-15 02:37:24 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T67042, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 65042 - Mobile Table of Contents double unescapes encoded characters
Mobile Table of Contents double unescapes encoded characters
Status: RESOLVED FIXED
Product: MobileFrontend
Classification: Unclassified
beta (Other open bugs)
unspecified
All All
: Immediate blocker
: ---
Assigned To: security
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2014-05-08 21:31 UTC by Ryan Kaldari
Modified: 2014-05-15 02:37 UTC (History)
10 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
fix for section unescaping bug (703 bytes, patch)
2014-05-08 23:18 UTC, Ryan Kaldari
Details
better patch file (777 bytes, patch)
2014-05-08 23:54 UTC, Ryan Kaldari
Details
get rid of lineText completely, fix transformSections (5.66 KB, application/octet-stream)
2014-05-12 20:18 UTC, Ryan Kaldari
Details
again as a patch (5.66 KB, patch)
2014-05-12 20:20 UTC, Ryan Kaldari
Details

Description Ryan Kaldari 2014-05-08 21:31:26 UTC
Compare these two TOCs:
http://en.wikipedia.beta.wmflabs.org/wiki/Igloo
http://en.m.wikipedia.beta.wmflabs.org/wiki/Igloo (in beta mode)

The desktop one converts & to & (which is correct):
You & your mom & your dog & your shirt

The mobile one converts & to & (which is wrong):
You & your mom & your dog & your shirt
Comment 1 Bingle 2014-05-08 21:35:14 UTC
Prioritization and scheduling of this bug is tracked on Trello card https://trello.com/c/IIXEuTOZ
Comment 2 Brion Vibber 2014-05-08 22:24:18 UTC
I think it's these lines in PageApi.js transformSections():

  // Store text version so users of API do not have to worry about styling e.g. Table of Contents
  // FIXME: [API] should probably do this for us - we want to be able to control the styling of these headings - no inline styles!
  section.lineText = $( '<div>' ).html( section.line ).text();
 ...

  $( '<h' + section.level + '>' ).attr( 'id', section.anchor ).html( section.line )

Since this seems to be already plaintext, and not HTML, performing the conversion has the nice side effect of executing any <script> contained in the text, without leaving it in the DOM to be seen.
Comment 3 Ryan Kaldari 2014-05-08 22:41:28 UTC
Hmmm, even if I change that entire each loop to:
$.each( sections, function( i, section ) {
   section.lineText = '';
   section.children = [];
   result.push( section );
} );
It still executes the section title as HTML.
Comment 4 Brion Vibber 2014-05-08 22:45:59 UTC
(In reply to Ryan Kaldari from comment #3)
> Hmmm, even if I change that entire each loop to:
> $.each( sections, function( i, section ) {
>    section.lineText = '';
>    section.children = [];
>    result.push( section );
> } );
> It still executes the section title as HTML.

Hmm, I can confirm BUT it only executes it ONCE instead of twice, so it seems to be an improvement. :D

There must be another similar one hiding around somewhere...
Comment 5 Ryan Kaldari 2014-05-08 23:02:19 UTC
If I add...
section.line = '';
...to that loop, it eliminates the last execution. Of course that breaks a few features though.
Comment 6 Ryan Kaldari 2014-05-08 23:11:34 UTC
I have it fixed locally. Will upload diff in a sec...
Comment 7 Ryan Kaldari 2014-05-08 23:18:50 UTC
Created attachment 15321 [details]
fix for section unescaping bug

patch attached
Comment 8 Ryan Kaldari 2014-05-08 23:54:08 UTC
Created attachment 15322 [details]
better patch file
Comment 9 Krinkle 2014-05-09 10:05:05 UTC
mw.sections[].line is defined in ApiMobileView:

https://github.com/wikimedia/mediawiki-extensions-MobileFrontend/blob/a893faf37b78e6/includes/api/ApiMobileView.php#L402-L408

 $data['sections'] = $parserOutput->getSections();

Which comes from mediawiki-core Parser:

https://github.com/wikimedia/mediawiki-core/blob/b31c093855be5e0/includes/parser/Parser.php#L4469-L4490

Server-side code uses it as html (the parser claims it has been tripped and escaped where needed, it is considered html, not text).

For example, Linker::tocLine, embeds it raw:

https://github.com/wikimedia/mediawiki-core/blob/b31c093855be5e0/includes/Linker.php#L1657-L1665

 return ... '<span class="toctext">' . $tocline . '</span>' ..


Back to the MobileFrontend code, it should be used as text or as html. I think in this case it is supposed to be used as html. If that's unsafe, then we're likely vulnerable in mediawiki-core's html response as well and it should be fixed there instead.

Either way, .html( .. ).text() is a horrible hack that should not be used. If you use mw.html.escape, use it directly and only.

If you find yourself doing:
  var fooHtml = mw.html.escape(somethingText);
  $something.html(fooHtml);

Please don't. This is extra overhead in both escaping (string manipulation), and parsing (by the browser) for no good reason. Instead just pass the original value to $something.text(), which will instruct the browser to create a native string of text, bypassing the need to escape and parse as html altogether (remember, when in javascript, there is no such thing as HTML, you're interacting with the DOM directly. You wouldn't "append" a string of php syntax to a php array either, instead you'd set a property on the array or use array push, and that is safe. Same goes for text on an element, using .text() goes straight to where it should go without the need to escape or parse.
Comment 10 Ryan Kaldari 2014-05-12 20:18:51 UTC
Created attachment 15360 [details]
get rid of lineText completely, fix transformSections
Comment 11 Ryan Kaldari 2014-05-12 20:20:32 UTC
Created attachment 15361 [details]
again as a patch
Comment 12 Max Semenik 2014-05-14 23:30:12 UTC
https://gerrit.wikimedia.org/r/#/c/133382/

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links