Last modified: 2013-04-22 14:51:46 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T49437, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 47437 - ResourceLoader: Implement support for enhanced minification (e.g. support UglifyJS)
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: ResourceLoader
Version: 1.22.0
Hardware: All All
Importance: Low enhancement
Target Milestone: Future release
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2013-04-19 23:31 UTC by Krinkle
Modified: 2013-04-22 14:51 UTC (History)

Description Krinkle 2013-04-19 23:31:26 UTC
Right now we use a very basic but fast minifier. It has to perform very well due to the way we do on-demand package generation[1] whilst having a very high cache hit ratio.

Though this is nice, it drastically limits our options and ability to implement additional features.

Three features in particular:

* Implementing source maps[2] for easier debugging. At the moment, with our basic minification, enabling "Prettification" in Chrome Dev Tools makes the debugging experience "Okay" to deal with, but everything is still squashed into one file (it doesn't map back to the original file names). Once we do more advanced minification, this becomes even more important.

* Conditional code / stripping blocks. One of the things more sophisticated minifiers are capable of is stripping dead code. Aside from the obvious rare case of consistently unreachable code (which should just be removed from the code base), this is useful for debugging purposes. See also bug 37763. Right now we have very few mw.log calls. I believe we avoid them because they take up space. They are a no-op in production mode (the log method is an empty function by default; in debug mode we load the actual module that populates the method), so the issue isn't that they would pollute the console in production, but that they add to the size of the JavaScript we ship. By wrapping them in something like `if (MW_DEBUG) { mw.log(...); }` we can have UglifyJS strip them in production and preserve them in debug mode, by predefining a global constant MW_DEBUG in UglifyJS set to false or true respectively.

* Better minification: renaming variables, optimising for gzip, rewriting statements in shorter notation, etc. [3]
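
To make the conditional-code idea concrete, here is a minimal sketch of the guard pattern. It assumes MW_DEBUG is declared by hand (in a real build the minifier would be told its value; UglifyJS2 documents a `global_defs` compressor option for that kind of constant), and `trimTitle`, `trimTitleStripped` and the `mw` stand-in are made up for illustration.

```javascript
// Assumption: in a real build the minifier would be told the value of
// MW_DEBUG; it is declared by hand here so the sketch runs on its own.
const MW_DEBUG = false;

// Stand-in for mw.log so the sketch runs outside MediaWiki. It records
// messages instead of writing to the console.
const logged = [];
const mw = { log: function (msg) { logged.push(msg); } };

// Source form: the log call is wrapped in a guard on the constant.
function trimTitle(title) {
    if (MW_DEBUG) {
        mw.log('trimTitle: input length ' + title.length);
    }
    return title.trim();
}

// Hand-written illustration of what a dead-code-eliminating minifier would
// be expected to leave (before renaming) when MW_DEBUG is false.
function trimTitleStripped(title) {
    return title.trim();
}
```

Both functions behave the same when MW_DEBUG is false. The difference is only in how many bytes are shipped, which is the point of the proposal.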

So that's all great, but the problem is that, though UglifyJS[4] (for example) is getting faster, it is still much too slow to run on many files at once on-demand from the web server.

Last February, when I was in San Francisco, Roan and I thought about how to approach this. I recall the following, though Roan might have a better version of it:

* We'd run the quick minifier on a cache miss to populate the cache quickly and respond to the request, then enqueue a job to run the advanced minifier asynchronously.
* The job queue would then run the elaborate minification process and replace the cache item. We don't have to worry about overwriting a new version with an old one, because the cache keys contain a hash of the raw contents. In the worst case we save something that won't be used.
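
The two steps above could be sketched roughly as follows. This is only an illustration under assumptions: the Map stands in for memcached, the array for the job queue, both minifiers are toy whitespace strippers, and every name (`cacheKey`, `getMinified`, `runNextJob`) is hypothetical.

```javascript
const crypto = require('crypto');

const cache = new Map(); // stand-in for memcached
const jobQueue = [];     // stand-in for the job queue

// Toy stand-ins for the two minifiers (not real minification).
function quickMinify(src) {
    return src.replace(/\s+/g, ' ').trim();
}
function advancedMinify(src) {
    return quickMinify(src).replace(/\s*([{}();=+,])\s*/g, '$1');
}

// The key contains a hash of the raw contents, so a changed file gets a
// different key and a late job can only write to the key of its own input.
function cacheKey(src) {
    return 'minify:' + crypto.createHash('sha1').update(src).digest('hex');
}

// On a cache miss: respond with the quick result and enqueue the slow job.
function getMinified(src) {
    const key = cacheKey(src);
    if (cache.has(key)) {
        return cache.get(key).code;
    }
    const code = quickMinify(src);
    cache.set(key, { code: code, level: 'quick' });
    jobQueue.push({ key: key, src: src });
    return code;
}

// Later, from the job runner: replace the cache item with the better output.
function runNextJob() {
    const job = jobQueue.shift();
    if (!job) {
        return false;
    }
    cache.set(job.key, { code: advancedMinify(job.src), level: 'advanced' });
    return true;
}
```

The first call to `getMinified()` for a given source returns the quick output. Calls made after `runNextJob()` has processed that item return the advanced output from the same key.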

There are two details in particular I'm not sure about:
* How do we deliver them to the client? We have unique urls with version timestamps. To trigger a purge we would have to either:
- Keep track of all urls in varnish that contain the module name and order a purge in varnish (after we update memcached, of course, so it'd be a quick roundtrip to Apache to compose a response from cached components).
- Or alternatively, cause a version bump in the module (touch() the files).

* The job queue: we can enqueue generic jobs that check everything, or enqueue a job per cache item. In either case we need to account for the enqueued job no longer being needed by the time it runs. With generic jobs, the first one to run should cancel any others in the queue; with module- or item-specific jobs, it should cancel any others for the same module or item.
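
For the per-item variant, one possible shape is sketched below. Note that it swaps in a slightly different technique from the one described: instead of cancelling queued duplicates later, it refuses to enqueue a second job for the same key, and it checks at run time whether the item is still needed. All names (`createMinifyQueue`, `isStillNeeded`, `runAdvanced`) are hypothetical.

```javascript
// isStillNeeded(key) and runAdvanced(job) are supplied by the caller; both
// are assumptions of this sketch, not existing MediaWiki functions.
function createMinifyQueue(isStillNeeded, runAdvanced) {
    // Map keeps insertion order, so it doubles as a FIFO queue that can be
    // looked up by cache key.
    const pending = new Map();

    return {
        // Returns false when a job for the same cache item is already queued.
        enqueue: function (key, src) {
            if (pending.has(key)) {
                return false;
            }
            pending.set(key, { key: key, src: src });
            return true;
        },
        // Runs the oldest job, unless it is no longer needed by now.
        runNext: function () {
            const first = pending.keys().next();
            if (first.done) {
                return 'empty';
            }
            const job = pending.get(first.value);
            pending.delete(first.value);
            if (!isStillNeeded(job.key)) {
                return 'skipped';
            }
            runAdvanced(job);
            return 'ran';
        },
        size: function () {
            return pending.size;
        }
    };
}
```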

And then there is the question of how to get the JavaScript code and nodejs deployed, and how to execute it from PHP. Installing nodejs on every apache and shelling out is probably not a good idea. Alternatively we could wrap it in a private service (like Parsoid): we'd set up a few instances in the bits cluster, and PHP would open a socket or HTTP request, POST or stream the input, and get the output back.
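
The service side of that idea might look something like the sketch below, using only the nodejs `http` module. It is a rough illustration: the minifier is a toy stand-in, and `handleMinify`, `createMinifyService` and the port in the trailing comment are all made up.

```javascript
const http = require('http');

// Toy stand-in for the advanced minifier.
function minify(src) {
    return src.replace(/\s+/g, ' ').trim();
}

// Request logic kept separate from the socket handling so it is easy to test.
function handleMinify(method, body) {
    if (method !== 'POST') {
        return { status: 405, body: 'POST JavaScript source to minify\n' };
    }
    return { status: 200, body: minify(body) };
}

// Builds the server but does not start listening.
function createMinifyService() {
    return http.createServer(function (req, res) {
        const chunks = [];
        req.on('data', function (chunk) {
            chunks.push(chunk);
        });
        req.on('end', function () {
            const input = Buffer.concat(chunks).toString('utf8');
            const result = handleMinify(req.method, input);
            res.writeHead(result.status, {
                'Content-Type': 'text/plain; charset=utf-8'
            });
            res.end(result.body);
        });
    });
}

// To try it locally (hypothetical port):
// createMinifyService().listen(8142, '127.0.0.1');
```

PHP would then POST the raw source to the service and store the response body in the cache.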


[1] https://www.mediawiki.org/wiki/ResourceLoader/Features#On-demand_package_generation
[2] http://www.html5rocks.com/en/tutorials/developertools/sourcemaps/ https://github.com/mozilla/source-map http://www.youtube.com/watch?v=HijZNR6kc9A
[3] https://github.com/mishoo/UglifyJS2#compressor-options
[4] https://github.com/mishoo/UglifyJS2
Comment 1 db [inactive,noenotif] 2013-04-20 11:44:50 UTC
For your first point, you can append debug=true to each url you want to debug, or on your own wiki you can set $wgResourceLoaderDebug to make debug=true the default. Then the minifier is not run and each JavaScript file is shipped separately.

For the other point, you wrote on other bugs:

 Marking bugs that suggest altering the token stream in JavaScriptMinifier as wontfix.

This enhancement also sounds like altering the token stream, just with a different (hopefully optional) technique.
Comment 2 Krinkle 2013-04-22 14:51:46 UTC
This change is in harmony with the design and requirements of ResourceLoader, where any compression and packaging is only an improvement. Everything still needs to have a raw mode, and raw mode should not introduce problems.

The default minifier will still be used. And some environments (those that can't install things on the server) will likely stick to just the default minifier.

The extra minification is an optional enhancement. Though "optional" isn't quite the right description in my opinion, as it won't be used instead of the default minifier but on top of it. It won't take the default minifier's output as input; it will run alongside it from an asynchronous job queue. So the first cache miss will be responded to with the (current) basic minifier.

The extra feature regarding conditional code (if MW_DEBUG: mw.log) degrades gracefully, as it is still valid JavaScript: the MW_DEBUG variable will simply be set to false, and mw.log already has a no-op dummy in non-debug mode.
