Last modified: 2006-02-22 21:24:24 UTC
Along with the old compression work I've been doing this weekend I made the following index changes on the small wikis to see how effective they are. cur: alter table cur add index qry_checktouched (cur_id, cur_is_redirect, cur_namespace, cur_title, cur_touched); This is potentially of use for these queries, at least: /* Article::checkTouched */ SELECT cur_touched,cur_is_redirect FROM `cur` WHERE cur_id='282905' LIMIT 1 /* Title::getLinksTo */ SELECT cur_namespace,cur_title,cur_id FROM `cur`,`links` WHERE l_from=cur_id AND l_to=72 /* wfShowIndirectLinks */ SELECT cur_id,cur_namespace,cur_title,cur_is_redirect FROM `links`,`cur` WHERE l_to=360618 AND l_from=cur_id LIMIT 500 /* LinkCache::preFill */ SELECT cur_id,cur_namespace,cur_title FROM `cur`,`links` WHERE cur_id=l_to AND l_from=315312 FOR UPDATE The primary cause for this is to try to get checkTouched using this index instead of loading cur records for unchanged cur pages whch don't realy need to be loaded. The queries are individually fastbutit's adding constant backgroundload to the master and reducing that load is good. Could also usefully free up cache RAM for other things. Title::getLinksTo and wfShowIndirectLinks sometimes show up as quite slow also. LinkCache::preFill can often be slow. None of those really needs the full cur record so it's nice to dodge loading with this covering index, if possible. The changed schema for cur in 1.5 will significantly reduce the potential benefit of this index and in 1.5 it may be best not to have this index and have the cur records loaded instead, because their total size is much smaller in 1.5 and caching most or all is practical. old: alter table old engine=innodb, drop index user_timestamp, add index user_timestamp (old_user, inverse_timestamp, old_namespace); This is for: EXPLAIN /* wfSpecialContributions */ SELECT old_namespace,old_title,old_timestamp,old_comment,old_minor_e dit,old_user_text,old_id FROM `old` USE INDEX (user_timestamp) WHERE old_user =22105 AND old_namespace = 0 ORDER BY inverse_timestamp LIMIT 51 ; The addition of the namespace lets the query avoid reading the unnecessary namespaces when only one is desired. That can significantly reduce thenumber of old records/ disk seeks required when only one namespace is needed. For 1.5 or 1.6 we should be looking for a way to have a covering index for this query (one with all fields it uses). It's impossible to have the physical data organisation match both article history and user contributions without one requiring lots of seeks. The covering index will make user contributions fast, at the cost of some disk space. The tinyblob comments field is one obstacle here - can't index on it all, only a subset. It may be best to have a varchar comment and tinyblob long comment with a more link whenever the comment is too long for a varchar. Note that 1000 or so is the maximum index length for InnoDB. It's too soon for either of these to be in MediaWiki and neither may be appropriate for 1.5 and its changed schema, so these are local tuning only for now.
I guess this is a non-issue now then? :)