Last modified: 2014-09-26 10:14:09 UTC
The API runs the following query: SELECT /* ApiQueryCategoryMembers::run */ cl_from,cl_sortkey,cl_type,page_namespace,page_title,cl_timestamp FROM page,categorylinks FORCE INDEX (cl_timestamp) WHERE cl_to = 'Copy_to_Wikimedia_Commons_(bot-assessed)' AND (cl_from=page_id) ORDER BY cl_timestamp,cl_from LIMIT 501; CREATE TABLE categorylinks ( cl_from int(8) unsigned NOT NULL DEFAULT '0', cl_to varbinary(255) NOT NULL DEFAULT '', cl_sortkey varbinary(230) NOT NULL DEFAULT '', cl_timestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, cl_sortkey_prefix varbinary(255) NOT NULL DEFAULT '', cl_collation varbinary(32) NOT NULL DEFAULT '', cl_type enum('page','subcat','file') NOT NULL DEFAULT 'page', UNIQUE KEY cl_from (cl_from,cl_to), KEY cl_timestamp (cl_to,cl_timestamp), KEY cl_collation (cl_collation), KEY cl_sortkey (cl_to,cl_type,cl_sortkey,cl_from) ) ENGINE=InnoDB DEFAULT CHARSET=binary The forced cl_timestamp index is expected to utilize InnoDB extended indexes (where secondary indexes also store the clustered primary key fields) to avoid a filesort. Enwiki slaves up to MariaDB 5.5.34 do this correctly. *************************** 1. row *************************** id: 1 select_type: SIMPLE table: categorylinks type: ref possible_keys: cl_timestamp key: cl_timestamp key_len: 257 ref: const rows: 552330 Extra: Using index condition; Using where *************************** 2. row *************************** id: 1 select_type: SIMPLE table: page type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: enwiki.categorylinks.cl_from rows: 1 Extra: 2 rows in set (0.38 sec) Slaves running 5.5.36+ revert to using a filesort despite apparently having identical categorylinks tables. Handler% stats suport the observation. *************************** 1. row *************************** id: 1 select_type: SIMPLE table: categorylinks type: ref possible_keys: cl_from,cl_timestamp,cl_sortkey key: cl_sortkey key_len: 257 ref: const rows: 466736 Extra: Using index condition; Using where; Using filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: page type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: enwiki.categorylinks.cl_from rows: 1 Extra: 2 rows in set (0.32 sec) Important to note that the first enwiki 5.5.36 slave, from which others were cloned, did have data dumped and reloaded rather than being upgraded in place, so categorylinks was definitely recreated. Possibly something has changed with how the first UNIQUE index can be chosen as primary key, and the 6-byte rowid is in use instead? Simply rebuilding the table has no effect: ALTER TABLE categorylinks ENGINE=InnoDB; But making the primary key explicit restores the expected behavior: ALTER TABLE categorylinks DROP INDEX cl_from, ADD PRIMARY KEY (cl_from, cl_to); Need to investigate what caused the change in behavior, and whether this affects any other mediawiki tables.
Wrong EXPLAIN output showing the filesort on 5.5.36+ slaves. Correct one: *************************** 1. row *************************** id: 1 select_type: SIMPLE table: categorylinks type: ref possible_keys: cl_timestamp key: cl_timestamp key_len: 257 ref: const rows: 551942 Extra: Using index condition; Using where; Using filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: page type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: enwiki.categorylinks.cl_from rows: 1 Extra: 2 rows in set (0.30 sec)
Seeking upstream input: https://mariadb.atlassian.net/browse/MDEV-6794