Last modified: 2014-09-26 10:14:09 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T72558, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 70558 - categorylinks InnoDB extended indexes behavior differs between enwiki slaves
categorylinks InnoDB extended indexes behavior differs between enwiki slaves
Status: NEW
Product: MediaWiki
Classification: Unclassified
Database (Other open bugs)
1.24rc
All All
: High normal (vote)
: ---
Assigned To: Sean Pringle
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2014-09-08 13:55 UTC by Sean Pringle
Modified: 2014-09-26 10:14 UTC (History)
4 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Sean Pringle 2014-09-08 13:55:57 UTC
The API runs the following query:

SELECT /* ApiQueryCategoryMembers::run */  cl_from,cl_sortkey,cl_type,page_namespace,page_title,cl_timestamp  FROM page,categorylinks  FORCE INDEX (cl_timestamp) WHERE cl_to = 'Copy_to_Wikimedia_Commons_(bot-assessed)' AND (cl_from=page_id)  ORDER BY cl_timestamp,cl_from LIMIT 501;

CREATE TABLE categorylinks (
  cl_from int(8) unsigned NOT NULL DEFAULT '0',
  cl_to varbinary(255) NOT NULL DEFAULT '',
  cl_sortkey varbinary(230) NOT NULL DEFAULT '',
  cl_timestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  cl_sortkey_prefix varbinary(255) NOT NULL DEFAULT '',
  cl_collation varbinary(32) NOT NULL DEFAULT '',
  cl_type enum('page','subcat','file') NOT NULL DEFAULT 'page',
  UNIQUE KEY cl_from (cl_from,cl_to),
  KEY cl_timestamp (cl_to,cl_timestamp),
  KEY cl_collation (cl_collation),
  KEY cl_sortkey (cl_to,cl_type,cl_sortkey,cl_from)
) ENGINE=InnoDB DEFAULT CHARSET=binary

The forced cl_timestamp index is expected to utilize InnoDB extended indexes (where secondary indexes also store the clustered primary key fields) to avoid a filesort. Enwiki slaves up to MariaDB 5.5.34 do this correctly.

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: categorylinks
         type: ref
possible_keys: cl_timestamp
          key: cl_timestamp
      key_len: 257
          ref: const
         rows: 552330
        Extra: Using index condition; Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: page
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: enwiki.categorylinks.cl_from
         rows: 1
        Extra: 
2 rows in set (0.38 sec)

Slaves running 5.5.36+ revert to using a filesort despite apparently having identical categorylinks tables. Handler% stats suport the observation.

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: categorylinks
         type: ref
possible_keys: cl_from,cl_timestamp,cl_sortkey
          key: cl_sortkey
      key_len: 257
          ref: const
         rows: 466736
        Extra: Using index condition; Using where; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: page
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: enwiki.categorylinks.cl_from
         rows: 1
        Extra: 
2 rows in set (0.32 sec)

Important to note that the first enwiki 5.5.36 slave, from which others were cloned, did have data dumped and reloaded rather than being upgraded in place, so categorylinks was definitely recreated. Possibly something has changed with how the first UNIQUE index can be chosen as primary key, and the 6-byte rowid is in use instead?

Simply rebuilding the table has no effect:

ALTER TABLE categorylinks ENGINE=InnoDB;

But making the primary key explicit restores the expected behavior:

ALTER TABLE categorylinks DROP INDEX cl_from, ADD PRIMARY KEY (cl_from, cl_to);

Need to investigate what caused the change in behavior, and whether this affects any other mediawiki tables.
Comment 1 Sean Pringle 2014-09-08 13:59:59 UTC
Wrong EXPLAIN output showing the filesort on 5.5.36+ slaves. Correct one:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: categorylinks
         type: ref
possible_keys: cl_timestamp
          key: cl_timestamp
      key_len: 257
          ref: const
         rows: 551942
        Extra: Using index condition; Using where; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: page
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: enwiki.categorylinks.cl_from
         rows: 1
        Extra: 
2 rows in set (0.30 sec)
Comment 2 Sean Pringle 2014-09-26 10:14:09 UTC
Seeking upstream input: https://mariadb.atlassian.net/browse/MDEV-6794

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links