Last modified: 2014-04-23 05:59:59 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T64360, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 62360 - LocalFileDeleteBatch::getHashes oldimage table scan on commons is slow
LocalFileDeleteBatch::getHashes oldimage table scan on commons is slow
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
File management (Other open bugs)
1.23.0
All All
: High major (vote)
: ---
Assigned To: Aaron Schulz
: performance
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2014-03-07 06:28 UTC by Sean Pringle
Modified: 2014-04-23 05:59 UTC (History)
7 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Sean Pringle 2014-03-07 06:28:45 UTC
Spotted this on commonswiki master:

SELECT /* LocalFileDeleteBatch::getHashes */ oi_archive_name, oi_sha1 FROM `oldimage` WHERE oi_archive_name IN ('20140306203724!Copa_América_Argentina_2011_(logo).png', '20140306204156!Copa_América_Argentina_2011_(logo).png')

Commons oldimage has millions of rows, and this does a table scan because oi_archive_name is only indexed in second position:

KEY `oi_name_archive_name` (`oi_name`,`oi_archive_name`(14))

This transaction was also holding locks on `page` table presumably from an earlier query. Those locks caused a number of other file-upload transaction to exceed innodb_lock_wait_timeout.

tendril> select * from innodb_locks_log where lock_trx_id = 'AB64BF626'\G
*************************** 1. row ***************************
      stamp: 2014-03-07 05:51:47
  server_id: 1060
    lock_id: AB64BF626:0:34662177:177
lock_trx_id: AB64BF626
  lock_mode: X
  lock_type: RECORD
 lock_table: `commonswiki`.`page`
 lock_index: `name_title`
 lock_space: 0
  lock_page: 34662177
   lock_rec: 177
  lock_data: 6, 0x436F70615F416DC3A9726963615F417267656E74696E615F323031315F286C6F676F292E706E67
... more similar locks ...
16 rows in set (0.35 sec)

Therefore:

1. oi_archive_name deserves an index, or can oi_name be included in the query?

2. oldimage deserves a primary key to make online schema changes possible.

3. Can the query go to a slave?
Comment 1 Gerrit Notification Bot 2014-04-22 05:59:06 UTC
Change 127871 had a related patch set uploaded by Aaron Schulz:
Fixed slow query in LocalFileDeleteBatch::getHashes()

https://gerrit.wikimedia.org/r/127871
Comment 2 Gerrit Notification Bot 2014-04-22 06:34:15 UTC
Change 127871 merged by Springle:
Fixed slow query in LocalFileDeleteBatch::getHashes()

https://gerrit.wikimedia.org/r/127871
Comment 3 Gerrit Notification Bot 2014-04-22 16:19:48 UTC
Change 128943 had a related patch set uploaded by Aaron Schulz:
Fixed slow query in LocalFileDeleteBatch::getHashes()

https://gerrit.wikimedia.org/r/128943
Comment 4 Gerrit Notification Bot 2014-04-22 16:23:43 UTC
Change 128943 merged by jenkins-bot:
Fixed slow query in LocalFileDeleteBatch::getHashes()

https://gerrit.wikimedia.org/r/128943
Comment 5 Andre Klapper 2014-04-23 05:59:59 UTC
Closing again - all merged.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links