
Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links may be broken. See T47584, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 45584 - WikiExporter is causing replication issues
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Export/Import
Version: 1.21.x
Hardware: All
OS: All
Priority: High
Severity: major
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2013-02-28 21:11 UTC by Asher Feldman
Modified: 2014-02-12 23:35 UTC
CC: 6 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Asher Feldman 2013-02-28 21:11:51 UTC
WikiExporter (as called via the dump code on snapshot2) is selecting the entire revision table joined with page in a single query: 

SELECT /* WikiExporter::dumpFrom 127.0.0.1 */  /*! STRAIGHT_JOIN */ *  FROM `page` FORCE INDEX (PRIMARY) INNER JOIN `revision` ON ((page_id=rev_page))   ORDER BY page_id ASC

This results in very long running transactions that have been impacting replication.
Comment 1 Sam Reed (reedy) 2013-02-28 22:04:53 UTC
The fix is to batch up these queries so it's not selecting the whole table.

The "difficult" part is what condition to continue from, due to the code having many paths...
Comment 2 Brion Vibber 2013-02-28 23:45:24 UTC
Separating this query into batches means you'll get an inconsistent dump, unless you use a transaction to keep the read state frozen... in which case you're probably not gaining anything.

Of course, if nobody cares about consistency, we could possibly ignore new revisions after the beginning and it'd be "mostly consistent", except that things that get deleted during the dump might fully or partially disappear instead of staying in the snapshot.

Probably still needs to read the whole 'page' table in a consistent transaction though.
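
For what it's worth, the "keep the read state frozen" approach would presumably look something like the following in MySQL/InnoDB terms (a sketch only; whether holding one snapshot open for the length of a full dump is acceptable is exactly the concern above):

SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION WITH CONSISTENT SNAPSHOT;
-- every batched page/revision SELECT issued here reads the same snapshot
-- ... run all the batch queries ...
COMMIT;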
Comment 3 Ariel T. Glenn 2013-06-03 13:59:24 UTC
It's already the case that the stub dump may contain revisions that are deleted by the time the page content dumps roll around.  In other cases we wind up missing category links for pages because we're in between the delete and the insert when the rows for that page are updated (see LinksUpdate::incrTableUpdate).  I'd like us to move towards more consistency than less though.  Any thoughts about how we might get there?
Comment 4 Sean Pringle 2013-08-29 03:27:07 UTC
I'm not yet familiar with WikiExporter code, so apologies if any of this is way off base:

Firstly, in the short term could it simply be done using dbXX Tampa slaves only?  Long-running queries still hurt there, but less so.

For more consistency we could perhaps use one depooled Tampa slave per cluster. STOP SLAVE, wait for SQL thread, dump data to taste, START SLAVE. Binlogs hang around for 60 days so the process can be slow.
Comment 5 Sean Pringle 2013-08-29 03:28:31 UTC
Er, sorry, that should have been: STOP SLAVE IO thread, wait for SQL thread...
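
Roughly, on the depooled slave that would be (a sketch of the corrected sequence; exact status checks omitted):

STOP SLAVE IO_THREAD;   -- stop fetching new events from the master
-- wait until the SQL thread has applied everything already in the relay log
-- (poll SHOW SLAVE STATUS until the relay log is fully executed)
-- ... run the dump against this now-quiescent copy ...
START SLAVE IO_THREAD;  -- resume and catch up; binlogs are kept for 60 days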
Comment 6 Ariel T. Glenn 2013-08-29 07:28:45 UTC
Not Tampa slaves, as this production service has now moved to eqiad (September is the deadline for all production services, iirc).

The longest running dump is a toss-up between en wp and de wp: 12 days or so for en wp, and I don't know yet for de wiki as it's set up now.  I'm definitely interested in looking into this possibility.  Dumps are run automatically on a rolling basis, so I'm not sure how we'd handle depooling slaves unless there were one slave from each cluster dedicated for this purpose, except for en wp where it would be depooled only for the two weeks the run takes.  Needs some thought...

