Last modified: 2012-02-02 14:14:29 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links may be broken. See T36156, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 34156 - SiteStatsInit::refresh() triggered inappropriately, caused downtime
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Database
Version: 1.18.x
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2012-02-02 11:06 UTC by Tim Starling
Modified: 2012-02-02 14:14 UTC
CC List: 2 users

Description Tim Starling 2012-02-02 11:06:21 UTC
On pl.wikipedia.org from 05:58:45 onwards, SiteStatsInit::refresh() began to be called several times per second. It's not known at this stage why SiteStats::isSane() returned false. 

The binlog shows that the refresh() queries were often executed in autocommit mode, meaning that the DELETE query was committed before the INSERT query began. This would have caused isSane() to return false until the new row insert was committed, leading to a flood of attempted refreshes.
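The window described above can be reproduced in miniature. The following is only an illustrative sketch, not MediaWiki code: it uses Python's sqlite3 module in place of MySQL, a cut-down site_stats table with two columns, and hypothetical helper names (is_sane, refresh, demo). The sanity check here merely tests that the row exists, which is a simplification of whatever SiteStats::isSane() actually checks.

```python
import os
import sqlite3
import tempfile


def is_sane(conn):
    """Simplified stand-in for a sanity check: does exactly one stats row exist?"""
    cur = conn.execute("SELECT COUNT(*) FROM site_stats")
    rows = cur.fetchall()  # run the statement to completion so no read lock lingers
    cur.close()
    return rows[0][0] == 1


def refresh(writer, reader, transactional):
    """Rewrite the stats row with DELETE + INSERT.

    Returns what a concurrent reader sees between the two statements.
    """
    if transactional:
        writer.execute("BEGIN")
    writer.execute("DELETE FROM site_stats")
    # A request arriving on another connection at exactly this point:
    seen_midway = is_sane(reader)
    writer.execute(
        "INSERT INTO site_stats (ss_row_id, ss_total_edits) VALUES (1, 100)"
    )
    if transactional:
        writer.execute("COMMIT")
    return seen_midway


def demo():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # isolation_level=None puts each connection in autocommit mode;
    # transactions are then opened only by an explicit BEGIN.
    writer = sqlite3.connect(path, isolation_level=None)
    reader = sqlite3.connect(path, isolation_level=None)
    writer.execute(
        "CREATE TABLE site_stats ("
        "ss_row_id INTEGER PRIMARY KEY, ss_total_edits INTEGER)"
    )
    writer.execute("INSERT INTO site_stats VALUES (1, 99)")

    autocommit_view = refresh(writer, reader, transactional=False)
    wrapped_view = refresh(writer, reader, transactional=True)

    writer.close()
    reader.close()
    return autocommit_view, wrapped_view
```

In autocommit mode the reader sees an empty table between the DELETE and the INSERT, so the check fails; when the two statements share a transaction, the reader keeps seeing the previously committed row until COMMIT. Locking and isolation details differ between SQLite and MySQL, so this only illustrates the shape of the problem.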

Eventually, a flood of SELECT COUNT(*) queries at around 07:10 caused an overload on all s2 slaves, leading to an overload of the apache pool and site-wide downtime. SiteStatsInit was disabled and all related queries were killed. When the dust settled, the site_stats row was missing, and had to be recovered from binlogs.

I suggest removing the isSane() checks from loadAndLazyInit(), and doing a refresh only from maintenance scripts or web-based upgrade. SiteStats::load() should be able to tolerate a missing site_stats row, and the accessor functions should return false without giving a PHP warning. Additionally, the refresh should be done with REPLACE instead of DELETE and INSERT.
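As a rough sketch of the last two suggestions, again using Python's sqlite3 module rather than MediaWiki's database layer: the function names below (load_stats, total_edits, refresh_with_replace) are hypothetical, and the table is reduced to three columns. The accessor returns False when the row is missing instead of raising, and the refresh is a single REPLACE statement, so there is no gap between statements even in autocommit mode.

```python
import sqlite3


def load_stats(conn):
    """Return the stats row as a tuple, or None if the row is missing."""
    cur = conn.execute(
        "SELECT ss_total_edits, ss_good_articles "
        "FROM site_stats WHERE ss_row_id = 1"
    )
    row = cur.fetchone()
    cur.close()
    return row


def total_edits(conn):
    """Accessor that tolerates a missing row: returns False, raises nothing."""
    row = load_stats(conn)
    return row[0] if row is not None else False


def refresh_with_replace(conn, edits, articles):
    """Write the stats row in one statement instead of DELETE then INSERT."""
    conn.execute(
        "REPLACE INTO site_stats (ss_row_id, ss_total_edits, ss_good_articles) "
        "VALUES (1, ?, ?)",
        (edits, articles),
    )
    conn.commit()


def make_db():
    """Create an in-memory database with an empty, simplified site_stats table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE site_stats ("
        "ss_row_id INTEGER PRIMARY KEY, "
        "ss_total_edits INTEGER, "
        "ss_good_articles INTEGER)"
    )
    return conn
```

With an empty table, total_edits() returns False; after refresh_with_replace() it returns the stored value, and calling the refresh again overwrites the same row rather than adding a second one.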
