Last modified: 2014-07-15 19:49:27 UTC
From https://wikitech.wikimedia.org/wiki/Incident_documentation/20140619-parsercache

3. MediaWiki PHP may need a better way of handling a DB host that is flaky rather than completely down. Historically we've seen similar lock-up behavior on S[1-7], where one slave having problems leads to unnecessary outages. As it happens, this week we discussed options for DB proxies (probably haproxy) in #mediawiki_security, for both HA and maintenance reasons. It's possible that PHP simply should not be connecting directly to databases without hand-holding. Any such setup needs to take the MediaWiki LB and query groups into account. It may even need heartbeat and STONITH?
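
To make the "flaky rather than completely down" distinction concrete, here is a minimal sketch in Python of one possible approach. It is not MediaWiki's load balancer and not how haproxy works. It assumes a host can be judged by the share of slow or failed queries in a recent window. The class `ReplicaHealth`, the function `pick_replicas`, the thresholds, and the host names are all hypothetical and chosen only for illustration.

```python
from collections import deque


class ReplicaHealth:
    """Tracks recent query outcomes for one (hypothetical) replica host."""

    def __init__(self, window=20, max_bad_ratio=0.5, slow_seconds=1.0):
        # Only the most recent `window` samples are kept.
        self.samples = deque(maxlen=window)
        self.max_bad_ratio = max_bad_ratio
        self.slow_seconds = slow_seconds

    def record(self, seconds, ok=True):
        # A query counts as "bad" if it failed or took too long.
        self.samples.append((not ok) or seconds >= self.slow_seconds)

    def is_flaky(self):
        # With no data yet, give the host the benefit of the doubt.
        if not self.samples:
            return False
        return sum(self.samples) / len(self.samples) > self.max_bad_ratio


def pick_replicas(health_by_host):
    """Return hosts that look usable; fall back to all hosts if none do."""
    usable = [h for h, s in health_by_host.items() if not s.is_flaky()]
    return usable or list(health_by_host)


# Usage with made-up host names and timings.
health = {"db-a": ReplicaHealth(window=4), "db-b": ReplicaHealth(window=4)}
for _ in range(4):
    health["db-a"].record(0.01)   # consistently fast
for _ in range(3):
    health["db-b"].record(5.0)    # answers, but slowly
health["db-b"].record(0.01)
print(pick_replicas(health))
```

The fallback to the full host list is a deliberate choice in this sketch: if every host looks bad, refusing to serve at all would turn a degraded state into an outage. A real setup would also have to respect query groups and replication lag, which this sketch ignores.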