Last modified: 2011-10-12 22:12:49 UTC
I've intermittently been getting "(Cannot contact the database server: Unknown error (10.0.6.42))" errors on <https://en.wikipedia.org>. This most recent time happened when trying to preview an edit. I don't edit much, but I've gotten the error a few times over the past week. It'd be nice if someone could check the frequency of such errors and examine what the underlying issue is.
http://rt.wikimedia.org/Ticket/Display.html?id=1684
Not that useful for ordinary people. But if there's an RT ticket saying something like "db32 randomly drops connections", should this bug be closed?
(In reply to comment #2)
> Not that useful for ordinary people. But if there's a RT ticket saying
> something like "db32 randomly drops connections". Should this bug be closed?
Unless that RT ticket contains top-secret information, Bugzilla should always take precedence. Ops needs to get better about using RT only when absolutely necessary.
No. Definitely don't close bugs when an RT ticket is created. We are looking for better ways to sync updates in both directions. I'd prefer we have a public way of tracking info.
Just got "(Can't contact the database server: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) (localhost))" on https://en.wikipedia.org.
New reports here: http://en.wikipedia.org/w/index.php?diff=455163422&oldid=455150842 CCing Tim and Asher, and bumping priority.
(In reply to comment #5)
> Just got "(Can't contact the database server: Can't connect to local MySQL
> server through socket '/var/run/mysqld/mysqld.sock' (2) (localhost))" on
> https://en.wikipedia.org.
What was the URL? Was the error message inside a MediaWiki skin, or was it just a blank page with an error message? If the navigation elements were there, did they look normal, or was the site name incorrect?
(In reply to comment #7)
> (In reply to comment #5)
> > Just got "(Can't contact the database server: Can't connect to local MySQL
> > server through socket '/var/run/mysqld/mysqld.sock' (2) (localhost))" on
> > https://en.wikipedia.org.
>
> What was the URL? Was the error message inside a MediaWiki skin, or was it just
> a blank page with an error message? If the navigation elements were there, did
> they look normal, or was the site name incorrect?
I ran into the same error myself yesterday, but on http, not https. I found it when clicking on an internal link to http://en.wikipedia.org/wiki/2011_Pacific_hurricane_season. No MediaWiki skin was visible, just a white page with the localhost error message and a search bar. Unfortunately, I can't seem to replicate the problem consistently in any way.
When there's a connection error, a log entry is written by LoadBalancer, not Database. If an extension is creating its own Database objects with incorrect configuration, that would explain the lack of connection errors in dberror.log.
(In reply to comment #9)
> When there's a connection error, a log entry is written by LoadBalancer, not
> Database. If an extension is creating its own Database objects with incorrect
> configuration, that would explain the lack of connection errors in dberror.log.
Actually none of that is true. Maybe an extension could make these errors somehow, but I'm not sure how.
The linked diff says "the one I'm getting has a "(Can't contact the database server: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) (localhost))" on it", which would be very wrong. It would be trying to connect to a MySQL server running on the apaches!
A change to the job queue system in 1.18, made to fix an issue where the job runners were hammering the enwiki master, resulted in a high number of locks, triggering this MySQL bug: http://bugs.mysql.com/bug.php?id=49047 (thanks domas!). r99650 removes the lock issue, and since deploying it we haven't seen any connection errors to db32. I am going to build and package mysql 5.1.52@fb in the near future, which includes a fix for MySQL bug 49047, after which we can try reverting r99650. Considering the cause and fix, it definitely seems that Bugzilla was the correct place for this.