Last modified: 2013-09-04 10:36:18 UTC
On my OpenSuse system, the upgrade fails after a while:

...iwlinks table already exists.
...iwl_prefix_title_from key already set on iwlinks table.
...have ul_value field in updatelog table.
...have iw_api field in interwiki table.
...iwl_prefix key doesn't exist.
...iwl_prefix_from_title key doesn't exist.
Adding cl_collation field to table categorylinks...PHP Warning: mysql_query(): MySQL server has gone away in /usr/share/mediawiki/includes/db/DatabaseMysql.php on line 23
PHP Warning: mysql_query(): Error reading result set's header in /usr/share/mediawiki/includes/db/DatabaseMysql.php on line 23
DB connection error: Connection refused (localhost)

MySQL: 5.1.51
PHP: 5.3.5
Created attachment 8665 [details]

This patch fixes the update for the MySQL 5.1.53 server mentioned above.
Marking as tarball blocker pending further investigation.
'server has gone away' usually means one of:

* some bit of data transferred was too large for MySQL's max packet size setting, thus the connection was cut off
* the connection was idle for too long while waiting for a query to complete, thus the connection was cut off

The stuff for cl_collation seems to be mostly ALTER TABLE-y, so it shouldn't transfer much data; a timeout is the most likely cause. I remember us having that sort of problem with long-running dump scripts and such in the past... there's a Database::setTimeout() method, which backup.inc's BackupDumper class triggers on its dedicated DB connection:

    function backupDb() {
        $this->lb = wfGetLBFactory()->newMainLB();
        $db = $this->lb->getConnection( DB_SLAVE, 'backup' );
        // Discourage the server from disconnecting us if it takes a long time
        // to read out the big ol' batch query.
        $db->setTimeout( 3600 * 24 );
        return $db;
    }

Might make sense to bump the timeouts during updates too?
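For reference, bumping the timeout on the MySQL side presumably amounts to raising the relevant session timeout variables before running the long statements. A sketch of the idea (which variables setTimeout() actually sets is an assumption here, not something confirmed in this thread):

    -- Hypothetical sketch of what a 24-hour setTimeout() amounts to in SQL.
    -- Which of these session variables MediaWiki actually touches is an
    -- assumption; all three exist in MySQL 5.1.
    SET SESSION wait_timeout = 86400;       -- idle-connection timeout
    SET SESSION net_read_timeout = 86400;   -- reading a statement from the client
    SET SESSION net_write_timeout = 86400;  -- writing a result set to the client

Note that none of these would help if the real cause were max_allowed_packet rather than an idle timeout.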
Johannes, how large is your database?
It's quite small; select count(*) from categorylinks; shows a total of 21 rows. I'm sure it's a mysql bug; I have dug through the mysqld log and found the following:

110615 10:17:14 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.1.51-ndb-7.1.9a-log'  socket: '/var/run/mysql/mysql.sock'  port: 3306  SUSE MySQL RPM
110615 10:19:56 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=262144
max_used_connections=1
max_threads=151
threads_connected=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 133916 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x1479d30
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f1778e6be88 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x29) [0x95e4a9]
/usr/sbin/mysqld(handle_segfault+0x400) [0x6387c0]
/lib64/libpthread.so.0(+0xf2d0) [0x7f177e4d82d0]
/usr/sbin/mysqld() [0x733379]
/usr/sbin/mysqld(mysql_alter_table(THD*, char*, char*, st_ha_create_information*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool)+0x146f) [0x734caf]
/usr/sbin/mysqld(mysql_execute_command(THD*)+0xe1b) [0x646b2b]
/usr/sbin/mysqld(mysql_parse(THD*, char*, unsigned int, char const**)+0x2d3) [0x64d133]
/usr/sbin/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x542) [0x64d682]
/usr/sbin/mysqld(do_command(THD*)+0xea) [0x64e9da]
/usr/sbin/mysqld(handle_one_connection+0x22d) [0x6401dd]
/lib64/libpthread.so.0(+0x6a3f) [0x7f177e4cfa3f]
/lib64/libc.so.6(clone+0x6d) [0x7f177cc5767d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x14d83e0 = ALTER /* DatabaseBase::sourceFile( /usr/share/mediawiki/maintenance/archives/patch-categorylinks-better-collation.sql ) */ TABLE `categorylinks` CHANGE COLUMN cl_sortkey cl_sortkey varbinary(230) NOT NULL default '', ADD COLUMN cl_sortkey_prefix varchar(255) binary NOT NULL default '', ADD COLUMN cl_collation varbinary(32) NOT NULL default '', ADD COLUMN cl_type ENUM('page', 'subcat', 'file') NOT NULL default 'page', ADD INDEX (cl_collation), DROP INDEX cl_sortkey, ADD INDEX cl_sortkey (cl_to, cl_type, cl_sortkey, cl_from)
thd->thread_id=8
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

110615 10:19:56 mysqld_safe Number of processes running now: 0
110615 10:19:56 mysqld_safe mysqld restarted
I have filed an upstream bug: http://bugs.mysql.com/bug.php?id=61528
This is an upstream bug that shouldn't block the release, but it should perhaps be mentioned in the release notes.
I was able to reproduce this with the mysql-cluster-server package in Ubuntu. See the MySQL bug for the exact steps.
If the other code still works on other versions, why not just change the SQL query as a workaround?
(In reply to comment #9)
> If it still works with the other code on other versions, why not just change
> the sql query as a workaround?

I'm not against that -- at all -- but I'm not going to say that we *should* put that in the 1.17 tarball. Adding Tim to this bug so he can make a decision either way.
The MySQL bug should be isolated before we apply any workarounds. If it's not isolated, then we don't know if the workaround really fixes it, or if it just makes the segfault somewhat less likely.
I reproduced it under gdb with debugging symbols, and I've isolated it to some extent. The bug occurs when an index is added to a new field, i.e. a field added in the same ALTER TABLE.

In sql_table.cc near line 6010:

    /* Key not found. Add the offset of the key to the add buffer. */
    ha_alter_info->index_add_buffer
      [ha_alter_info->index_add_count++]=
      new_key - ha_alter_info->key_info_buffer;
    key_part= new_key->key_part;
    end= key_part + new_key->key_parts;
    for(; key_part != end; key_part++)
    {
      /* Mark field to be part of new key */
      if ((field= table->field[key_part->fieldnr]))
        field->flags|= FIELD_IN_ADD_INDEX;

new_key->key_part->fieldnr appears to be a field index in the new table, but it's used to attempt to fetch field information from the old table.

(gdb) print *new_key
$30 = {key_length = 32, flags = 40, key_parts = 1, extra_length = 0, usable_key_parts = 2, block_size = 0, algorithm = HA_KEY_ALG_UNDEF, {parser = 0x0, parser_name = 0x0}, key_part = 0x7ffff8a26208, name = 0x7ffff8a22f50 "cl_collation", rec_per_key = 0x0, handler = {bdb_return_if_eq = 0}, table = 0x0}
(gdb) print *key_part
$31 = {field = 0x0, offset = 751, null_offset = 0, length = 32, store_length = 0, key_type = 1, fieldnr = 5, key_part_flag = 0, type = 0 '\000', null_bit = 0 '\000'}
(gdb) print key_part->fieldnr
$15 = 5
(gdb) print table->field[0]->field_name
$33 = 0x7ffff8a24599 "cl_from"
(gdb) print table->field[1]->field_name
$34 = 0x7ffff8a245a1 "cl_to"
(gdb) print table->field[2]->field_name
$35 = 0x7ffff8a245a7 "cl_sortkey"
(gdb) print table->field[3]->field_name
$36 = 0x7ffff8a245b2 "cl_timestamp"
(gdb) print table->field[4]->field_name
Cannot access memory at address 0x30
(gdb) print table->field[5]->field_name
warning: can't find linker symbol for virtual table for `Field' value
warning: found `Field_longlong::~Field_longlong()' instead
$37 = 0x7ffff78ff544 "UH\211\345SH\203\354hH\211}\250H\211u\240\211U\234dH\213\004%("

table->field[5] points into arbitrary memory, and so the attempt to write to table->field[5]->flags causes a segfault. I'll add this to the MySQL bug once dev.mysql.com stops timing out.
This reduced test case also works:

    CREATE TABLE a ( a INT ) ENGINE=MyISAM;
    ALTER TABLE a ADD COLUMN b INT, ADD INDEX (b);

I think this is too broken to bother trying to fix on our side. We must have loads of patches that add indexes on new columns. I can't believe that 1.17 is the first major version that doesn't work on MySQL Cluster. Lowering priority.
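For what it's worth, the attached workaround amounts to splitting the combined statement so that an index is never created in the same ALTER TABLE that adds the column. A sketch of that shape, using the reduced test case above (this illustrates the approach, not the exact attached patch):

    -- Instead of the single combined statement that crashes MySQL Cluster:
    --   ALTER TABLE a ADD COLUMN b INT, ADD INDEX (b);
    -- add the column first, then build the index in a second pass:
    ALTER TABLE a ADD COLUMN b INT;
    ALTER TABLE a ADD INDEX (b);

As noted later in the thread, the cost is that MySQL rebuilds the table once per ALTER TABLE, so the split roughly doubles the upgrade time for that table on large installations.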
Does the patch from Johannes still need review?
The patch looks sane, and should work on other versions too. But see Tim's comment, which I agree with, as this seems a very strange/borderline case:

(In reply to comment #13)
> I think this is too broken to bother trying to fix on our side. We must have
> loads of patches that add indexes on new columns. I can't believe that 1.17 is
> the first major version that doesn't work on MySQL Cluster. Lowering priority.

That said, I personally can't see any issue with just changing the DB patch per the above, as it shouldn't cause any further problems.
+platformeng

So should we apply the patch for 1.20, or should we just drop the bug report as it is an upstream issue?
(In reply to comment #16)
> +platformeng
>
> So should we apply patch for 1.20 or should we just drop the bug report as it
> is an upstream issue?

Separating the alter operation in two will result in the upgrade running twice as slowly, because it will rebuild the table twice instead of once. That could be noticeable on large installations.
(In reply to comment #16)
> So should we apply patch for 1.20 or should we just drop the bug report as it
> is an upstream issue?

Is there a reason this couldn't be applied to the 1.19 tarball?
(In reply to comment #18)
> (In reply to comment #16)
> > So should we apply patch for 1.20 or should we just drop the bug report as it
> > is an upstream issue?
>
> Is there a reason this couldn't be applied to the 1.19 tarball?

Yes: it makes upgrades slower, and it doesn't fix the bug. As I said in comment #13, I don't think we should make any changes. It's not our problem.
So I am closing this bug per comment 13 and comment 17.