Last modified: 2012-07-06 17:00:36 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T37609, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 35609 - Wikimedia databases contain categorylinks with type "page" for media files. Run updateCollation.php to fix
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://toolserver.org/~endumen/filesw...
Keywords: shell
Depends on:
Blocks:
 
Reported: 2012-03-30 12:40 UTC by Lejonel
Modified: 2012-07-06 17:00 UTC

Attachments
Commons files with type "page" in categorylinks table (deleted)
2012-03-30 12:40 UTC, Lejonel

Description Lejonel 2012-03-30 12:40:10 UTC
Created attachment 10352 [details]
Commons files with type "page" in categorylinks table

The Commons database contains categorylinks rows with type "page" for files, but files should have type "file".

I think this error may exist in other databases too. Bug 29787 was about a category in English Wikipedia, but that was fixed by null edits. Now null edits do not seem to fix the errors at Commons.

Bug 29787 has an example of problems caused by this bug.
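
For illustration, a check for the mismatch described above could look roughly like the sketch below. It is not the query behind the deleted attachment. It uses an in-memory SQLite database as a stand-in for the real MySQL tables, with only the columns it needs (page_id, page_namespace, page_title, cl_from, cl_to, cl_type, cl_collation), and it assumes the File namespace has number 6. The sample rows and the helper name mistyped_file_links are hypothetical.

```python
import sqlite3

NS_FILE = 6  # assumed namespace number for File pages

# Toy stand-ins for the `page` and `categorylinks` tables; sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER, page_title TEXT);
CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT, cl_type TEXT, cl_collation TEXT);
INSERT INTO page VALUES (1, 6, 'Example.jpg'), (2, 6, 'Other.png'), (3, 0, 'Article');
INSERT INTO categorylinks VALUES
  (1, 'Photos', 'page', ''),
  (2, 'Photos', 'file', 'uppercase'),
  (3, 'Topics', 'page', 'uppercase');
""")


def mistyped_file_links(conn):
    """Return category links whose source page is a file but whose cl_type is not 'file'."""
    return conn.execute(
        """
        SELECT page_title, cl_to, cl_type, cl_collation
        FROM categorylinks
        JOIN page ON page_id = cl_from
        WHERE page_namespace = ? AND cl_type <> 'file'
        ORDER BY page_title
        """,
        (NS_FILE,),
    ).fetchall()


for row in mistyped_file_links(conn):
    print(row)
```

Only the first sample link is reported: it belongs to a file page but is stored with type "page".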
Comment 1 Mark A. Hershberger 2012-04-02 16:57:41 UTC
Moving attachment contents to url field.
Comment 2 Mark A. Hershberger 2012-04-02 16:58:19 UTC
The content of attachment 10352 [details] has been deleted by
    Mark A. Hershberger <mah@everybody.org>
who provided the following reason:

should be in url field

The token used to delete this attachment was generated at 2012-04-02 16:58:03 UTC.
Comment 3 Bawolff (Brian Wolff) 2012-06-16 19:30:42 UTC
All of these appear to have the cl_collation field set to ''. Thus running updateCollation.php should fix the issue.

My query on the toolserver db showed 48331 affected rows on Commons, and they seem to date from July 7, 2011. I don't know what happened at that time. The pages I looked at weren't edited then. In the server admin log, the only things I saw happening to Commons at that time were runs of CleanupTitles.php and NamespaceDupesWT.php. I'm not sure how those could cause this.

Anyhow, running updateCollation.php should fix the issue, as it will treat each of these as an old-style row and update it appropriately.
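
As a rough model of the repair described in this comment (not the actual updateCollation.php code), the sketch below finds rows whose cl_collation differs from the current collation and rewrites cl_type from the namespace of the linking page. It assumes the current collation is 'uppercase', that namespaces 6 and 14 are File and Category, and that types are "file", "subcat" and "page". It leaves out everything else the real script may do, such as recomputing sort keys. The tables are SQLite stand-ins and the function names are hypothetical.

```python
import sqlite3

NS_FILE, NS_CATEGORY = 6, 14     # assumed namespace numbers
CURRENT_COLLATION = "uppercase"  # assumed current collation name


def type_for_namespace(ns):
    """Map the linking page's namespace to a categorylinks type."""
    if ns == NS_CATEGORY:
        return "subcat"
    if ns == NS_FILE:
        return "file"
    return "page"


def fix_stale_rows(conn):
    """Rewrite cl_type and cl_collation on rows with an outdated collation; return the row count."""
    stale = conn.execute(
        """
        SELECT categorylinks.rowid, page_namespace
        FROM categorylinks
        JOIN page ON page_id = cl_from
        WHERE cl_collation <> ?
        """,
        (CURRENT_COLLATION,),
    ).fetchall()
    for rowid, ns in stale:
        conn.execute(
            "UPDATE categorylinks SET cl_type = ?, cl_collation = ? WHERE rowid = ?",
            (type_for_namespace(ns), CURRENT_COLLATION, rowid),
        )
    return len(stale)


# Made-up sample data: two old-style rows (empty collation) and one current row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER, page_title TEXT);
CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT, cl_type TEXT, cl_collation TEXT);
INSERT INTO page VALUES (1, 6, 'Example.jpg'), (2, 14, 'Subcategory'), (3, 0, 'Article');
INSERT INTO categorylinks VALUES
  (1, 'Photos', 'page', ''),
  (2, 'Photos', 'page', ''),
  (3, 'Topics', 'page', 'uppercase');
""")

fixed = fix_stale_rows(conn)
after = conn.execute(
    "SELECT cl_from, cl_type, cl_collation FROM categorylinks ORDER BY cl_from"
).fetchall()
print(fixed, after)
```

In this model the empty collation is what marks a row as old-style, which is why a row with a wrong type but a current collation would not be touched.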
Comment 4 Bawolff (Brian Wolff) 2012-06-16 19:34:43 UTC
Note, there are also 15695 such rows on enwiki.
Comment 5 Sam Reed (reedy) 2012-07-06 16:39:42 UTC
mysql> explain select count(*) from categorylinks where cl_collation = '';
+----+-------------+---------------+------+---------------+--------------+---------+-------+--------+--------------------------+
| id | select_type | table         | type | possible_keys | key          | key_len | ref   | rows   | Extra                    |
+----+-------------+---------------+------+---------------+--------------+---------+-------+--------+--------------------------+
|  1 | SIMPLE      | categorylinks | ref  | cl_collation  | cl_collation | 34      | const | 115866 | Using where; Using index |
+----+-------------+---------------+------+---------------+--------------+---------+-------+--------+--------------------------+
1 row in set (0.00 sec)


Running updateCollation.php against commonswiki currently...
Comment 6 Sam Reed (reedy) 2012-07-06 16:41:24 UTC
mysql> explain select count(*) from categorylinks where cl_collation != 'uppercase';
+----+-------------+---------------+-------+---------------+--------------+---------+------+-------+--------------------------+
| id | select_type | table         | type  | possible_keys | key          | key_len | ref  | rows  | Extra                    |
+----+-------------+---------------+-------+---------------+--------------+---------+------+-------+--------------------------+
|  1 | SIMPLE      | categorylinks | range | cl_collation  | cl_collation | 34      | NULL | 60341 | Using where; Using index |
+----+-------------+---------------+-------+---------------+--------------+---------+------+-------+--------------------------+
1 row in set (0.02 sec)
Comment 7 Sam Reed (reedy) 2012-07-06 16:54:00 UTC
Enwiki is clean now.

Running it via foreachwiki. It seems most wikis are clean, but not all: I just noticed that cawiki and dewiki weren't, for a couple of examples.
Comment 8 Sam Reed (reedy) 2012-07-06 17:00:36 UTC
Done.
