Last modified: 2012-07-15 22:09:16 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T32705, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 30705 - Charset error on Special:UnusedProperties (1.8 alpha)
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: Semantic MediaWiki
Version: unspecified
Hardware: All All
Importance: High normal with 2 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Duplicates: 38140
Depends on:
Blocks:
 
Reported: 2011-09-02 16:38 UTC by DaSch
Modified: 2012-07-15 22:09 UTC (History)
CC List: 5 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Description DaSch 2011-09-02 16:38:39 UTC
Somehow the encoding is wrong on this special page. All other pages work correctly:
http://www.wecowi.de/wiki/Spezial:Verwaiste_Attribute
Comment 1 DaSch 2011-09-02 17:56:48 UTC
Seems like this is not necessarily an encoding problem, but rather a problem where the page does not correctly check the existence of the pages. I think the selection is not done correctly. It should check whether the page really exists.
Comment 2 DaSch 2011-09-03 19:01:13 UTC
$egMapsDefaultService = "openlayers";
$egMapsAvailableServices = array('googlemaps2', 'yahoomaps', 'openlayers','osm');
and the corresponding API keys
Comment 3 DaSch 2011-09-03 19:01:36 UTC
(In reply to comment #2)
> $egMapsDefaultService = "openlayers";
> $egMapsAvailableServices = array('googlemaps2', 'yahoomaps',
> 'openlayers','osm');
> and the corresponding API keys

Sorry, wrong bug.
Comment 4 Brion Vibber 2011-09-15 22:04:58 UTC
It looks like either the wiki is misconfigured for the database's character set settings, or the database itself has been corrupted with an incorrect Latin1-to-UTF-8 conversion applied on export or import.

Page contents usually survive this because they're stored in binary BLOB fields, but page titles, usernames, edit comments, etc. may have been misconverted.

Try switching the $wgDBmysql5 setting and double-check the encodings. Ideally, most newly configured wikis will be set to binary charset/collation, which allows MediaWiki to speak UTF-8 Unicode without limitations. If things claim to be either latin1 or utf8 and the contents are clearly wrong when viewed directly in the db, it may be incorrectly set up.

This sometimes results from mysqldump operating on wikis that were originally set up with a really old configuration where fields were labeled as Latin1 (such as when upgraded from an old MySQL 4.0 instance, or from old versions of MediaWiki that aimed primarily for MySQL 4.0 compatibility). MySQL 4.0 and earlier have *no* customizable charset support, so whatever the default charset was got used, even though we always actually sent/received UTF-8 data. This sometimes results in data getting "converted to UTF-8" by mysqldump, or some other sort of problem.

Sometimes the database itself is still OK, but a reconfiguration of the wiki has caused the old settings to be lost and it's now defaulting to the modern mode, which can end up doing a similar misconversion. Try turning off $wgDBmysql5 in this case?
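
The misconversion described above can be sketched in a few lines of Python. This is only an illustration, assuming the stored bytes are UTF-8 but are read back as Latin1. The function names `misconvert` and `repair` are hypothetical, and the sample title is the page name from the wiki's Spezial:Gewünschte_Attribute URL.

```python
# Sketch only: simulates UTF-8 text being treated as Latin1 and then
# "converted to UTF-8" a second time. Function names are hypothetical.

def misconvert(text: str) -> str:
    """Return the text as it appears after a wrong Latin1-to-UTF-8 conversion."""
    raw = text.encode("utf-8")       # bytes the wiki actually sent to the database
    return raw.decode("latin-1")     # each byte is misread as one Latin1 character


def repair(garbled: str) -> str:
    """Undo the misconversion, assuming exactly one wrong conversion was applied."""
    return garbled.encode("latin-1").decode("utf-8")


title = "Gewünschte_Attribute"
garbled = misconvert(title)
print(garbled)          # GewÃ¼nschte_Attribute
print(repair(garbled))  # Gewünschte_Attribute
```

If titles look like the first printed line when viewed directly in the database, a conversion of this kind has probably been applied somewhere along the way.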
Comment 5 DaSch 2011-09-15 23:04:07 UTC
Please consider the fact that this is the only special page where this charset error can be seen. All other pages are correct, this page was previously displayed correctly on this wiki, and no changes were made to the database except those made by update.php and SMW_Admin.php when updating to the new version.
Comment 6 DaSch 2011-09-15 23:07:17 UTC
Another special page for Semantic MediaWiki properties also has no problem with special characters:
http://www.wecowi.de/wiki/Spezial:Gew%C3%BCnschte_Attribute
Comment 7 DaSch 2011-09-15 23:14:44 UTC
BTW, in my LocalSettings:
$wgDBmysql5 = false;
Comment 8 Brion Vibber 2011-09-15 23:19:05 UTC
Ah indeed -- this is some Semantic MediaWiki-only thing? Possibly only some tables are incorrectly set up, or something else in SMW is causing bogus values to end up stored.
Comment 9 DaSch 2011-09-15 23:32:06 UTC
Yes, I think so. But to be honest, I took a look at my database and it seems totally confused. Most tables are InnoDB, but not all. Some have utf8_general_ci, some have latin1_swedish_ci, and one is even in utf8_bin.
That's all a bit strange.
Comment 10 Dan Bolser 2011-09-16 19:17:40 UTC
What happens if you set them all to utf8?
Comment 11 DaSch 2012-07-12 17:27:18 UTC
Does anybody from the SMW team care about this?
Comment 12 Markus Krötzsch 2012-07-13 08:19:33 UTC
I believe that the cause of this problem is that a temporary table is created to compute the results of this particular special page. The problem then occurs due to incompatible character encoding settings between your existing tables and this new table.

The way in which SMW creates the temporary table on MySQL is:

CREATE TEMPORARY TABLE tablename( title VARCHAR(255) ) ENGINE=MEMORY

The encoding used in this table therefore defaults to the global settings. It is possible that these are not the same as for the other tables. It should be possible to fix the problem by changing all table encodings to be the same, and making sure that this is also the encoding used as a default.

The deeper architectural problem is that MediaWiki uses the global variable $wgDBTableOptions for defining additional options, including specific charsets (I think). However, these global options usually include the ENGINE setting, so we cannot use them for temporary tables, which are created with ENGINE=MEMORY. As a result, if somebody changes $wgDBTableOptions to use a non-default charset, SMW will take this into account only for its normal tables. Do you have a custom setting for $wgDBTableOptions?
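
One conceivable way around this, sketched in Python purely for illustration, is to copy only the charset clause from the table options and keep ENGINE=MEMORY for the temporary table. This is not how SMW does it. The helper name `temp_table_sql` is hypothetical, and the options string "ENGINE=InnoDB, DEFAULT CHARSET=binary" is an assumed example value for $wgDBTableOptions.

```python
import re

# Sketch only: temp_table_sql is a hypothetical helper, not SMW code.

def temp_table_sql(table_name: str, table_options: str) -> str:
    """Build the CREATE TEMPORARY TABLE statement, reusing only the charset
    from table_options and always keeping the MEMORY engine."""
    # Find a clause such as "DEFAULT CHARSET=binary" or "CHARACTER SET utf8".
    match = re.search(
        r"(?:DEFAULT\s+)?(?:CHARSET|CHARACTER\s+SET)\s*=?\s*(\w+)",
        table_options,
        flags=re.IGNORECASE,
    )
    charset_clause = f" DEFAULT CHARSET={match.group(1)}" if match else ""
    return (
        f"CREATE TEMPORARY TABLE {table_name}( title VARCHAR(255) )"
        f" ENGINE=MEMORY{charset_clause}"
    )


# Assumed example value for the table options.
print(temp_table_sql("tablename", "ENGINE=InnoDB, DEFAULT CHARSET=binary"))
# CREATE TEMPORARY TABLE tablename( title VARCHAR(255) ) ENGINE=MEMORY DEFAULT CHARSET=binary
```

When the options contain no charset clause, the sketch falls back to the original statement, so the temporary table would still use the server default.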

In the long run, we should also find a more efficient way to compute the results on this special page. The current solution does not scale to bigger wikis.
Comment 13 Markus Krötzsch 2012-07-15 20:52:15 UTC
*** Bug 38140 has been marked as a duplicate of this bug. ***
