Last modified: 2013-10-02 13:22:50 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T55751, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 53751 - CentralNotice banner filter is case sensitive
CentralNotice banner filter is case sensitive
Status: ASSIGNED
Product: MediaWiki extensions
Classification: Unclassified
CentralNotice (Other open bugs)
unspecified
All All
: Normal normal (vote)
: ---
Assigned To: Matt Walker
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-09-04 14:34 UTC by MZMcBride
Modified: 2013-10-02 13:22 UTC (History)
5 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description MZMcBride 2013-09-04 14:34:15 UTC
If I go to <https://meta.wikimedia.org/wiki/Special:CentralNoticeBanners> and enter the text "rory" in the filter field, I get no results.

If I enter the text "Rory" in the filter field, I get one result:

[x] PrivacyPolicyDiscussion_Rory1 (Preview on-wiki)

Search should be case insensitive.
Comment 1 Matt Walker 2013-09-05 07:58:18 UTC
Fascinating. The prototype SQL schema and the one on metawiki differ! We will have to run an alter on the table to get it to be case insensitive.

Metawiki cn_templates.tmp_name is varbinary(255)
CentralNotice.sql cn_templates.tmp_name is varchar(255)

So the alter would be:
ALTER TABLE cn_templates MODIFY COLUMN tmp_name varchar(255);

Wonder if I can just do this tomorrow -- I'll ask Asher.
Comment 2 Sean Pringle 2013-10-01 13:51:23 UTC
The cn_templates table is small, so the above ALTER would be OK to just run on metawiki master. Thank you for asking first :-)

Although, I notice that other metawiki cn_% tables with VARCHAR fields in the schema show as VARBINARY too. Does this deserve a more thorough investigation?
Comment 3 Matt Walker 2013-10-02 00:08:49 UTC
(In reply to comment #2)
> Does this deserve a more thorough investigation?

So, in doing a more thorough look through I noticed that in fact:
* All table rows in CentralNotice are binary types
* The default character set and collation for all the wikis that I checked is binary
** As CN does not specify a table charset/collation we take the databases which is binary -- which is why all our columns are binary.

The answer to the historical question of "Why are all our tables by default binary charset/collation instead of utf8?" is apparently:

(05:00:55 PM) ori-l: mysql's 'utf8' wasn't real utf8 until a fairly recent version (5.1 iirc)
(05:00:55 PM) ori-l: it was restricted to the basic multilingual plane
(05:01:24 PM) ori-l: which is a problem if you want to support certain languages
(05:01:31 PM) ori-l: and we do

-- so --
I'm going to send an email to the wikitech list about how to resolve this problem.
Comment 4 Matt Walker 2013-10-02 02:14:28 UTC
(In reply to comment #3)
> I'm going to send an email to the wikitech list about how to resolve this
> problem.
For posterity -- http://www.gossamer-threads.com/lists/wiki/wikitech/394239
Comment 5 MZMcBride 2013-10-02 02:57:32 UTC
Any alterations to the database schema should be wrapped in a maintenance script and/or committed to a Git repo as an SQL patch. Other people presumably use this extension. :-)
Comment 6 Sean Pringle 2013-10-02 13:22:50 UTC
Well my understanding is the listed ALTER was actually to bring the schema back inline with the schema already committed to the repo. Not an SQL patch per se; the production schema is simply wrong.

But I agree it should become a maintenance script if the problem is more widespread or indicative of a design flaw with regard to character sets. Switching wholesale to utf8 has many implications.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links