Last modified: 2013-08-21 00:22:13 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T54925, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 52925 - MediaWiki should adapt to case-insensitive host file systems
Status: RESOLVED INVALID
Product: MediaWiki
Classification: Unclassified
Component: Uploading
Version: 1.22.0
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2013-08-16 16:57 UTC by Mark A. Hershberger
Modified: 2013-08-21 00:22 UTC (History)

Description Mark A. Hershberger 2013-08-16 16:57:15 UTC
Like Windows.  See also bug#1780
Comment 1 Brion Vibber 2013-08-19 17:47:33 UTC
Case-insensitive host file systems shouldn't be a problem; I've developed and tested MediaWiki on Mac OS X for a decade with no case-related problems on the server side.

Bug 1780 is not about case sensitivity, but rather the fact that PHP programs on Windows see the old "ANSI" (legacy 8-bit or DBCS) encoding on the filesystem instead of being exposed to a UTF-8 encoding of the Unicode filenames.
Comment 2 Bawolff (Brian Wolff) 2013-08-19 19:57:12 UTC
Given that we allow people to have File:Foo.jpg and File:FoO.jpg, and for them to be separate files, I feel like something bad would probably happen with that on a case-insensitive file system.
Comment 3 Mark A. Hershberger 2013-08-20 17:54:05 UTC
The problem is, as Brian says, that Windows filesystems cannot distinguish between Foo.jpg and FoO.jpg.

If someone were to move a MW installation from a *nix box to a Windows one (WHY????!!!) they might run into this problem.

I don't know if there is anything that can be done about this (URI-encode filenames? bleh), but I'm pointing out the problem.
Comment 4 Brion Vibber 2013-08-20 20:31:37 UTC
Ah, I see what you mean now. :)


In practice this isn't a big problem; files differing by case will be stored in separate subdirectories because, e.g., 'Foo.jpg' and 'FoO.jpg' don't MD5-hash to the same value:

  $ echo -n 'Foo.jpg' | md5
  0682c25948fa1a2e600dbd7248d6205c
  
  $ echo -n 'FoO.jpg' | md5
  4ddd48ea9e3c52d2fea89e8a46c348f7

So one would be stored under /0/06 and the other under /4/4d by default, and there's no conflict.
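
A minimal sketch of the layout described above, assuming the directory names are simply the first one and first two hex digits of the MD5 of the file name. The function name `hashed_rel_path` and the `levels` parameter are hypothetical and used only for illustration; this is not MediaWiki's actual code.

```python
import hashlib


def hashed_rel_path(name: str, levels: int = 2) -> str:
    """Build a relative path like 'h/hh/name' from the MD5 of the file name.

    Assumption: level i uses the first i hex digits of the digest,
    matching the /0/06 style of directory described in the comment.
    """
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    parts = [digest[:i] for i in range(1, levels + 1)]
    return "/".join(parts + [name])


if __name__ == "__main__":
    # Names that differ only by case normally land in different directories.
    for name in ("Foo.jpg", "FoO.jpg"):
        print(hashed_rel_path(name))
```

With `levels=0` the function returns the bare file name, which corresponds to a flat layout without hash subdirectories.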


You could have a hash collision on the first two digits, but it should be rare to have both a case collision *and* a partial hash collision...

One could force conflicts by disabling the hash subdirectories, though.
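
To make the two cases concrete, here is a rough sketch of a checker that reports which names would end up on the same path on a case-insensitive file system, with and without hash subdirectories. It assumes the same first-digit / first-two-digits MD5 layout as above and approximates case-insensitive comparison with Python's `str.casefold()`, which is not identical to what NTFS or HFS+ actually do. The names `storage_path` and `case_conflicts` are hypothetical.

```python
import hashlib
from collections import defaultdict


def storage_path(name: str, hashed: bool = True) -> str:
    """Relative storage path; flat when hashed is False (assumed layout)."""
    if not hashed:
        return name
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return f"{digest[0]}/{digest[:2]}/{name}"


def case_conflicts(names, hashed: bool = True):
    """Group names whose storage paths are equal when case is ignored."""
    groups = defaultdict(list)
    for name in names:
        groups[storage_path(name, hashed).casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    names = ["Foo.jpg", "FoO.jpg", "Bar.png"]
    # Flat layout: the two case variants map to the same path.
    print(case_conflicts(names, hashed=False))
    # Hashed layout: they conflict only if the two-digit prefixes also match.
    print(case_conflicts(names, hashed=True))
```

If MD5 prefixes are treated as uniformly distributed, a given pair of case variants shares the same two-digit prefix roughly 1 time in 256, which is why the hashed layout makes a conflict unlikely but not impossible.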



Note that the worst possible case is probably a few broken files on a Windows or OS X server. Copy the installation over to Linux/Unix and no new problems should develop...
Comment 5 Bawolff (Brian Wolff) 2013-08-20 20:37:12 UTC
In theory we could change the entire file handling system to store files under a SHA-1 hash of their contents instead of their human-readable names... But that'd be a lot of effort.
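
As a rough illustration of that idea, the sketch below derives a storage path from the SHA-1 of a file's contents rather than from its name. The function names, the two-level directory prefix, and the optional extension are assumptions made for the example, not a description of any existing MediaWiki backend.

```python
import hashlib


def content_key(data: bytes) -> str:
    """Hex SHA-1 of the file contents; always lowercase hex digits."""
    return hashlib.sha1(data).hexdigest()


def content_path(data: bytes, extension: str = "") -> str:
    """Hypothetical content-addressed path: 'k/kk/<sha1><extension>'."""
    key = content_key(data)
    return f"{key[0]}/{key[:2]}/{key}{extension}"


if __name__ == "__main__":
    # The path depends only on the bytes, not on the title's letter case.
    print(content_path(b"example image bytes", ".jpg"))
```

Because the key consists only of lowercase hex digits, two stored paths can never differ by case alone, provided the extension is normalized as well; the mapping from File: titles to keys would then have to live somewhere else, such as the database.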
Comment 6 Mark A. Hershberger 2013-08-21 00:22:13 UTC
I had forgotten about the hashing that Brion brought up.  That makes this much less of an issue.  I don't have more information about this -- someone else pointed out the problem to me and I was reporting it -- so I think this is INVALID.
