Last modified: 2010-06-01 19:35:54 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the bug reports and their history, links might be broken. See T25688, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 23688 - (patch) Mimetypes for Microsoft's 2007 office formats set wrongly
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: File management
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: easy, patch, patch-need-review
Depends on:
Blocks:
Reported: 2010-05-28 07:57 UTC by Markus Krötzsch
Modified: 2010-06-01 19:35 UTC (History)


Attachments
patch for MediaWiki r65816 (1.59 KB, patch)
2010-05-28 07:57 UTC, Markus Krötzsch

Description Markus Krötzsch 2010-05-28 07:57:00 UTC
Created attachment 7411
patch for MediaWiki r65816

The file ./includes/mime.types assigns the old ppt, doc, etc. mimetypes to the new Office 2007 file extensions (pptx, docx, etc.) as well. This is wrong, and it causes problems especially with files served via the wfStreamFile() function: when docx/pptx/... files are downloaded on machines running Windows, the file names get changed to fit the reported mimetype (i.e. the actual file name gets .doc/.ppt/... appended). Microsoft's office applications can no longer open the renamed files, since they fail to recognize them as their own (yes, one should file a bug at MS about this ;-).

The solution is to use the correct mimetypes, as found on the Web, e.g. at http://www.bram.us/2007/05/25/office-2007-mime-types-for-iis/
A patch is attached.
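
To illustrate the mismatch described above, here is a minimal Python sketch. It is not the attached patch and not MediaWiki code; the tables and the guess_mime() helper are hypothetical and only model an extension-to-mimetype lookup. The Open XML mimetypes are the ones Microsoft documents for these extensions.

```python
import os

# Legacy binary formats (Office 97-2003).
LEGACY_TYPES = {
    "doc": "application/msword",
    "xls": "application/vnd.ms-excel",
    "ppt": "application/vnd.ms-powerpoint",
}

# Office 2007 Open XML formats and their documented mimetypes.
OOXML_TYPES = {
    "docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}

# Hypothetical model of the faulty table: new extensions reuse legacy types.
FAULTY_TABLE = dict(
    LEGACY_TYPES,
    docx=LEGACY_TYPES["doc"],
    xlsx=LEGACY_TYPES["xls"],
    pptx=LEGACY_TYPES["ppt"],
)

# Corrected table: every extension has its own mimetype.
FIXED_TABLE = {**LEGACY_TYPES, **OOXML_TYPES}


def guess_mime(filename, table):
    """Look up a mimetype by file extension (case-insensitive)."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return table.get(ext)
```

With the faulty table, guess_mime("report.docx", FAULTY_TABLE) reports the legacy Word type, which is what leads Windows to append .doc to the downloaded file name.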
Comment 1 Markus Krötzsch 2010-05-28 07:59:22 UTC
P.S. This bug specifically affects users of the img_auth.php image authorisation script, since it uses wfStreamFile().
Comment 2 p858snake 2010-05-28 08:24:45 UTC
Keywords: +Need-Review
Comment 3 Derk-Jan Hartman 2010-05-28 12:40:22 UTC
Why not the docm, dotm, etc. file extensions? These are XML office files with macros enabled.
Comment 4 Markus Krötzsch 2010-05-28 12:53:07 UTC
Yes, one could add further mime types as in the above link. But these file types do have mimetypes that start like the old ppt, doc, etc. types, so I reasoned that (1) they are very old and nobody has had any problem with them yet, and (2) the Microsoft tools may manage to open them anyway since the format might be compatible. If there are similar problems in these cases, one could of course use other mimetypes here as well.

For these specifically Microsoft-related issues, it might also be possible to use the infamous application/octet-stream as a default. This should at least prevent any renaming, and may suffice to let the operating system apply some more brains to figure out what the file is supposed to be.
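
The octet-stream idea could look roughly like the following Python sketch. The function name content_type_for() and the force_generic parameter are assumptions made for illustration; nothing like this is claimed to exist in MediaWiki.

```python
import os

OCTET_STREAM = "application/octet-stream"


def content_type_for(filename, table, force_generic=()):
    """Return the mimetype to send for filename.

    Extensions listed in force_generic, and extensions missing from the
    table, get the generic application/octet-stream type.
    """
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if ext in force_generic:
        return OCTET_STREAM
    return table.get(ext, OCTET_STREAM)
```

For example, content_type_for("a.docx", table, force_generic={"docx"}) returns the generic type even if the table still maps docx to the legacy Word type, so the client gets no specific type to match a file extension against.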
Comment 5 Derk-Jan Hartman 2010-05-28 13:07:56 UTC
Somewhat more authoritative source for MS mimetypes.

http://blogs.msdn.com/b/vsofficedeveloper/archive/2008/05/08/office-2007-open-xml-mime-types.aspx

I think we should just add all of them. I see no reason to do half the work in this case. They are documented, so we should implement them as such. We have had too many unnecessary problems with mime type and file extension mismatches already.
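
As a rough illustration of "adding all of them", the Python sketch below lists some of the macro-enabled types from Microsoft's documentation and checks which wanted extensions a table still lacks. The missing_extensions() helper is hypothetical, and the list is a subset, not the contents of r67196.

```python
# Macro-enabled Office 2007 formats (a subset) and their documented mimetypes.
MACRO_ENABLED_TYPES = {
    "docm": "application/vnd.ms-word.document.macroEnabled.12",
    "dotm": "application/vnd.ms-word.template.macroEnabled.12",
    "xlsm": "application/vnd.ms-excel.sheet.macroEnabled.12",
    "xltm": "application/vnd.ms-excel.template.macroEnabled.12",
    "pptm": "application/vnd.ms-powerpoint.presentation.macroEnabled.12",
    "potm": "application/vnd.ms-powerpoint.template.macroEnabled.12",
}


def missing_extensions(table, wanted):
    """Return the wanted extensions that have no entry in table, sorted."""
    return sorted(ext for ext in wanted if ext not in table)
```

A check like missing_extensions(table, ["docx", "docm"]) makes it easy to see whether a mapping covers only half of the documented extensions.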
Comment 6 Derk-Jan Hartman 2010-06-01 19:20:39 UTC
Done in r67196
Comment 7 Bawolff (Brian Wolff) 2010-06-01 19:35:54 UTC
docx is a zip format. Due to bug 23642 (MimeMagic::detectZipType does not recognize microsoft formats) they still won't be detected properly.
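
Because these files are zip archives, content-based detection has to look inside the archive. The Python sketch below shows one possible approach; it is not how MimeMagic::detectZipType works. It assumes the conventional part names Office writes (word/document.xml and so on), which a robust detector should not rely on; make_zip() is a hypothetical helper for building sample data.

```python
import io
import zipfile

# Conventional main part names and the package type they suggest (assumption:
# Office uses these names by default, but the format does not require them).
OOXML_MAIN_PARTS = {
    "word/document.xml": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "xl/workbook.xml": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "ppt/presentation.xml": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}


def detect_zip_type(data):
    """Guess a mimetype for raw bytes that may be a zip or Open XML package."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return "application/octet-stream"
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = set(archive.namelist())
    # Open XML packages carry a [Content_Types].xml entry at the top level.
    if "[Content_Types].xml" in names:
        for part, mime in OOXML_MAIN_PARTS.items():
            if part in names:
                return mime
    return "application/zip"


def make_zip(names):
    """Build an in-memory zip with empty entries, for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, "")
    return buffer.getvalue()
```

An archive without the [Content_Types].xml entry is reported as plain application/zip, which corresponds to what happens when a detector does not know about the Microsoft formats.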
