Last modified: 2014-11-20 23:42:32 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T73719, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 71719 - Ignore EXIF data in CommonsMetadata
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: CommonsMetadata
Version: master
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2014-10-06 20:10 UTC by Guillaume Paumier
Modified: 2014-11-20 23:42 UTC (History)
CC List: 5 users

Description Guillaume Paumier 2014-10-06 20:10:02 UTC
Some files edited with Picasa apparently have "Picasa" as their author in the EXIF metadata. 

Example: https://www.mediawiki.org/wiki/File%3ADarkvector_screenshot.jpg

Obviously, the author isn't Picasa, so I'm wondering if we should have some sort of blacklist in CommonsMetadata to ignore the author field if it matches things that we know not to be authors. It's probably better to show nothing than to show something we know to be untrue.

(This is similar to bug 58195 except in this case there isn't any other authorship information in the wikitext.)
Comment 1 Guillaume Paumier 2014-10-06 20:16:50 UTC
Other examples: 

https://www.mediawiki.org/wiki/File%3ADisappearing_Username-1.jpg "Picasa 2.7" as author

https://www.mediawiki.org/wiki/File%3AExtreme-testing-language-engineering.svg "Created with Raphaël 2.1.0" as title

https://www.mediawiki.org/wiki/File%3AFor_talk_simple_security.JPG "Picasa 2.6" as author

https://www.mediawiki.org/wiki/File%3AMediaWiki_Homepage_Proposal.svg
Short title	Untitled
Image title	Generated with SwordSoft Layout

https://www.mediawiki.org/wiki/File%3ARegular_expression_complexity_exploit.svg
Short title	Qt Svg Document
Image title	Generated with Qt
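
The blacklist idea from the description could look roughly like the Python sketch below. It is only an illustration under stated assumptions: CommonsMetadata itself is a PHP extension, and the names `SOFTWARE_PATTERNS` and `clean_field` are hypothetical and do not exist in it. The patterns are taken from the examples listed in this report and are certainly not exhaustive.

```python
import re

# Hypothetical denylist built from the examples in this bug report.
# These names are illustrative only; they are not part of CommonsMetadata.
SOFTWARE_PATTERNS = [
    # "Picasa", "Picasa 2.6", "Picasa 2.7"
    re.compile(r"picasa(\s+\d+(\.\d+)*)?", re.IGNORECASE),
    # "Created with Raphaël 2.1.0", "Generated with Qt", ...
    re.compile(r"(created|generated)\s+with\s+\S.*", re.IGNORECASE),
    # Placeholder titles written by editors and toolkits
    re.compile(r"untitled", re.IGNORECASE),
    re.compile(r"qt svg document", re.IGNORECASE),
]


def clean_field(value):
    """Return the stripped value, or None if it is empty or looks
    like a string written by software rather than by a person."""
    if value is None:
        return None
    text = value.strip()
    if not text:
        return None
    for pattern in SOFTWARE_PATTERNS:
        # fullmatch: the whole field must match, so a real name that
        # merely contains a listed word is not discarded.
        if pattern.fullmatch(text):
            return None
    return text
```

With these patterns, `clean_field("Picasa 2.7")` and `clean_field("Generated with Qt")` both return `None`, while a value such as `"Guillaume Paumier"` is passed through unchanged. Matching the whole field rather than a substring is a deliberate choice here: it keeps the filter conservative, at the cost of missing variants that are not listed.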
Comment 2 Guillaume Paumier 2014-10-29 22:19:53 UTC
And more:

https://wikimediafoundation.org/wiki/File%3AGilt_silver_jar_with_pattern_of_dancing_horses.jpg has "OLYMPUS DIGITAL CAMERA" as image title
Comment 3 Tisza Gergő 2014-10-29 23:48:17 UTC
I wonder if EXIF shouldn't be ignored completely. Mostly it seems to be autogenerated and less than helpful. E.g. some cameras apparently put something like IMG1234 into the title.
Comment 4 Guillaume Paumier 2014-11-11 00:18:19 UTC
I personally don't have enough data to decide if it makes sense to ignore it completely, but I trust your judgment on that. It does seem like we have a lot of false positives.
Comment 5 Guillaume Paumier 2014-11-20 23:42:32 UTC
After encountering more and more of these, like https://de.wikibooks.org/wiki/Datei%3ABenutzerMKabel.jpg, I'm starting to agree with you.

I still believe a handful of users (including me) curate their EXIF metadata, for example by adding information using digital collection management software, and we should support that, but this is already done (or should be) at the time of upload by extracting that data and prefilling the fields. Using them afterwards through CommonsMetadata seems to be more trouble than it's worth.

Adjusting the title of this request accordingly.
