Last modified: 2014-08-07 18:53:33 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T66831, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 64831 - GWT duplicates
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: GWToolset
Version: unspecified
Hardware: All All
Importance: High major
Target Milestone: ---
Assigned To: dan
Depends on:
Blocks:
Reported: 2014-05-04 16:07 UTC by Steinsplitter
Modified: 2014-08-07 18:53 UTC
CC List: 6 users

Attachments
test metadataset (2.31 KB, text/xml)
2014-05-11 22:57 UTC, dan

Description Steinsplitter 2014-05-04 16:07:01 UTC
GWT should prevent the upload of duplicates.
Comment 1 2014-05-04 16:34:00 UTC
What would be *great* would be if the GWT were to skip duplicates but complete the requested run, and then report back on SHA-1 duplicates, possibly supplying an XML exceptions file (of <records>) containing only the duplicates, preferably with the filename of each existing duplicate recorded in an extra field.

If the user then had the option of setting a flag to force the creation of duplicates at that point, using the XML exceptions file, they would at least be wholly responsible for their actions. They could add a "(duplicate check needed)" backlog category as appropriate, and should expect to deal with the duplicates themselves rather than leaving this to other random volunteers.
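
The behaviour proposed in this comment could be sketched roughly as follows. This is only an illustration in Python, not GWToolset code: the function names, the record fields (`title`, `content`), the `duplicate_of` element and the in-memory `existing` index are all hypothetical. It assumes duplicates are detected by comparing SHA-1 digests of the file contents.

```python
import hashlib
import xml.etree.ElementTree as ET


def sha1_hex(data: bytes) -> str:
    """Return the hex SHA-1 digest of a media file's bytes."""
    return hashlib.sha1(data).hexdigest()


def process_batch(records, existing, force=False):
    """Run a whole batch, skipping SHA-1 duplicates unless force is set.

    records  -- list of dicts with hypothetical keys 'title' and 'content'
    existing -- dict mapping SHA-1 hex digest -> filename already uploaded;
                it is updated as new files are accepted
    Returns (uploaded_titles, duplicate_reports).
    """
    uploaded, duplicates = [], []
    for rec in records:
        digest = sha1_hex(rec["content"])
        if digest in existing and not force:
            # Skip the duplicate but keep going so the run completes.
            duplicates.append({
                "title": rec["title"],
                "sha1": digest,
                "duplicate_of": existing[digest],
            })
            continue
        # Remember the first file seen with this digest.
        existing.setdefault(digest, rec["title"])
        uploaded.append(rec["title"])
    return uploaded, duplicates


def exceptions_xml(duplicates) -> str:
    """Serialise the skipped records as a <records> exceptions file."""
    root = ET.Element("records")
    for dup in duplicates:
        rec = ET.SubElement(root, "record")
        for key in ("title", "sha1", "duplicate_of"):
            ET.SubElement(rec, key).text = dup[key]
    return ET.tostring(root, encoding="unicode")
```

A second run over only the records in the exceptions file, with `force=True`, would correspond to the "force the creation of duplicates" flag described above.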
Comment 2 Gerrit Notification Bot 2014-05-11 00:37:47 UTC
Change 132751 had a related patch set uploaded by Siebrand:
Don’t allow upload of duplicate mediafiles

https://gerrit.wikimedia.org/r/132751
Comment 3 Gerrit Notification Bot 2014-05-11 01:00:30 UTC
Change 132751 had a related patch set uploaded by Siebrand:
Don’t allow upload of duplicate mediafiles

https://gerrit.wikimedia.org/r/132751
Comment 4 dan 2014-05-11 22:57:35 UTC
Created attachment 15350: test metadataset
Comment 5 dan 2014-05-11 22:58:00 UTC
steps to reproduce
==================
notice current item
-------------------
1. note how many versions of the mediafile exist for this item and whether
   or not they are identical:
   http://commons.wikimedia.beta.wmflabs.org/wiki/File:Een_vrouw_brengt_een_offer_aan_Priapus_()-Sc%C3%A8nes_uit_Vergilius_dichtbundel_Bucolica_(serietitel)-RP-P-1992-80-RM0001.COLLECT.70.jpeg

login
-----
1. http://commons.wikimedia.beta.wmflabs.org/wiki/Special:GWToolset
2. log in; you should then be at Step 1: Metadata detection

step 1
------
1. nothing to add
2. select Artwork
3. use the metadata mapping GWToolset:Metadata Mappings/Dan-nl/Rijksmuseum.json
4. nothing to add
5. choose the attached “test metadataset”
6. click Submit

step 2
------
1. check “Re-upload media from URL”
2. click the “Preview batch” button

step 3
------
click the “Process batch” button

note the item change
--------------------
1. there should be yet another copy of the same mediafile
   http://commons.wikimedia.beta.wmflabs.org/wiki/File:Een_vrouw_brengt_een_offer_aan_Priapus_()-Sc%C3%A8nes_uit_Vergilius_dichtbundel_Bucolica_(serietitel)-RP-P-1992-80-RM0001.COLLECT.70.jpeg
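
The before/after check in these steps can also be done without inspecting the file page by eye. The sketch below is an illustration only: it assumes the wiki exposes the standard MediaWiki API at `/w/api.php` (the path is an assumption) and uses the `imageinfo` query with `iiprop=timestamp|sha1` to list the versions of a file. The helper names are hypothetical, and the response is parsed in the default JSON layout, where `pages` is keyed by page id.

```python
from collections import Counter
from urllib.parse import urlencode

# Assumption: api.php lives under /w/ on the beta wiki named in the steps above.
API = "http://commons.wikimedia.beta.wmflabs.org/w/api.php"


def imageinfo_url(title: str, api: str = API) -> str:
    """Build a query URL listing all versions of a file with their SHA-1."""
    params = {
        "action": "query",
        "prop": "imageinfo",
        "titles": title,
        "iiprop": "timestamp|sha1",
        "iilimit": "max",
        "format": "json",
    }
    return api + "?" + urlencode(params)


def duplicate_versions(response: dict) -> dict:
    """Return {sha1: count} for every digest that occurs more than once
    in the file history contained in an imageinfo response."""
    counts = Counter()
    for page in response.get("query", {}).get("pages", {}).values():
        # Missing pages carry no 'imageinfo' key and are skipped.
        for version in page.get("imageinfo", []):
            counts[version["sha1"]] += 1
    return {sha1: n for sha1, n in counts.items() if n > 1}
```

Fetching the URL before and after the batch run and comparing the two results should show whether the run added another identical version of the file.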
Comment 6 dan 2014-06-13 09:49:46 UTC
steinsplitter, this has been deployed to production. are you okay with marking it as resolved fixed?
Comment 7 dan 2014-06-26 07:50:05 UTC
steinsplitter, a patch has been deployed to production that addresses this issue. are you okay with closing this bug now?
Comment 8 Steinsplitter 2014-08-07 18:53:33 UTC
Thank you
