Last modified: 2014-05-16 17:12:27 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T67338, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 65338 - action=upload overwrites files despite ignorewarnings is not set
action=upload overwrites files despite ignorewarnings is not set
Status: RESOLVED WORKSFORME
Product: MediaWiki
Classification: Unclassified
API (Other open bugs)
unspecified
All All
: Unprioritized normal (vote)
: ---
Assigned To: Nobody - You can work on this!
:
Depends on: 65405
Blocks:
  Show dependency treegraph
 
Reported: 2014-05-15 09:34 UTC by Rainer Rillke @commons.wikimedia
Modified: 2014-05-16 17:12 UTC (History)
5 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Rainer Rillke @commons.wikimedia 2014-05-15 09:34:19 UTC
Original bug title:
action=upload overwrites files despite ignorewarnings is not set

Follow-up from Bug 64883

Description:
UploadWizard does never set ignorewarnings when publishing files through API. And the API upload module is supposed to give a warning if a file exists. Thus, Upload Wizard isn't able [in theory] to overwrite files. But exactly this happened because the API returned "success" instead of a warning.

There are plenty of examples in Bug 64883
Comment 1 Brad Jorsch 2014-05-15 15:50:13 UTC
Just tried it locally. Attempting to upload File:Bug65338.png via API to my local wiki without specifying ignorewarnings correctly gives a warning that the file already exists and did not process the upload with a "Upload warnings: exists" error. So I'm closing this as WORKSFORME.

It may be that the situation you're complaining about is only triggered by certain filenames or the like, but in that case the bug is likely in the file upload code rather than in the API. You might try to see if the same bug occurs via Special:Upload to confirm that. Or it may be that UploadWizard actually is specifying ignorewarnings (there is certainly code to set that parameter it in the extension).

Also, bug 64883 is long and confused with many different issues being discussed. I'm not able to work out which things there might be relevant to this bug. In the future it would be helpful to copy the relevant information into the new bug, and also if possible to include a log of the exact API queries that can be used to reproduce the problem.
Comment 2 Rainer Rillke @commons.wikimedia 2014-05-15 16:26:35 UTC
> It may be that the situation you're complaining about is only triggered by 
> certain filenames or the like

Yes, I guess it has something to do with WMF setup *but* I cannot say for sure, hence this bug. What I can say for sure is that igorewarnings wasn't set and the file at https://commons.wikimedia.org/wiki/File:Treppe_2222_test_upload.jpg got through anyway.

> UploadWizard actually is specifying ignorewarnings
In the following cases only:
1) Uploading to stash: Upload through iframe
2) Uploading to stash: FormData upload
3) Publishing: ['was-deleted'] (the user is uploading a file under a title that previously existed but was deleted)
4) Publishing: ['duplicate-archive'] (the a file with the same check sum was existing before on that wiki but is now deleted)

None of these 3 possibilities was the case at [[File:Treppe_2222_test_upload.jpg]]. I was also unable to reproduce this locally but it is an issue at Commons.
Comment 3 Brad Jorsch 2014-05-15 16:51:57 UTC
It would be helpful if you'd post the exact data that UploadWizard is posting (although you can redact the content of the file if it's large, and of course redact any tokens). I just tried uploading via API on Commons with https://commons.wikimedia.org/wiki/File:Anomie_test_for_bug_65338.png and got the correct error response when I tried to reupload a different version.

For good measure, I tried both direct upload and upload first to stash. Both worked correctly.

FYI, my upload data was:

 {
        'maxlag' => '5',
        'file' => [ '/tmp/red.png', 'Anomie test for bug 65338.png' ],
        'filename' => 'Anomie test for bug 65338.png',
        'token' => [REDACTED],
        'comment' => 'This file is a test for [[bugzilla:65338]]. Anomie will flag it for deletion when done. Sorry for the inconvenience.',
        'watchlist' => 'nochange',
        'format' => 'json',
        'action' => 'upload'
 }
Comment 4 Rainer Rillke @commons.wikimedia 2014-05-15 17:21:35 UTC
UploadWizard does not set the maxlag param.


Requests:
-----
action	upload
comment	User created page with UploadWizard
filekey	12anztrx74rw.voupfh.1178694.jpg
filename	Koellner_dom_test.jpg
format	json
text	[...]
token	[REDACTED]
-----
action	upload
comment	User created page with UploadWizard
filekey	12anztspxrzw.nz2a22.1178694.jpg
filename	Koellner_dom_test.jpg
format	json
text	[...]
token	[REDACTED]
-----

Results: http://pastebin.de/125362
Comment 5 Rainer Rillke @commons.wikimedia 2014-05-15 17:23:10 UTC
Convenience link: https://commons.wikimedia.org/wiki/File:Koellner_dom_test.jpg

UploadWizard is always using the stash.
Comment 6 Brad Jorsch 2014-05-15 17:39:50 UTC
What about the API calls that do the stash and return the filekey? Not that I'm expecting to find anything there, but for just in case.
Comment 7 Rainer Rillke @commons.wikimedia 2014-05-15 21:30:18 UTC
(In reply to Brad Jorsch from comment #6)
> What about the API calls that do the stash and return the filekey?

Request:
http://pastebin.de/125379
(with ignorewarnings because warnings should not prevent upload here)

Response:
http://pastebin.de/125380
Comment 8 Brad Jorsch 2014-05-16 14:43:34 UTC
Hmm. When I try to use UploadWizard to reproduce this issue, it refuses to even let me attempt the upload due to the duplicate filename. Are you doing something strange like submitting multiple uploads to the same name at the same time?
Comment 9 Rainer Rillke @commons.wikimedia 2014-05-16 15:05:01 UTC
(In reply to Brad Jorsch from comment #8)
> something strange like submitting multiple uploads to the same name at the same 
> time

I am using Upload Wizard. The files I choose to upload are named like that:
** Treppe_2222 test upload.jpg
** Treppe 2222 test_upload.jpg
** Treppe 2222_test_upload.jpg

I upload these three files at once and Upload Wizard is "publishing" [moving from stash to public image storage] them "at the same time" [sending 3 requests at once]. I don't have the time to check but you possibly have to set "Maximum number of concurrent uploads" at https://commons.wikimedia.org/wiki/Special:Preferences#mw-prefsection-uploads to three.
Comment 10 Rainer Rillke @commons.wikimedia 2014-05-16 15:06:38 UTC
The UI result of doing so or similar is at https://bug-attachment.wikimedia.org/attachment.cgi?id=15297
Comment 11 Brad Jorsch 2014-05-16 15:26:27 UTC
Ok, I think we've gotten to the bottom of this now.

There's a race condition in the uploading code: it first checks for warnings and then processes the upload, and what's happening here is that one of your parallel requests is sneaking in an upload between the "check for warnings" and "process the upload" steps.

If the "process the upload" backend code were to have flags to say "do not overwrite" and "overwrite only", we could use that from the API code to fix this race condition.
Comment 12 Rainer Rillke @commons.wikimedia 2014-05-16 16:34:48 UTC
(In reply to Brad Jorsch from comment #11)
Thanks for tracking this down, Brad.

> If the "process the upload" backend code were to have flags to say "do not 
> overwrite" and "overwrite only", we could use that from the API code to fix 
> this race condition.
A new bug demanding that? The new bug blocking this bug?
Or can the API just block other uploads with the same title while one is being processed? Is there shared memory between the servers that run the API code except the database?

Just as a side note: It appears to happen more or less frequently that people upload two files under the same title by accident. This is, due to the flow of UploadWizard, which encourages uploading directly from your camera's folder. Then the user is renaming the files and can easily miss that there's a file with the same name already in the list. I’ve submitted a patch for UplaodWizard but I would welcome if we could make the API code also safe for this condition.
Comment 13 Brad Jorsch 2014-05-16 16:57:17 UTC
(In reply to Rainer Rillke @commons.wikimedia from comment #12)
> A new bug demanding that? The new bug blocking this bug?

Go ahead.

> Or can the API just block other uploads with the same title while one is
> being processed? Is there shared memory between the servers that run the API
> code except the database?

There is shared memory, but using it from the API in this way would probably not be the greatest idea. It would be better for such a thing to live in the backend upload-handling code.

> Then the user is renaming the files and can easily miss
> that there's a file with the same name already in the list.

Yeah, UploadWizard should be checking its list for duplicates (post-normalization, of course).

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links