Last modified: 2013-04-23 17:50:52 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T38085, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 36085 - temp files left around from some uploads (maybe uploads of new versions only?)
Status: RESOLVED WORKSFORME
Product: MediaWiki
Classification: Unclassified
Component: Uploading
Version: unspecified
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2012-04-19 07:03 UTC by Ariel T. Glenn
Modified: 2013-04-23 17:50 UTC (History)
CC List: 4 users

Description Ariel T. Glenn 2012-04-19 07:03:13 UTC
I happened to poke around on ms7 (the server with originals of uploaded media) and noticed that wikipedia/commons/temp holds about 800GB of cruft accumulated from 2008 (!) until now. Filenames all look like YYYYMMDDHHMMSS!YYYYMMDDHHMMSS!phpXXXXXX.png or YYYYMMDDHHMMSS!phpXXXXXX.jpg. You find them in the temp directories underneath every project image dir, but commons is the worst case.

Sure, I guess we could do a find on 800-something project dirs or on (ugh) the whole 18T filesystem for temp/* but I dunno how feasible that's going to be once media gets moved into Swift.  So maybe there could be some sort of MW cleanup job to take care of this periodically.  Thoughts?
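
For anyone wanting to size the problem before deciding on a cleanup job, here is a rough Python sketch of the "find on temp/*" idea. The regex is only a guess built from the two filename shapes described above, and `parse_temp_name` and `survey_temp_dirs` are hypothetical helper names, not anything that exists in MediaWiki. It only counts files and bytes; it deletes nothing.

```python
import os
import re
from datetime import datetime

# Assumed pattern, inferred from the two filename shapes in the report:
# one or two 14-digit timestamps, each followed by "!", then "php" plus
# six alphanumeric characters and a file extension.
TEMP_NAME = re.compile(r"^((?:\d{14}!){1,2})php[A-Za-z0-9]{6}\.(\w+)$")


def parse_temp_name(name):
    """Return the timestamps embedded in a temp filename, or None if no match."""
    match = TEMP_NAME.match(name)
    if match is None:
        return None
    stamps = match.group(1).rstrip("!").split("!")
    try:
        return [datetime.strptime(s, "%Y%m%d%H%M%S") for s in stamps]
    except ValueError:
        # Fourteen digits that do not form a valid date and time.
        return None


def survey_temp_dirs(root):
    """Walk root; count and total the size of matching files under 'temp' dirs."""
    count, total_bytes = 0, 0
    for dirpath, _dirnames, filenames in os.walk(root):
        # Only look at directories that have a 'temp' path component.
        if "temp" not in os.path.relpath(dirpath, root).split(os.sep):
            continue
        for name in filenames:
            if parse_temp_name(name) is not None:
                count += 1
                total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return count, total_bytes
```

Running `survey_temp_dirs` against a single project dir first would give a feel for how long a walk of the whole filesystem might take.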
Comment 1 Aaron Schulz 2012-04-24 00:47:28 UTC
See also bug 26063.
Comment 2 Aaron Schulz 2012-04-24 01:09:27 UTC
There is a /maintenance script called cleanupUploadStash.php. Is it not on a cron? Surely it can nuke month-old (or at least year-old) files floating around?
Comment 3 Ariel T. Glenn 2012-04-24 06:34:02 UTC
It is not in any cron job in our puppet repo, that's for sure.  We would need to run this across all projects, right? 

It seems like the script tosses everything older than $wgUploadStashMaxAge hours, not taking a parameter; that value is currently 6. If that's fine by you then yes we can set something up once we're in Swift.  Cleanup now would just mean that the files would be saved in a snapshot so we'd wind up using slightly *more* room on the filesystem rather than getting any back.
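
To make the age cutoff concrete, here is a minimal Python sketch of selecting files older than a threshold, using the 6-hour figure mentioned above as the example value. This is not how cleanupUploadStash.php works internally; `stale_files` and `purge` are hypothetical names, file modification time is assumed to be a good enough proxy for upload time, and the default is a dry run that removes nothing.

```python
import os
import time


def stale_files(directory, max_age_hours, now=None):
    """Return sorted paths of regular files in directory older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    stale = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip subdirectories and symlinks; compare modification time only.
            if not entry.is_file(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                stale.append(entry.path)
    return sorted(stale)


def purge(paths, dry_run=True):
    """Remove the given files unless dry_run is set; return the paths handled."""
    handled = []
    for path in paths:
        if not dry_run:
            os.remove(path)
        handled.append(path)
    return handled
```

A call such as `purge(stale_files("temp", 6))` only lists what would go; passing `dry_run=False` actually deletes.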
Comment 4 Aaron Schulz 2012-04-24 15:27:56 UTC
The upload stash items are only a fraction of the temp files, though: only 25G are tracked in the commons uploadstash table.
Comment 5 Andre Klapper 2013-03-24 20:10:23 UTC
(In reply to comment #3 by Ariel on 2012-04-24)
> It is not in any cron job in our puppet repo, that's for sure.

For operations/puppet:
$:andre\> grep -r cleanupUploadStash .
./manifests/misc/maintenance.pp:	command => "/usr/local/bin/foreachwiki maintenance/cleanupUploadStash.php > /dev/null",

So is more work needed here? If so, what?
Comment 6 Ariel T. Glenn 2013-03-25 11:18:50 UTC
Heh, this is from when ms7 was still serving media. It would be nice to know if cleanup is done for media in Swift. Aaron would know the current state of upload stash vs Swift; I think there was a general cleanup before the import of media into Ceph. Perhaps give him a ping?
Comment 7 Aaron Schulz 2013-04-23 16:35:03 UTC
I don't see anything left to do here.
Comment 8 Andre Klapper 2013-04-23 17:50:52 UTC
Thanks Aaron! Closing as WORKSFORME then.
