Last modified: 2013-04-23 17:50:52 UTC
I happened to poke around on ms7 (the server with originals of uploaded media) and noticed that wikipedia/commons/temp has accumulated about 800GB of cruft from 2008 (!) until now. The filenames all look like YYYYMMDDHHMMSS!YYYYMMDDHHMMSS!phpXXXXXX.png or YYYYMMDDHHMMSS!phpXXXXXX.jpg. You find them in the temp directories underneath every project image dir, but commons is the worst case. Sure, I guess we could do a find for temp/* on the 800-something project dirs or on (ugh) the whole 18T filesystem, but I dunno how feasible that's going to be once media gets moved into Swift. So maybe there could be some sort of MW cleanup job to take care of this periodically. Thoughts?
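For anyone who wants to size up the problem on a given project dir, here is a rough sketch of the kind of scan described above. It is not an existing tool. The regex is only my reading of the filename shapes quoted above (one or two 14-digit timestamps, then "php" plus six characters), and the function names are made up for illustration. It only counts files and touches nothing.

```python
import os
import re

# Assumed shape of the leftover names: one or two 14-digit timestamps,
# each followed by "!", then "php" plus six characters and an extension.
TEMP_NAME = re.compile(r"^(?:\d{14}!){1,2}php[A-Za-z0-9]{6}\.[A-Za-z0-9]+$")


def is_temp_leftover(filename):
    """Return True if filename looks like one of the leftover temp uploads."""
    return TEMP_NAME.match(filename) is not None


def temp_usage(root):
    """Walk root and return (count, bytes) for matching files below any 'temp' dir."""
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        # Only consider directories that sit at or below a dir named "temp".
        if "temp" not in os.path.relpath(dirpath, root).split(os.sep):
            continue
        for name in filenames:
            if is_temp_leftover(name):
                count += 1
                total += os.path.getsize(os.path.join(dirpath, name))
    return count, total
```

Calling temp_usage() on a single project image dir gives a file count and byte total without deleting anything; walking the whole filesystem this way would be just as slow as the find mentioned above.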
See also bug 26063.
There is a /maintenance script called cleanupUploadStash.php. Is it not on a cron? Surely it can nuke month-old (or at least year-old) files floating around?
It is not in any cron job in our puppet repo, that's for sure. We would need to run this across all projects, right? It seems the script takes no age parameter and tosses everything older than $wgUploadStashMaxAge, which is currently 6 hours. If that's fine by you, then yes, we can set something up once we're in Swift. Cleaning up now would just mean that the files would be saved in a snapshot, so we'd wind up using slightly *more* room on the filesystem rather than getting any back.
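To make the age cutoff concrete, here is a minimal sketch of an "older than N hours" check. It is not how cleanupUploadStash.php works internally; it is a standalone illustration that goes by file modification time, and the names MAX_AGE_HOURS and expired_files are hypothetical. It only lists candidates and deletes nothing.

```python
import os
import time

MAX_AGE_HOURS = 6  # the age limit quoted above; hypothetical constant name


def expired_files(directory, max_age_hours=MAX_AGE_HOURS, now=None):
    """Return sorted paths of regular files in directory older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600  # hours converted to seconds
    expired = []
    for entry in os.scandir(directory):
        # Skip subdirectories and symlinks; compare mtime against the cutoff.
        if not entry.is_file(follow_symlinks=False):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            expired.append(entry.path)
    return sorted(expired)
```

Passing a larger max_age_hours (say a month's worth) would give the more conservative "nuke only old stuff" behaviour asked about earlier in the thread.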
The upload stash items are only a fraction of the temp files, though: only 25G are tracked in the commons uploadstash table.
(In reply to comment #3 by Ariel on 2012-04-24)
> It is not in any cron job in our puppet repo, that's for sure.

For operations/puppet:

$:andre\> grep -r cleanupUploadStash .
./manifests/misc/maintenance.pp: command => "/usr/local/bin/foreachwiki maintenance/cleanupUploadStash.php > /dev/null",

So is more work needed here? If so, what?
Heh, this is from when ms7 was still serving media. It would be nice to know whether cleanup is done for media in Swift. Aaron would know the current state of the upload stash vs. Swift; I think there was a general cleanup before the import of media into Ceph. Perhaps give him a ping?
I don't see anything left to do here.
Thanks Aaron! Closing as WORKSFORME then.