Last modified: 2012-01-11 18:33:12 UTC
"The middleware inserts the account name into the URL, converts the "wikipedia/commons" section into a Swift container name by replacing slash with %2F, adds "%2Fthumb" or "%2Farchived" or "%2Fdeleted" to the container name and adds the rest of the hashing and filename as the object name"

To be clear, there is the cloudfiles interface and there is the rewrite.py script in the WSGI stack. This is about the latter. "archived" should not be added to the container name (and actually isn't in the code). On the other hand, "temp" and "public" should be. Examples of how rewrites should happen (ignoring container sharding):
====
Original URL: upload.wikimedia.org/site/lang/a/ab/file.jpg
Swift URL: site-lang-images-public/a/ab/file.jpg

Original URL: upload.wikimedia.org/site/lang/thumb/a/ab/file.jpg/120px-file.jpg
Swift URL: site-lang-images-thumb/a/ab/file.jpg/120px-file.jpg

Original URL: upload.wikimedia.org/site/lang/thumb/archive/a/ab/file.jpg/120px-file.jpg
Swift URL: site-lang-images-thumb/archive/a/ab/file.jpg/120px-file.jpg

Original URL: upload.wikimedia.org/site/lang/temp/a/ab/file.jpg/120px-file.jpg
Swift URL: site-lang-images-temp/a/ab/file.jpg/120px-file.jpg
====
The above would be consistent with FileRepo/FileBackend.
*** CHANGE TO THE ABOVE ***
We are going to use "media-" instead of "images-". Given that, we want:
====
Original URL: upload.wikimedia.org/site/lang/a/ab/file.jpg
Swift URL: site-lang-media-public/a/ab/file.jpg

Original URL: upload.wikimedia.org/site/lang/thumb/a/ab/file.jpg/120px-file.jpg
Swift URL: site-lang-media-thumb/a/ab/file.jpg/120px-file.jpg

Original URL: upload.wikimedia.org/site/lang/thumb/archive/a/ab/file.jpg/120px-file.jpg
Swift URL: site-lang-media-thumb/archive/a/ab/file.jpg/120px-file.jpg

Original URL: upload.wikimedia.org/site/lang/temp/a/ab/file.jpg/120px-file.jpg
Swift URL: site-lang-media-temp/a/ab/file.jpg/120px-file.jpg
====
FYI: There should be no rewrite rules going to the site-lang-media-deleted container, and that container should also be inaccessible to the Swift user the proxy uses to authenticate when it rewrites URLs.
Another one we want:
====
Original URL: upload.wikimedia.org/site/lang/thumb/temp/a/ab/file.jpg/120px-file.jpg
Swift URL: site-lang-media-thumb/temp/a/ab/file.jpg/120px-file.jpg
====
'media' was replaced with the repo name, so we now have:
====
Original URL: upload.wikimedia.org/site/lang/a/ab/file.jpg
Swift URL: site-lang-local-public/a/ab/file.jpg

Original URL: upload.wikimedia.org/site/lang/thumb/a/ab/file.jpg/120px-file.jpg
Swift URL: site-lang-local-thumb/a/ab/file.jpg/120px-file.jpg

Original URL: upload.wikimedia.org/site/lang/thumb/archive/a/ab/file.jpg/120px-file.jpg
Swift URL: site-lang-local-thumb/archive/a/ab/file.jpg/120px-file.jpg

Original URL: upload.wikimedia.org/site/lang/thumb/temp/a/ab/file.jpg/120px-file.jpg
Swift URL: site-lang-local-thumb/temp/a/ab/file.jpg/120px-file.jpg

Original URL: upload.wikimedia.org/site/lang/temp/a/ab/file.jpg/120px-file.jpg
Swift URL: site-lang-local-temp/a/ab/file.jpg/120px-file.jpg
====
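For reference, the unsharded mapping in the examples above could be sketched roughly like this. This is not the actual rewrite.py code; `rewrite_path` and its `repo` parameter are hypothetical names made up for illustration. It also assumes that any path not starting with "thumb" or "temp" (including "archive/...") lands in the "public" container, which the examples imply but do not state.

```python
def rewrite_path(path, repo="local"):
    """Map an upload.wikimedia.org path to a (container, object) pair.

    Hypothetical sketch only; ignores container sharding.
    """
    parts = path.strip("/").split("/")
    if len(parts) < 3:
        raise ValueError("expected /site/lang/<object path>, got %r" % path)
    site, lang, rest = parts[0], parts[1], parts[2:]

    # Only a leading "thumb" or "temp" selects a zone; everything after it
    # (including a nested "archive" or "temp") stays in the object name.
    if rest[0] in ("thumb", "temp"):
        zone, rest = rest[0], rest[1:]
    else:
        # Assumption: everything else belongs to the public zone.
        zone = "public"
    if not rest:
        raise ValueError("no object name in %r" % path)

    container = "-".join([site, lang, repo, zone])
    return container, "/".join(rest)
```

With this sketch, `rewrite_path("/site/lang/thumb/temp/a/ab/file.jpg/120px-file.jpg")` gives the container "site-lang-local-thumb" and the object "temp/a/ab/file.jpg/120px-file.jpg". Nothing here ever produces a "-deleted" container name, in line with the FYI above.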
If we shard a container, do the hashes still wind up in the filename? Consider:

Original URL: upload.wikimedia.org/site/lang/a/ab/file.jpg
Swift URL (a): site-lang-media-public-ab/a/ab/file.jpg
Swift URL (b): site-lang-media-public-ab/file.jpg

Original URL: upload.wikimedia.org/site/lang/thumb/archive/a/ab/file.jpg/120px-file.jpg
Swift URL (a): site-lang-media-thumb-ab/archive/a/ab/file.jpg/120px-file.jpg
Swift URL (b): site-lang-media-thumb-ab/archive/file.jpg/120px-file.jpg

I think (a) makes more sense. rewrite.py currently either hashes the container or drops the hash entirely - similar to (b), but removing a/ab/ even if the container is not hashed.
Oh, I wasn't accounting for hashing in the post above. I was just using the conceptual container names. Yes, it will use (a), that is, keeping the hash dir in the path. Though it would not be "site-lang-media-thumb-ab", it will be "site-lang-media-thumb.ab".
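A rough sketch of what option (a) with the "." separator could look like, in case it helps. This is not the rewrite.py code; `shard_container` is a hypothetical helper, and it assumes the hash directories are lowercase hex of the form x/xy (as in a/ab) and that the first such pair in the object name is the one to shard on.

```python
import re

# Matches a hash directory pair such as "a/ab/" at the start of the object
# name or right after a slash; the second segment must begin with the first.
HASH_DIR = re.compile(r"(?:^|/)([0-9a-f])/(\1[0-9a-f])/")


def shard_container(container, obj):
    """Append the two-character shard to the container name (option a).

    The object name is returned unchanged, so the hash dirs stay in the path.
    """
    match = HASH_DIR.search(obj)
    if match is None:
        # No hash dirs found: leave the container unsharded.
        return container, obj
    return "%s.%s" % (container, match.group(2)), obj
```

For example, `shard_container("site-lang-media-thumb", "archive/a/ab/file.jpg/120px-file.jpg")` gives the container "site-lang-media-thumb.ab" with the object name left as "archive/a/ab/file.jpg/120px-file.jpg".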
Handled in r108596.