Last modified: 2014-05-23 20:28:08 UTC
Problem encountered by Ally on the Beta Cluster and reported on the glamtools mailing list. Quoting her email:

----
When I’ve uploaded larger files (2500px) to the Beta cluster, the image re-sizes to the smaller, 500px version whenever I try to view or download the image at full resolution, despite all indications being that the current version should be the larger format. I’m not sure whether this is an issue with the Beta cluster or with the Toolset, or if it only has to do with uploading subsequent versions of files.

Example link:
http://commons.wikimedia.beta.wmflabs.org/wiki/File:Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg
----

Full thumbnailing is possible though:
http://upload.beta.wmflabs.org/wikipedia/commons/thumb/c/ca/Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg/2000px-Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg
*** Bug 65684 has been marked as a duplicate of this bug. ***
It's a stale entry in the varnish cache.

http://upload.beta.wmflabs.org/wikipedia/commons/c/ca/Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg?break-the-cache
gives you the real version.

Unclear if this is an issue with gwtoolset not sending cache purges, varnish being set up wrong on beta labs (has been the case before), intermittent htcp packet drop, etc.
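For anyone who wants to repeat the check on other files, here is a minimal sketch of the cache-busting trick used above. It assumes the cache keys on the full URL including the query string, which is what the ?break-the-cache result suggests. The helper name cache_busted and the default token are made up for illustration.

```python
import time
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted(url: str, token: Optional[str] = None) -> str:
    """Return url with a throwaway query parameter appended.

    Assumption: the cache treats a URL with a different query string as a
    different object, so the request goes through to the backend.
    """
    parts = urlsplit(url)
    # Keep any query parameters that are already present.
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Hypothetical parameter name, borrowed from the URL in this comment.
    query.append(("break-the-cache", token or str(int(time.time()))))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(cache_busted(
        "http://upload.beta.wmflabs.org/wikipedia/commons/c/ca/"
        "Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg"
    ))
```

If the cache-busted URL returns the new file while the plain URL returns the old one, the cached copy is stale.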
After some testing:
*Does not appear to be a gwtoolset problem, but a beta cluster configuration problem. This issue will happen on the live site.
*text varnishes (deployment-cache-text.*) seem to receive htcp purges fine
*upload varnishes (deployment-cache-upload02) do not seem to receive any purges.

I tested thumbnails as well as the original asset.
> *Does not appear to be a gwtoolset problem, but a beta cluster configuration
> problem. This issue will happen on the live site.

If I get you correctly, you mean it will *not* happen on the live site, right? :)
The purge destination is configured in operations/mediawiki-config.git in wmf-config/squid-labs.php:

$wgHTCPRouting = array(
    '|^https?://upload\.beta\.wmflabs\.org|' => array(
        'host' => '10.68.17.51', # deployment-cache-upload02
        'port' => 4827,
    ),
...

I tried a manual purge with:
http://upload.beta.wmflabs.org/wikipedia/commons/thumb/c/ca/Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg/800px-Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg

I.e.:

mwdeploy@deployment-bastion:~$ mwscript purgeList.php http://upload.beta.wmflabs.org/wikipedia/commons/thumb/c/ca/Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg/800px-Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg
Purging 1 urls
Done!
mwdeploy@deployment-bastion:~$

On varnish side:

deployment-cache-upload02.eqiad.wmflabs 25 2014-05-23T17:56:19 0.000068188 127.0.0.1 -/204 0 PURGE http://upload.beta.wmflabs.org/wikipedia/commons/thumb/c/ca/Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg/800px-Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg - - - - vhtcpd
$

Maybe gwtoolset hit some code which ends up not sending a purge request? We might be able to find out from the debug log if purge requests are logged there.
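To sanity-check that the thumbnail URL really falls under the routing rule quoted above, here is a rough sketch of the matching logic. It assumes, as far as I understand the MediaWiki documentation, that rules are tried in order, the first matching regex wins, and an empty key matches everything. Only the one rule visible in this comment is reproduced; whatever sits behind the "..." is not. ROUTING and route_for are made-up names, and the PHP regex delimiters (|...|) are dropped because Python does not use them.

```python
import re
from typing import Dict, Optional

# Only the rule quoted from wmf-config/squid-labs.php; the elided rules
# ("...") are deliberately not guessed at.
ROUTING: Dict[str, Dict[str, object]] = {
    r"^https?://upload\.beta\.wmflabs\.org": {
        "host": "10.68.17.51",  # deployment-cache-upload02
        "port": 4827,
    },
}


def route_for(url: str, routing: Dict[str, Dict[str, object]] = ROUTING) -> Optional[Dict[str, object]]:
    """Return the destination of the first rule whose regex matches url.

    Assumption: first match wins and an empty pattern matches any URL.
    Returns None when no rule matches, i.e. no purge would be sent.
    """
    for pattern, destination in routing.items():
        if pattern == "" or re.search(pattern, url):
            return destination
    return None


if __name__ == "__main__":
    thumb = (
        "http://upload.beta.wmflabs.org/wikipedia/commons/thumb/c/ca/"
        "Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg/"
        "800px-Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg"
    )
    print(route_for(thumb))
```

Under those assumptions the thumbnail URL routes to 10.68.17.51:4827, which is consistent with the PURGE line showing up in the varnish log on deployment-cache-upload02.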
[mid-air collision]

Hmm, vhtcpd seems to at least be getting some packets:
http://ganglia.wmflabs.org/latest/graph_all_periods.php?c=deployment-prep&h=deployment-cache-upload02&r=hour&z=default&jr=&js=&st=1400866550&event=hide&ts=0&v=7816&m=vhtcpd_inpkts_sane&vl=pkts&ti=Sane%20packets&z=large

Maybe varnish is rejecting the purge request or something. I couldn't find where the config files for varnish on beta are.

(In reply to Jean-Fred from comment #4)
> > *Does not appear to be a gwtoolset problem, but a beta cluster configuration
> > problem. This issue will happen on the live site.
>
> If I get you correctly, you mean it will *not* happen on the live site,
> right? :)

I think it's extremely likely to be a configuration issue on the beta cluster that would *not* affect the live sites. From what I understand, the configuration of the upload varnishes on beta differs from the real ones. My only basis for this claim, though, is previous experience with caching issues specific to the beta cluster, and that the beta config seems rather different from the main config for the upload cache.
> On varnish side:
>
> deployment-cache-upload02.eqiad.wmflabs 25 2014-05-23T17:56:19 0.000068188
> 127.0.0.1 -/204 0 PURGE
> http://upload.beta.wmflabs.org/wikipedia/commons/thumb/c/ca/
> Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg/800px-
> Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg - - - - vhtcpd
> $
>

bawolff@Bawolff-L:~$ wget -S 'http://upload.beta.wmflabs.org/wikipedia/commons/thumb/c/ca/Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg/800px-Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg'
--2014-05-23 15:12:42-- http://upload.beta.wmflabs.org/wikipedia/commons/thumb/c/ca/Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg/800px-Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg
Resolving upload.beta.wmflabs.org (upload.beta.wmflabs.org)... 208.80.155.136
Connecting to upload.beta.wmflabs.org (upload.beta.wmflabs.org)|208.80.155.136|:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Server: nginx/1.1.19
  Content-Type: image/jpeg
  X-Powered-By: PHP/5.3.10-1ubuntu3.10+wmf1
  X-Wikimedia-Thumb: http://10.68.16.16/w/thumb.php?f=Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg&width=800
  X-Varnish: 435262255 435256949, 1820643391 1820643388
  Via: 1.1 varnish, 1.1 varnish
  Content-Length: 114849
  Accept-Ranges: bytes
  Date: Fri, 23 May 2014 18:12:06 GMT
  Age: 5016
  Connection: keep-alive
  X-Cache: deployment-cache-upload02 hit (12), deployment-cache-upload02 frontend hit (1)
  Access-Control-Allow-Origin: *
  Access-Control-Expose-Headers: Age, Content-Length, Date, X-Cache, X-Varnish
Length: 114849 (112K) [image/jpeg]
Saving to: `800px-Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg'

100%[======================================>] 114,849 541K/s in 0.2s

2014-05-23 15:12:42 (541 KB/s) - `800px-Imaginative_depiction_of_the_completed_Forth_Rail_Bridge.jpg' saved [114849/114849]

In particular note the Age header (Age: 5016).
Unless you did that purge 83 minutes ago, it didn't work. I suspect there's some ACL in the varnish config discarding the purge (that's a total guess).

> Maybe gwtoolset hit some code which ends up not sending purge request? We
> might be able to find out the debug log if purge requests are logged there.

Not just gwtoolset. Normal MediaWiki purging wasn't working either. Purge requests are sent to the debug log via wfDebugLog with the 'squid' log name.
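The Age argument can be spelled out as a small calculation. This is only a sketch. It assumes the timestamp in the varnish log line (2014-05-23T17:56:19) is UTC, and that Date minus Age is a good enough estimate of when the object entered the cache. The function names cached_at and survived_purge are made up.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def cached_at(date_header: str, age_header: str) -> datetime:
    """Estimate when the object entered the cache: Date minus Age seconds."""
    return parsedate_to_datetime(date_header) - timedelta(seconds=int(age_header))


def survived_purge(date_header: str, age_header: str, purge_time: datetime) -> bool:
    """True if the object was cached before the purge yet is still served."""
    return cached_at(date_header, age_header) < purge_time


if __name__ == "__main__":
    date_header = "Fri, 23 May 2014 18:12:06 GMT"  # from the wget -S output
    age_header = "5016"                            # from the wget -S output
    # Assumption: the PURGE log timestamp is in UTC.
    purge_time = datetime(2014, 5, 23, 17, 56, 19, tzinfo=timezone.utc)

    print(cached_at(date_header, age_header).isoformat())
    print(survived_purge(date_header, age_header, purge_time))
```

With the headers above, the object dates from about 16:48:30 UTC, more than an hour before the logged PURGE at 17:56:19, so the purge did not evict it.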
Purging the upload.beta.wmflabs.org URLs definitely sends PURGE requests to varnish. I double-checked it via varnishncsa on both the frontend and backend cache. For some reason, the objects don't seem to be purged, since I still get hits:

X-Cache: deployment-cache-upload02 hit (16), deployment-cache-upload02 frontend hit (3)

I have absolutely no clue. Will need to poke ops about it I guess.
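When re-testing after each purge attempt it helps to pull the hit counters out of X-Cache automatically. The sketch below guesses the header format purely from the two samples in this thread ("<host> [frontend] hit (<n>)", comma separated). The "miss" form and the function name parse_x_cache are assumptions.

```python
import re
from typing import Dict, List, Optional, Union

# Pattern inferred from the X-Cache samples in this thread only.
_ENTRY = re.compile(
    r"(?P<host>\S+)\s+(?P<frontend>frontend\s+)?(?P<status>hit|miss)"
    r"(?:\s+\((?P<count>\d+)\))?"
)

Entry = Dict[str, Union[str, bool, Optional[int]]]


def parse_x_cache(value: str) -> List[Entry]:
    """Split an X-Cache header into one dict per cache layer."""
    entries: List[Entry] = []
    for part in value.split(","):
        match = _ENTRY.fullmatch(part.strip())
        if match is None:
            continue  # Skip anything that does not look like the samples.
        count = match.group("count")
        entries.append({
            "host": match.group("host"),
            "frontend": match.group("frontend") is not None,
            "status": match.group("status"),
            "count": int(count) if count is not None else None,
        })
    return entries


if __name__ == "__main__":
    header = ("deployment-cache-upload02 hit (16), "
              "deployment-cache-upload02 frontend hit (3)")
    for entry in parse_x_cache(header):
        print(entry)
```

A counter that keeps climbing across purges (12 and 1 earlier, 16 and 3 here) is another sign that the same cached object is still being served by both layers.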
Changing title to reflect that the HTCP part of purging is working fine; it's what varnish does with the purges that is apparently the problem.