Last modified: 2013-07-31 14:36:07 UTC
I've enabled ULS on all deployment-prep projects, but neither ?action=purge nor anything else I can think of gets the ULS trigger to show up.
Tried on:
* http://commons.wikimedia.beta.wmflabs.org/wiki/Main_Page
* http://en.wikipedia.beta.wmflabs.org/wiki/Main_Page
Just setting priority (to High) as this is a pretty important one for ULS testing.
I do not really have any time right now to have a look at it. If someone wants to investigate:
- The text cache is a Squid on the deployment-squid instance.
- Its logs are in /var/log/squid (you need to be a project sysadmin to look at them).
- MediaWiki logs are written to an NFS server. They can be read by connecting to deployment-bastion and looking under /data/project/logs.
From there one could look at the debug log or use eval.php via:
mwscript eval.php --wiki=enwiki
The SquidPurgeClient class has a log() method which uses the 'squid' logging group. Might want to enable logging for that debug group on beta.
Related URL: https://gerrit.wikimedia.org/r/66073 (Gerrit Change I25e7e77c8b3d3e5dbf8ce4bc9f6bd8ca8aa22d1c)
This issue seems to be rapidly gaining attention after having had a high priority and no action for almost a month. WMF QA and LangEng are involved.
Tried again tonight:
$ mwscript purgeList.php --wiki=enwiki http://deployment.wikimedia.beta.wmflabs.org/wiki/Main_Page
Purging 1 urls
Done!
$
On the squid side, /var/log/squid/squid.log has:
2013/06/28 21:29:32| Parser: retval 1: from 0->31: method 0->4; url 6->20; version 22->30 (1/1)
2013/06/28 21:29:32| The request PURGE http://deployment.wikimedia.beta.wmflabs.org:80/wiki/Main_Page is ALLOWED, because it matched 'web'
2013/06/28 21:29:32| storeLocateVary: accept-encoding=, cookie=
2013/06/28 21:29:32| storeLocateVaryRead: MATCH! 7774E3A88F0B8BB19DE820258F110479 (null)
2013/06/28 21:29:32| clientCacheHit: Vary detected!
2013/06/28 21:29:32| clientProcessVary: HIT key=7774E3A88F0B8BB19DE820258F110479 etag=NONE
2013/06/28 21:29:32| The reply for PURGE http://deployment.wikimedia.beta.wmflabs.org/wiki/Main_Page is ALLOWED, because it matched 'all'
2013/06/28 21:29:32| storeLocateVaryCallback: DONE
So it is definitely receiving the PURGE request.
I did two curl requests on the Main Page:
1) curl -I
2) curl -I -H 'Accept-Encoding:gzip,deflate'
Then I did an edit and tried both curl requests again. The one without Accept-Encoding got properly purged; the compressed one did not. Will attach full headers.
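To illustrate the symptom above, here is a toy model of a cache that stores one object per (URL, Accept-Encoding) pair. It is only a sketch under the assumption that the PURGE removes just the variant matching its own (empty) Accept-Encoding header, which is what the "storeLocateVary: accept-encoding=" log line suggests. The VaryCache class and its methods are hypothetical and are not Squid's implementation.

```python
# Toy model of a Vary-aware cache; NOT Squid's actual implementation.
class VaryCache:
    def __init__(self):
        # Maps (url, accept_encoding) -> cached body.
        self.store = {}

    def put(self, url, accept_encoding, body):
        self.store[(url, accept_encoding)] = body

    def get(self, url, accept_encoding):
        return self.store.get((url, accept_encoding))

    def purge_variant(self, url, accept_encoding=""):
        """Drop only the variant matching the purge request's own headers."""
        return self.store.pop((url, accept_encoding), None) is not None

    def purge_all(self, url):
        """Drop every stored variant of the URL; return how many were removed."""
        keys = [key for key in self.store if key[0] == url]
        for key in keys:
            del self.store[key]
        return len(keys)


URL = "http://en.wikipedia.beta.wmflabs.org/wiki/Main_Page"
cache = VaryCache()
cache.put(URL, "", "old page")
cache.put(URL, "gzip,deflate", "old page (gzip)")

# The PURGE arrives without an Accept-Encoding header.
cache.purge_variant(URL)
plain_after = cache.get(URL, "")             # uncompressed variant is gone
gzip_after = cache.get(URL, "gzip,deflate")  # compressed variant is still cached
```

In this model only a purge that walks every variant, like purge_all, clears the compressed copy as well.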
Created attachment 12710
curl requests with and without Accept-Encoding: gzip, deflate
I have migrated the text cache from Squid to Varnish. Mark Bergsma did the puppet changes a while back; I just had to change the public IP for *.beta.wmflabs.org to point to the new instance, deployment-cache-text1.pmtpa.wmflabs. Purge is definitely not working there. There is an access list that only allows purges from 127.0.0.1, whereas in beta they are sent by the application servers.
I got rid of the old $wgSquidServers on beta (https://gerrit.wikimedia.org/r/71348) and replaced it with the HTCP multicast routing feature: purge requests are sent to a host chosen by regex rules matched against the URL. The change is https://gerrit.wikimedia.org/r/71345
Still have to send the PURGE requests to both text and mobile caches :( That is not supported by MediaWiki right now.
That needs $wgHTCPMulticastRouting to be able to send purges to several IP / groups. https://gerrit.wikimedia.org/r/#/c/71597/
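The routing idea discussed above (regex rules on the purged URL, with one rule fanning out to several caches) can be sketched as follows. This is an illustration, not MediaWiki's code: the ROUTING table, the purge_targets helper and the 192.0.2.x addresses are all hypothetical placeholders, and "first matching rule wins" is an assumption of the sketch.

```python
import re

# Hypothetical routing table: (regex on the purged URL, list of (host, port)).
# The addresses are documentation placeholders, not the real beta cache IPs.
TEXT_CACHE = ("192.0.2.10", 4827)
MOBILE_CACHE = ("192.0.2.20", 4827)
UPLOAD_CACHE = ("192.0.2.30", 4827)

ROUTING = [
    (r"^https?://upload\.", [UPLOAD_CACHE]),
    # Catch-all rule: every other URL is purged on both text and mobile caches.
    (r"", [TEXT_CACHE, MOBILE_CACHE]),
]


def purge_targets(url, routing=ROUTING):
    """Return the targets of the first rule whose regex matches the URL."""
    for pattern, targets in routing:
        if re.search(pattern, url):
            return list(targets)
    return []


targets = purge_targets("http://en.wikipedia.beta.wmflabs.org/wiki/Main_Page")
```

Allowing a list of targets per rule is the part that a single host per regex cannot express.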
I found out last week that the ResourceLoader URL (load.php) was pointing to the text cache ( en.wikipedia.beta.wmflabs.org/w/load.php ) instead of bits ( bits.beta.wmflabs.org/en.wikipedia.beta.wmflabs.org/w/load.php ). When the ResourceLoader cache is invalidated, no purge is sent to the text cache, so an old JavaScript version was being delivered. That most probably caused the issue reported here. A few minutes ago I deployed a change on beta that points load.php to bits.beta.wmflabs.org : https://gerrit.wikimedia.org/r/#/c/70322/ . I guess that solves it. So at least we have proper cache invalidation for bits material :-]
This is a never-ending mess.
deployment-cache-text1 has a vhtcpd purge daemon running with the following options:
cat /etc/default/vhtcpd
DAEMON_OPTS="-F -m 239.128.0.112 -c 127.0.0.1:80 -c 127.0.0.1:3128"
The -m is a multicast address to subscribe to. Since beta does not have multicast, we need vhtcpd to handle requests sent over unicast. The daemon does listen on UDP:
# netstat -ulnp|grep vhtcpd
udp 0 0 0.0.0.0:4827 0.0.0.0:* 21339/vhtcpd
Filing another bug.
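For anyone who wants to poke the daemon by hand, here is a rough sketch of building an HTCP CLR packet and sending it over unicast UDP to port 4827. The field layout follows what I recall of MediaWiki's HTCP purge code and should be treated as an assumption to check against RFC 2756. The build_htcp_clr and send_purge helpers are hypothetical names, and this is not a tested client for vhtcpd.

```python
import random
import socket
import struct

HTCP_OP_CLR = 4  # CLR opcode as defined in RFC 2756


def build_htcp_clr(url, trans_id=None):
    """Build an unauthenticated HTCP CLR packet for one URL (assumed layout)."""
    if trans_id is None:
        trans_id = random.getrandbits(32)
    raw_url = url.encode("ascii")
    method, version = b"HEAD", b"HTTP/1.0"
    # Specifier: length-prefixed method, URL and version, then empty headers.
    specifier = (
        struct.pack("!H", len(method)) + method
        + struct.pack("!H", len(raw_url)) + raw_url
        + struct.pack("!H", len(version)) + version
        + struct.pack("!H", 0)
    )
    data_len = 8 + 2 + len(specifier)  # data header + CLR reason + specifier
    total_len = 4 + data_len + 2       # packet header + data + auth section
    header = struct.pack("!HxxHBxIxx", total_len, data_len, HTCP_OP_CLR, trans_id)
    # An auth section of length 2 means "no authentication".
    return header + specifier + struct.pack("!H", 2)


def send_purge(url, host, port=4827):
    """Send one CLR packet to a single host over unicast UDP."""
    packet = build_htcp_clr(url)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))
    return len(packet)
```

Calling send_purge with the Main_Page URL and the cache instance's address would then show whether the daemon reacts to unicast packets at all.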
vhtcpd was not an issue (bug 51874).
(In reply to comment #12)
> That needs $wgHTCPMulticastRouting to be able to send purges to several IP /
> groups. https://gerrit.wikimedia.org/r/#/c/71597/
Merged. The configuration https://gerrit.wikimedia.org/r/#/c/76918/ does send purge requests to both text and mobile caches. Seems to fix the remaining issue we had.