Last modified: 2013-10-14 10:15:46 UTC
My automation tests running on betalabs are failing with the error "page is not connected to the server". URL: en.m.wikipedia.wmflabs.org
The hostname should be en.m.wikipedia.beta.wmflabs.org by the way - however it's also inaccessible.
DNS entry *.m.wikipedia.beta.wmflabs.org points to 208.80.153.143, which is bound to deployment-cache-mobile01 (from https://wikitech.wikimedia.org/wiki/Special:NovaAddress).

The varnish frontend (which listens on port 80) is not responding:

hashar@deployment-cache-mobile01:~$ sudo /etc/init.d/varnish-frontend status
 * varnishd-frontend is not running
hashar@deployment-cache-mobile01:~$ sudo /etc/init.d/varnish status
 * varnishd is running
hashar@deployment-cache-mobile01:~$

It does not start anymore:

hashar@deployment-cache-mobile01:~$ sudo /etc/init.d/varnish-frontend start
 * Starting HTTP accelerator [fail]
Message from VCC-compiler:
Could not load module netmapper
 /usr/lib/x86_64-linux-gnu/varnish/vmods/libvmod_netmapper.so
 /usr/lib/x86_64-linux-gnu/varnish/vmods/libvmod_netmapper.so: cannot open shared object file: No such file or directory
('zero.inc.vcl' Line 3 Pos 8)
import netmapper;
-------#########-
Running VCC-compiler failed, exit 1
VCL compilation failed
hashar@deployment-cache-mobile01:~$
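The VCC-compiler error above comes down to a missing shared object for the netmapper vmod. A quick way to confirm that on an instance might look like the sketch below. The helper name vmod_present is hypothetical (not part of varnish or of this report); the directory is the one quoted in the error message.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of varnish):
# succeeds if DIR contains libvmod_NAME.so.
# Usage: vmod_present DIR NAME
vmod_present() {
    [ -f "$1/libvmod_$2.so" ]
}

# Directory taken from the VCC-compiler error message above.
VMOD_DIR=/usr/lib/x86_64-linux-gnu/varnish/vmods

if vmod_present "$VMOD_DIR" netmapper; then
    echo "netmapper vmod found"
else
    echo "netmapper vmod missing"
fi
```

If the file is reported missing, the VCL import in zero.inc.vcl cannot succeed until the module is installed on the instance.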
Varnish has to be manually upgraded on the beta cluster instances. I have run apt-get dist-upgrade on all four instances (deployment-cache-text1, deployment-cache-upload04, deployment-cache-bits03, deployment-cache-mobile01) and rebooted them. http://en.m.wikipedia.beta.wmflabs.org/ is back up :-]
Some varnish packages were not up-to-date and I had to purge some configuration files to have puppet regenerate them properly. All four instances now have puppet running without any error and varnish / varnishhtcpd are running on all of them.
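For reference, the manual upgrade across the four cache instances could be sketched roughly as follows. This is a dry run that only prints commands. The instance names come from the comment above; the use of ssh and the helper name upgrade_cmd are assumptions for illustration, not a record of what was actually typed.

```shell
#!/bin/sh
# Instance names as listed in the comment above.
INSTANCES="deployment-cache-text1 deployment-cache-upload04 deployment-cache-bits03 deployment-cache-mobile01"

# Hypothetical helper: prints the upgrade command for one host.
# Reaching the host over ssh is an assumption.
upgrade_cmd() {
    printf 'ssh %s sudo apt-get dist-upgrade\n' "$1"
}

# Dry run: print one command per instance, execute nothing.
for host in $INSTANCES; do
    upgrade_cmd "$host"
done
```

Printing the commands first makes it easy to review the host list before running anything on the instances.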