Last modified: 2014-06-18 02:27:01 UTC
/var/log/apache2, and thus /, is full on tools-webproxy:

| root@tools-webproxy:~# du -h /var/log/apache2
| 6.2G    /var/log/apache2
| root@tools-webproxy:~# df -h
| Filesystem                         Size  Used Avail Use% Mounted on
| /dev/vda1                          9.9G  9.4G  2.5M 100% /
|                                                     ^^^^
| udev                               998M  8.0K  998M   1% /dev
| tmpfs                              401M  264K  401M   1% /run
| none                               5.0M     0  5.0M   0% /run/lock
| none                              1002M     0 1002M   0% /run/shm
| /dev/vdb                            20G  173M   19G   1% /mnt
| labnfs.pmtpa.wmnet:/tools/project   10T  6.3T  3.8T  63% /data/project
| labstore1.pmtpa.wmnet:/keys         36T   12T   25T  33% /public/keys
| labnfs.pmtpa.wmnet:/tools/home      10T  6.3T  3.8T  63% /home
| root@tools-webproxy:~#

I've moved 1.6 GByte of old logs to /data/project/.system/logs/temp-refuge-from-webproxy/ (0600'ed to be on the safe side).

On non-VM systems, I would create an extra partition for /var/log/apache2 to isolate this; I'm not sure what the best way to proceed is in Labs and Puppet. We should probably increase the HD space for this instance as well.

Even though I believe we don't use Icinga yet, tools-webproxy is listed there with "DISK OK" as its current status (<http://icinga.wmflabs.org/cgi-bin/icinga/extinfo.cgi?type=2&host=tools-webproxy.pmtpa.wmflabs&service=Disk+Space>). This should be adjusted.

The webservers have 2.1 GByte to 5.4 GByte free in / as of now, but isolating /var/log/apache2 there is probably prudent, too.
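To see which directories account for the space before moving anything, something along these lines can help. This is only a sketch: `largest_dirs` is a made-up helper name (not an existing tool), and it assumes GNU `du` for the `--max-depth` option.

```shell
#!/bin/sh
# largest_dirs DIR [N]: hypothetical helper, not an existing tool.
# Prints DIR and its immediate subdirectories with sizes in KiB,
# largest first, at most N lines (default 10). The first line is
# the total for DIR itself.
# -x keeps du on one filesystem, so NFS mounts such as /data/project
# are not walked when DIR is /.
largest_dirs() {
    dir=$1
    n=${2:-10}
    du -x -k --max-depth=1 "$dir" 2>/dev/null | sort -rn | head -n "$n"
}

# Example (as root): largest_dirs /var/log 5
```

Run against /var/log, this should put apache2 near the top in a situation like the one above.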
(In reply to comment #0)
> [...]
> Even though I believe we don't use Icinga yet, tools-webproxy's listed there
> with "DISK OK" as current status
> (<http://icinga.wmflabs.org/cgi-bin/icinga/extinfo.cgi?type=2&host=tools-webproxy.pmtpa.wmflabs&service=Disk+Space>).
> This should be adjusted.
> [...]

Ouch. I looked at Icinga *after* I had cleared up some space. The "Availability Report" (<http://icinga.wmflabs.org/cgi-bin/icinga/avail.cgi?host=tools-webproxy.pmtpa.wmflabs&service=Disk+Space&show_log_entries>) shows that the service was correctly in warning and then error status.
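For context on why the status went back to "DISK OK" once space was freed: a disk check of this kind generally just compares current usage against a warning and a critical threshold. A rough sketch follows; the function names and the 80/90 thresholds in the example are assumptions for illustration, not what the Icinga instance on wmflabs is actually configured with.

```shell
#!/bin/sh
# usage_percent PATH: used space, in percent, of the filesystem that
# holds PATH, taken from the Capacity column of POSIX "df -P".
usage_percent() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# disk_status PCT WARN CRIT: map a usage percentage to a
# Nagios-style state name.
disk_status() {
    if [ "$1" -ge "$3" ]; then
        echo CRITICAL
    elif [ "$1" -ge "$2" ]; then
        echo WARNING
    else
        echo OK
    fi
}

# Example with hypothetical thresholds:
#   disk_status "$(usage_percent /)" 80 90
```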
(In reply to comment #0)
> [...]
> On non-VM systems, I would create an extra partition for /var/log/apache2 to
> isolate this; I'm not sure what's the best way to proceed in Labs and
> Puppet.
> Probably we should increase the HD space for this instance as well.
> [...]

I should have opened my eyes first: the HD space is already provided as /dev/vdb and mounted at /mnt by default (apparently in the original image's /etc/fstab?):

| scfc@tools-webproxy:~$ df -h /mnt
| Filesystem      Size  Used Avail Use% Mounted on
| /dev/vdb         20G  173M   19G   1% /mnt
| scfc@tools-webproxy:~$ ls -al /mnt
| total 24
| drwxr-xr-x  3 _lldpd nslcd  4096 Jan 21  2013 .
| drwxr-xr-x 25 root   root   4096 Dec  4 17:00 ..
| drwx------  2 root   root  16384 Jan 21  2013 lost+found
| scfc@tools-webproxy:~$

(My assumption was that the space would be given to /.)

The permissions look faulty, though:

| [tim@passepartout ~]$ ssh tools-login.pmtpa.wmflabs ls -dl /mnt
| If you are having access problems, please see:https://labsconsole.wikimedia.org/wiki/Access#Accessing_public_and_private_instances
| drwxr-xr-x 6 root root 4096 Nov 21 21:45 /mnt
| [tim@passepartout ~]$ ssh tools-webproxy.pmtpa.wmflabs ls -dl /mnt
| If you are having access problems, please see:https://labsconsole.wikimedia.org/wiki/Access#Accessing_public_and_private_instances
| If you are having access problems, please see:https://labsconsole.wikimedia.org/wiki/Access#Accessing_public_and_private_instances
| drwxr-xr-x 3 _lldpd nslcd 4096 Jan 21  2013 /mnt
| Killed by signal 1.
| [tim@passepartout ~]$

I've chown'ed it to root.root.

So my plan of attack would be:

1. Move old logs from /var/log/apache2 to /data/project/.system/logs/temp-refuge-from-webproxy.
2. "service apache2 stop && mkdir -p /mnt/var/log && mv -i /var/log/apache2 /mnt/var/log/ && ln -s /mnt/var/log/apache2 /var/log/ && service apache2 start".
3. Move old logs back to /mnt/var/log/apache2 so that logrotate can expire them in the usual way.
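Step 2 of the plan can also be written as a small function with a few safety checks, so that it refuses to run twice or to overwrite an existing target. This is a sketch only: `relocate_dir` is a hypothetical name, and stopping and starting Apache is deliberately left to the caller.

```shell
#!/bin/sh
# relocate_dir SRC DEST_PARENT: hypothetical helper.
# Moves directory SRC into DEST_PARENT and leaves a symlink at SRC
# that points to the new location.
relocate_dir() {
    src=$1
    dest_parent=$2
    name=$(basename "$src")

    # Nothing to do if SRC has already been replaced by a symlink.
    if [ -L "$src" ]; then
        echo "$src is already a symlink" >&2
        return 0
    fi
    if [ ! -d "$src" ]; then
        echo "$src is not a directory" >&2
        return 1
    fi
    # Refuse to clobber or merge into an existing target.
    if [ -e "$dest_parent/$name" ]; then
        echo "$dest_parent/$name already exists" >&2
        return 1
    fi

    mkdir -p "$dest_parent" &&
        mv "$src" "$dest_parent/" &&
        ln -s "$dest_parent/$name" "$src"
}

# Usage matching the plan above (as root):
#   service apache2 stop &&
#       relocate_dir /var/log/apache2 /mnt/var/log &&
#       service apache2 start
```

The existence check stands in for the `mv -i` of the one-liner, since an interactive prompt is of little use in a script.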
(In reply to comment #2)
> [...]
> So my plan of attack would be:
> 1. Move old logs from /var/log/apache2 to
> /data/project/.system/logs/temp-refuge-from-webproxy.
> 2. "service apache2 stop && mkdir -p /mnt/var/log && mv -i /var/log/apache2
> /mnt/var/log/ && ln -s /mnt/var/log/apache2 /var/log/ && service apache2
> start".
> 3. Move old logs back to /mnt/var/log/apache2 so that logrotate can expire
> them
> in the usual way.

Done that. I'll keep this bug open until I have puppetized the symlink.
Any update on this? Doesn't applying the 'biglogs' puppet class 'fix' the issue anyway?
(In reply to Yuvi Panda from comment #4)
> Any update on this? Doesn't applying the 'biglogs' puppet class 'fix' the
> issue anyway?

Uh. Everything I wrote above is from before the move to eqiad and the introduction of the labs_lvm module. So, given that role::labs::lvm::biglogs *is* applied and extra space mounted at /var/log, I think this issue is fixed.

We /could/ include role::labs::lvm::biglogs in role::labs::tools::proxy, but setting up disk space feels orthogonal to the proper proxy functionality, so I think keeping the two issues apart is good.
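Whether /var/log really sits on its own filesystem on a given instance can be checked quickly from the shell. A sketch, with hypothetical function names, that assumes mount points and device names without whitespace:

```shell
#!/bin/sh
# mount_point_of PATH: mount point of the filesystem that holds PATH,
# taken from the last column of POSIX "df -P".
mount_point_of() {
    df -P "$1" | awk 'NR==2 { print $6 }'
}

# has_own_mount DIR: succeeds if DIR itself is a mount point.
# DIR must be given without a trailing slash and must not be a symlink.
has_own_mount() {
    [ "$(mount_point_of "$1")" = "$1" ]
}

# Example:
#   has_own_mount /var/log && echo "/var/log is a separate filesystem"
```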