Last modified: 2014-10-12 18:49:36 UTC
On the beta cluster we make use of role::labs::lvm::mnt to allocate the instance disk space to /mnt. I created a new Trusty instance, deployment-cxserver02.eqiad.wmflabs, which has a puppet failure:

Notice: /Stage[main]/Labs_lvm/Package[lvm2]/ensure: ensure changed 'purged' to 'present'
Notice: /Stage[main]/Labs_lvm/File[/usr/local/sbin/make-instance-vg]/ensure: defined content as '{md5}d427b6b327e1e4f7f5c8c436eb5667a6'
Notice: /Stage[main]/Labs_lvm/Exec[create-volume-group]/returns: Error: Can't create any more partitions.
/usr/local/sbin/make-instance-vg: failed to create new partition
Error: /usr/local/sbin/make-instance-vg '/dev/vda' returned 1 instead of one of [0]
Error: /Stage[main]/Labs_lvm/Exec[create-volume-group]/returns: change from notrun to 0 failed: /usr/local/sbin/make-instance-vg '/dev/vda' returned 1 instead of one of [0]
Notice: /Stage[main]/Labs_lvm/File[/usr/local/sbin/make-instance-vol]/ensure: defined content as '{md5}24f46d3c5b16cbf0cca687b84a81b3cd'
Notice: /Stage[main]/Role::Labs::Lvm::Mnt/Labs_lvm::Volume[second-local-disk]/Exec[create-vd-second-local-disk]: Dependency Exec[create-volume-group] has failures: true
Warning: /Stage[main]/Role::Labs::Lvm::Mnt/Labs_lvm::Volume[second-local-disk]/Exec[create-vd-second-local-disk]: Skipping because of failed dependencies

Using parted on /dev/vda there is still 42GB of free space:

# /sbin/parted /dev/vda print free
Model: Virtio Block Device (virtblk)
Disk /dev/vda: 42.9GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type     File system     Flags
 1      32.3kB  8191MB  8191MB  primary  ext4
        8191MB  8191MB  476kB            Free Space
 2      8191MB  10.2GB  2048MB  primary  ext4
 3      10.2GB  12.3GB  2048MB  primary  ext4
 4      12.3GB  14.3GB  2048MB  primary  linux-swap(v1)
        14.3GB  42.9GB  28.6GB           Free Space

Could it be an issue in make-instance-vg?
It invokes:

/sbin/parted /dev/vda print free | fgrep 'Free Space' | tail -n 1 | sed -e 's/ */ /g' | cut -d ' ' -f 2,3

Which returns:

14.3GB 42.9GB

And thus it invokes:

/sbin/parted -s /dev/vda mkpart primary 14.3GB 42.9GB

This bug prevents us from finalizing the creation of deployment-cxserver02.eqiad.wmflabs, which is the content translation server on the beta cluster (bug 71783).
*** Bug 71874 has been marked as a duplicate of this bug. ***
The same thing has been happening since last week on integration-slave1009. I never got to fully set up that new instance because its puppet run failed from the very beginning.
The new partitioning recipe uses all four possible primary partitions, leaving none for LVM to take. I'll fix the image generation today, but instances created with that recipe cannot use LVM without manual tweaking.
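For illustration, the msdos partition-table limit can be checked up front; a hypothetical guard (the function name and sample data are mine, not part of make-instance-vg) that would turn parted's "Can't create any more partitions" into a clearer failure:

```shell
# On an msdos label only four primary partitions are possible; if the image
# recipe already created all four, mkpart for LVM cannot succeed.
count_primaries() {
    # $1: body of 'parted print' output; counts rows whose Type is 'primary'
    printf '%s\n' "$1" | grep -c ' primary '
}

parted_output=' 1      32.3kB  8191MB  8191MB  primary  ext4
 2      8191MB  10.2GB  2048MB  primary  ext4
 3      10.2GB  12.3GB  2048MB  primary  ext4
 4      12.3GB  14.3GB  2048MB  primary  linux-swap(v1)'

if [ "$(count_primaries "$parted_output")" -ge 4 ]; then
    echo 'no primary partition slots left on msdos label; cannot create LVM PV' >&2
fi
```

On this instance all four slots are taken, so the 28.6GB of free space at the end of the disk is unreachable without deleting a partition or switching the recipe (as the fix below does) to hand that space to LVM at image-build time.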
Dammit, that new image is giving me no end of trouble. Marc, are you going to set up the new image to use LVM for all its partitions?
For the record, I initially thought this was only affecting Trusty instances, but it's affecting Precise images as well. integration-slave1004 (newly created, Precise) is also unable to complete its puppet run, with the same error.
Change 166139 had a related patch set uploaded by coren: Labs: Make images create LVM at firstboot.sh https://gerrit.wikimedia.org/r/166139
Change 166139 merged by Andrew Bogott: Labs: Make images create LVM at firstboot.sh https://gerrit.wikimedia.org/r/166139
This should now be resolved for new instances. Note that the new partition scheme will make the /mnt partition slightly smaller, as /var/log now has its own 2GB partition.