Last modified: 2014-10-12 18:49:36 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T73873, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 71873 - role::labs::lvm::mnt ends up with make-instance-vg: failed to create new partition
Status: RESOLVED FIXED
Product: Wikimedia Labs
Classification: Unclassified
Component: Infrastructure
Version: unspecified
Hardware: All All
Importance: Normal critical
Target Milestone: ---
Assigned To: Marc A. Pelletier
Duplicates: 71874
Depends on:
Blocks: 71783
Reported: 2014-10-09 10:03 UTC by Antoine "hashar" Musso (WMF)
Modified: 2014-10-12 18:49 UTC
CC List: 7 users

Description Antoine "hashar" Musso (WMF) 2014-10-09 10:03:13 UTC
On the beta cluster we use role::labs::lvm::mnt to allocate the instance disk space to /mnt.

I created a new Trusty instance deployment-cxserver02.eqiad.wmflabs which has a puppet failure:

Notice: /Stage[main]/Labs_lvm/Package[lvm2]/ensure:
  ensure changed 'purged' to 'present'
Notice: /Stage[main]/Labs_lvm/File[/usr/local/sbin/make-instance-vg]/ensure:
  defined content as '{md5}d427b6b327e1e4f7f5c8c436eb5667a6'

Notice: /Stage[main]/Labs_lvm/Exec[create-volume-group]/returns:
  Error: Can't create any more partitions.
 /usr/local/sbin/make-instance-vg: failed to create new partition
Error: /usr/local/sbin/make-instance-vg '/dev/vda' returned 1 instead of one of [0]

Error: /Stage[main]/Labs_lvm/Exec[create-volume-group]/returns: change from notrun to 0 failed: /usr/local/sbin/make-instance-vg '/dev/vda' returned 1 instead of one of [0]

Notice: /Stage[main]/Labs_lvm/File[/usr/local/sbin/make-instance-vol]/ensure: defined content as '{md5}24f46d3c5b16cbf0cca687b84a81b3cd'
Notice: /Stage[main]/Role::Labs::Lvm::Mnt/Labs_lvm::Volume[second-local-disk]/Exec[create-vd-second-local-disk]: Dependency Exec[create-volume-group] has failures: true
Warning: /Stage[main]/Role::Labs::Lvm::Mnt/Labs_lvm::Volume[second-local-disk]/Exec[create-vd-second-local-disk]: Skipping because of failed dependencies


According to parted, /dev/vda (42.9GB) still has 28.6GB of free space:

  # /sbin/parted /dev/vda print free
  Model: Virtio Block Device (virtblk)
  Disk /dev/vda: 42.9GB
  Sector size (logical/physical): 512B/512B
  Partition Table: msdos
  
  Number  Start   End     Size    Type     File system     Flags
   1      32.3kB  8191MB  8191MB  primary  ext4
          8191MB  8191MB  476kB            Free Space
   2      8191MB  10.2GB  2048MB  primary  ext4
   3      10.2GB  12.3GB  2048MB  primary  ext4
   4      12.3GB  14.3GB  2048MB  primary  linux-swap(v1)
          14.3GB  42.9GB  28.6GB           Free Space

Could it be an issue in make-instance-vg?  It invokes:

 /sbin/parted /dev/vda print free|fgrep 'Free Space'|tail -n 1|sed -e 's/  */ /g'|cut -d ' ' -f 2,3

Which returns:

 14.3GB 42.9GB

And thus invokes:

/sbin/parted -s /dev/vda mkpart primary  14.3GB 42.9GB
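
The extraction step can be checked without touching a disk. The following is a minimal sketch, not the actual make-instance-vg script: it feeds the parted listing from this report through the same filter chain. The function names sample_parted_output and last_free_range are hypothetical, and grep -F stands in for the equivalent fgrep.

```shell
#!/bin/sh
# Fixture: the partition rows of the `parted /dev/vda print free` output
# quoted in this report (not read from a real disk).
sample_parted_output() {
cat <<'EOF'
Number  Start   End     Size    Type     File system     Flags
 1      32.3kB  8191MB  8191MB  primary  ext4
        8191MB  8191MB  476kB            Free Space
 2      8191MB  10.2GB  2048MB  primary  ext4
 3      10.2GB  12.3GB  2048MB  primary  ext4
 4      12.3GB  14.3GB  2048MB  primary  linux-swap(v1)
        14.3GB  42.9GB  28.6GB           Free Space
EOF
}

# Same filter chain as quoted above: keep the last "Free Space" row,
# squeeze runs of spaces into one, then take the start and end columns.
# (Hypothetical helper name; grep -F is equivalent to fgrep.)
last_free_range() {
  grep -F 'Free Space' | tail -n 1 | sed -e 's/  */ /g' | cut -d ' ' -f 2,3
}

sample_parted_output | last_free_range
```

On this fixture the pipeline prints 14.3GB 42.9GB, so the range handed to mkpart matches the last free region in the listing.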



The bug prevents us from finalizing the creation of deployment-cxserver02.eqiad.wmflabs, which is the content translation server on the beta cluster (bug 71783).
Comment 1 Krinkle 2014-10-09 10:19:14 UTC
*** Bug 71874 has been marked as a duplicate of this bug. ***
Comment 2 Krinkle 2014-10-09 10:20:03 UTC
The same thing has been happening since last week on integration-slave1009. I never got to fully set up that new instance because its puppet run failed from the very beginning.
Comment 3 Marc A. Pelletier 2014-10-10 12:37:15 UTC
The new partitioning recipe uses all four possible primary partitions, leaving none for LVM to take.  I'll fix the image generation today, but instances created with that recipe cannot use LVM without manual tweaking.
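
An msdos (MBR) partition table has only four slots for primary or extended partitions, which is why mkpart primary fails even with 28.6GB unallocated. As a rough sketch of a pre-flight check (not part of make-instance-vg; the helper names count_used_slots, can_add_primary and layout are hypothetical), the used slots can be counted from the parted listing before attempting to create a partition:

```shell
#!/bin/sh
# An msdos (MBR) partition table holds at most four primary/extended entries.
MAX_SLOTS=4

# Hypothetical helper: count rows whose first field is a partition number
# and whose Type column is "primary" or "extended". Reads parted output on stdin.
count_used_slots() {
  awk '$1 ~ /^[0-9]+$/ && ($5 == "primary" || $5 == "extended") { n++ }
       END { print n + 0 }'
}

# Succeeds only if at least one primary slot is still free.
can_add_primary() {
  [ "$(count_used_slots)" -lt "$MAX_SLOTS" ]
}

# Fixture: the layout reported in this bug (not read from a real disk).
layout() {
cat <<'EOF'
Number  Start   End     Size    Type     File system     Flags
 1      32.3kB  8191MB  8191MB  primary  ext4
        8191MB  8191MB  476kB            Free Space
 2      8191MB  10.2GB  2048MB  primary  ext4
 3      10.2GB  12.3GB  2048MB  primary  ext4
 4      12.3GB  14.3GB  2048MB  primary  linux-swap(v1)
        14.3GB  42.9GB  28.6GB           Free Space
EOF
}

if layout | can_add_primary; then
  echo "free primary slot available"
else
  echo "no primary slot left: mkpart primary will fail"
fi
```

With the layout from this report all four slots are taken, so the check reports that no primary slot is left regardless of how much free space remains.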
Comment 4 Andrew Bogott 2014-10-10 14:37:05 UTC
Dammit, that new image is giving me no end of trouble.  Marc, you're going to set up the new image to use lvm for all its partitions?
Comment 5 Krinkle 2014-10-10 20:50:56 UTC
For the record, I initially thought this was only affecting Trusty instances, but it's affecting Precise instances as well. integration-slave1004 (newly created, Precise) is also unable to complete its puppet run, failing with the same error.
Comment 6 Gerrit Notification Bot 2014-10-10 21:15:09 UTC
Change 166139 had a related patch set uploaded by coren:
Labs: Make images create LVM at firstboot.sh

https://gerrit.wikimedia.org/r/166139
Comment 7 Gerrit Notification Bot 2014-10-10 21:35:05 UTC
Change 166139 merged by Andrew Bogott:
Labs: Make images create LVM at firstboot.sh

https://gerrit.wikimedia.org/r/166139
Comment 8 Andrew Bogott 2014-10-12 18:49:36 UTC
This should now be resolved for new instances. Note that the new partition scheme will make the /mnt partition slightly smaller, as /var/log now has its own 2GB partition.
