Last modified: 2014-10-19 23:30:03 UTC
I get a problem on my labs-vagrant instances mlp and math-prview while running sudo puppet agent -tv. The error is at https://gist.github.com/physikerwelt/f3e85373d1452ce7709f
After manually running rm -rf /srv/vagrant, one puppet run completes without problems. However, the next run fails with the same error.
This is caused by a partial/failed migration of files from the legacy /mnt/vagrant install location to the new /srv/vagrant location. The instances are now in a partially migrated state which the puppet provisioning scripts are not equipped to correct. Renaming /mnt/vagrant will stop puppet from attempting further migrations.

If the instances are working as expected, the /mnt/vagrant directory and its contents can be removed entirely. If not, a manual migration from the old location to the new location may be attempted:

    /bin/mkdir "/srv/vagrant" &&
    (cd "/mnt/vagrant"; /bin/tar cf - .) |
    (cd "/srv/vagrant"; /bin/tar xf -) &&
    /bin/rm -rf "/mnt/vagrant"

This is the script that puppet itself is attempting to execute. The presence of /mnt/vagrant following a puppet run indicates that this command pipeline is failing for some reason. Without logging output from the failed run I can only speculate as to the cause of the failure.
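Before retrying the migration by hand, it may help to confirm which state an instance is in. The following is only a rough sketch: the function name vagrant_migration_state and its output labels are made up for illustration and are not part of the puppet module. It assumes nothing beyond the two paths named above.

```shell
#!/bin/sh
# Hypothetical helper, not part of the puppet module: report which of the
# two install locations exist so the right manual step can be chosen.
vagrant_migration_state() {
    old="${1:-/mnt/vagrant}"   # legacy install location
    new="${2:-/srv/vagrant}"   # new install location

    if [ -d "$old" ] && [ -d "$new" ]; then
        # A plain mkdir of the new location fails when it already exists,
        # so the migration pipeline cannot complete in this state.
        echo "both"
    elif [ -d "$old" ]; then
        echo "legacy-only"
    elif [ -d "$new" ]; then
        echo "migrated"
    else
        echo "none"
    fi
}
```

Called with no arguments it inspects /mnt/vagrant and /srv/vagrant. A result of "both" would suggest that the mkdir step of the pipeline is what fails; "legacy-only" is the state the migration script expects.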
(In reply to Bryan Davis from comment #2)
> This is caused by a partial/failed migration of files from the legacy
> /mnt/vagrant install location to the new /srv/vagrant location. The
> instances are now in a partially migrated state which the puppet
> provisioning scripts are not equipped to correct. Renaming /mnt/vagrant will
> stop puppet from attempting further migrations.
>
> If the instances are working as expected, the /mnt/vagrant directory and its
> contents can be removed entirely. If not, a manual migration from the old
> location to the new location may be attempted:
>
> /bin/mkdir "/srv/vagrant" &&
> (cd "/mnt/vagrant"; /bin/tar cf - .) |
> (cd "/srv/vagrant"; /bin/tar xf -) &&
> /bin/rm -rf "/mnt/vagrant"
>
> This is the script that puppet itself is attempting to execute. The presence
> of /mnt/vagrant following a puppet run indicates that this command pipeline
> is failing for some reason. Without logging output from the failed run I can
> only speculate as to the cause of the failure.

Can you try to log into mlp.eqiad.wmflabs? There is nothing else installed on this brand new instance.
Investigation on mlp.eqiad.wmflabs shows:

    $ mount|grep vd-second--local--disk
    /dev/mapper/vd-second--local--disk on /mnt type ext4 (rw)
    /dev/mapper/vd-second--local--disk on /srv type ext4 (rw)

So on this instance, the same disk is mounted at both /mnt and /srv. The instance seems to only have role::labs::vagrant applied via puppet. That role requires role::labs::lvm::srv which would provision and mount /srv. It is not immediately obvious to me what would have added /mnt to /etc/fstab.

The duplicate mount is the source of the error. Initially neither /srv/vagrant nor /mnt/vagrant exist. Puppet then provisions /srv/vagrant as a git clone of mediawiki/vagrant. On a subsequent puppet run, puppet checks for the existence of /mnt/vagrant and finds it because the /srv and /mnt locations both mount the same disk. This causes puppet to run the shell script that was meant to migrate older labs_vagrant installs from the primary disk to the lvm volume mounted on /srv. This script then fails because the /srv/vagrant target directory already exists.

I manually unmounted /mnt and removed the mount description for it from /etc/fstab. Then I forced a puppet run with debug level logging. /mnt was not remounted nor was it re-added to /etc/fstab.

Looking through the instance configuration history, I think the likely cause of this "interesting" configuration was that role::labs::lvm::mnt was manually applied to the instance <https://wikitech.wikimedia.org/w/index.php?title=Nova_Resource:I-000006ac.eqiad.wmflabs&diff=131469&oldid=131468> and then subsequently removed <https://wikitech.wikimedia.org/w/index.php?title=Nova_Resource:I-000006ac.eqiad.wmflabs&diff=next&oldid=131469>. Since removing a role does not remove any configuration that was applied by the role, this left the /etc/fstab line to mount /dev/vd/second-local-disk.
The later application of role::labs::vagrant <https://wikitech.wikimedia.org/w/index.php?title=Nova_Resource:I-000006ac.eqiad.wmflabs&diff=next&oldid=131471> added a second /etc/fstab entry to mount the same /dev/vd/second-local-disk partition on /srv.
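To check other instances for the same duplicate-mount condition, something along these lines might work. This is a sketch only: the function name duplicate_mounts is hypothetical, and it assumes a mount table in the /proc/mounts layout (device first, mount point second). Bind mounts of the same device would be reported too, so a hit is a hint rather than proof of this particular problem.

```shell
#!/bin/sh
# Hypothetical helper: list devices that appear at more than one mount point.
# Reads a mount table in /proc/mounts format (device, mount point, ...).
duplicate_mounts() {
    table="${1:-/proc/mounts}"
    awk '
        $1 ~ /^\// {                      # only entries whose device is a path
            count[$1]++
            points[$1] = points[$1] " " $2
        }
        END {
            for (dev in count)
                if (count[dev] > 1)
                    print dev ":" points[dev]
        }
    ' "$table"
}
```

With no argument it reads /proc/mounts; passing a file name makes it possible to inspect a saved copy of a mount table from another instance. Empty output means no device is mounted more than once.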