Last modified: 2014-10-19 23:30:03 UTC

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from bug reports and their history, links might be broken. See T74234, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 72234 - Exec['migrate legacy files'] failing repeatedly when both /mnt/vagrant and /srv/vagrant are present
Status: RESOLVED FIXED
Product: MediaWiki-Vagrant
Classification: Unclassified
Component: labs-vagrant
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Bryan Davis
Reported: 2014-10-19 13:53 UTC by physikerwelt
Modified: 2014-10-19 23:30 UTC

Description physikerwelt 2014-10-19 13:53:05 UTC
Running sudo puppet agent -tv fails on my labs-vagrant instances mlp and math-prview.
The error output is at:
https://gist.github.com/physikerwelt/f3e85373d1452ce7709f
Comment 1 physikerwelt 2014-10-19 13:59:56 UTC
After manually running rm -rf /srv/vagrant, one puppet run completes without problems. However, the next run fails with the same error.
Comment 2 Bryan Davis 2014-10-19 20:30:49 UTC
This is caused by a partial/failed migration of files from the legacy /mnt/vagrant install location to the new /srv/vagrant location. The instances are now in a partially migrated state which the puppet provisioning scripts are not equipped to correct. Renaming /mnt/vagrant will stop puppet from attempting further migrations.

If the instances are working as expected, the /mnt/vagrant directory and its contents can be removed entirely. If not, a manual migration from the old location to the new location may be attempted:

  /bin/mkdir "/srv/vagrant" &&
  (cd "/mnt/vagrant"; /bin/tar cf - .) |
  (cd "/srv/vagrant"; /bin/tar xf -) &&
  /bin/rm -rf "/mnt/vagrant"

This is the script that puppet itself is attempting to execute. The presence of /mnt/vagrant following a puppet run indicates that this command pipeline is failing for some reason. Without logging output from the failed run I can only speculate as to the cause of the failure.
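
For a manual migration, a more defensive variant of the pipeline above might look like the following. This is only a sketch: migrate_dir is a hypothetical helper name, not something shipped with the puppet module, and it assumes tar and mkdir are on the PATH. It refuses to run when the target already exists, which is the condition under which the plain mkdir step fails.

```shell
#!/bin/sh
# Hypothetical helper (not part of the puppet module): copy SRC into a new
# directory DST with tar, then remove SRC, but only when that is safe.
migrate_dir() {
    src=$1
    dst=$2

    # Nothing to migrate if the legacy directory is absent.
    if [ ! -d "$src" ]; then
        echo "skip: $src does not exist"
        return 0
    fi

    # Refuse to touch an existing target instead of failing halfway through.
    if [ -e "$dst" ]; then
        echo "abort: $dst already exists" >&2
        return 1
    fi

    # Same steps as the original pipeline; "cd ... &&" keeps tar from
    # running in the wrong directory if the cd fails.
    mkdir "$dst" &&
    (cd "$src" && tar cf - .) | (cd "$dst" && tar xf -) &&
    rm -rf "$src"
}
```

Calling migrate_dir /mnt/vagrant /srv/vagrant reports why it stopped rather than leaving only a failed Exec, which may make the cause easier to see in the puppet log.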
Comment 3 physikerwelt 2014-10-19 22:25:11 UTC
(In reply to Bryan Davis from comment #2)

Can you try to log into mlp.eqiad.wmflabs? There is nothing else installed on this brand new instance.
Comment 4 Bryan Davis 2014-10-19 23:29:41 UTC
Investigation on mlp.eqiad.wmflabs shows:

$ mount|grep vd-second--local--disk
/dev/mapper/vd-second--local--disk on /mnt type ext4 (rw)
/dev/mapper/vd-second--local--disk on /srv type ext4 (rw)

So on this instance, the same disk is mounted at both /mnt and /srv. The instance seems to only have role::labs::vagrant applied via puppet. That role requires role::labs::lvm::srv which would provision and mount /srv. It is not immediately obvious to me what would have added /mnt to /etc/fstab.
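
A check like the mount|grep above can be generalised to list every device that is mounted at more than one place. The sketch below is an illustration, not something used during the investigation: dup_mounts is a hypothetical name, it expects the "DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)" line format shown above, and it assumes mount points contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: read mount-style lines on stdin and print each
# device path that appears at more than one mount point.
dup_mounts() {
    awk '
        # Only consider real device paths; this skips proc, tmpfs, etc.
        $1 ~ /^\// && $2 == "on" {
            count[$1]++
            points[$1] = points[$1] " " $3
        }
        END {
            for (dev in count)
                if (count[dev] > 1)
                    print dev ":" points[dev]
        }
    '
}
```

It would be used as mount | dup_mounts. For the two lines shown above it prints the device followed by both /mnt and /srv.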

The duplicate mount is the source of the error. Initially neither /srv/vagrant nor /mnt/vagrant exists. Puppet then provisions /srv/vagrant as a git clone of mediawiki/vagrant. On a subsequent puppet run, puppet checks for the existence of /mnt/vagrant and finds it because the /srv and /mnt locations both mount the same disk. This causes puppet to run the shell script that was meant to migrate older labs_vagrant installs from the primary disk to the lvm volume mounted on /srv. This script then fails because the /srv/vagrant target directory already exists.
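
One way to see this condition from the shell is to test whether the two paths are really the same directory. The following is a minimal sketch: same_dir is a hypothetical name, and it relies on the -ef operator of test, which compares device and inode numbers and is available in bash and dash among others. It assumes that a disk mounted at two places exposes the same device and inode numbers through both mount points, which is the usual behaviour on Linux.

```shell
#!/bin/sh
# Hypothetical helper: succeed when both arguments are directories and
# refer to the same underlying directory (same device and inode).
same_dir() {
    [ -d "$1" ] && [ -d "$2" ] && [ "$1" -ef "$2" ]
}

# Example use before attempting a migration:
#   if same_dir /mnt/vagrant /srv/vagrant; then
#       echo "both paths are the same directory; nothing to migrate"
#   fi
```

A guard of this kind in front of the migration script would skip the migration on an instance in this state instead of failing on every run.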

I manually unmounted /mnt and removed the mount description for it from /etc/fstab. Then I forced a puppet run with debug level logging. /mnt was not remounted nor was it re-added to /etc/fstab.

Looking through the instance configuration history, I think the likely cause of this "interesting" configuration was that role::labs::lvm::mnt was manually applied to the instance <https://wikitech.wikimedia.org/w/index.php?title=Nova_Resource:I-000006ac.eqiad.wmflabs&diff=131469&oldid=131468> and then subsequently removed <https://wikitech.wikimedia.org/w/index.php?title=Nova_Resource:I-000006ac.eqiad.wmflabs&diff=next&oldid=131469>. Since removing a role does not remove any configuration that was applied by the role, this left the /etc/fstab line to mount /dev/vd/second-local-disk. The later application of role::labs::vagrant <https://wikitech.wikimedia.org/w/index.php?title=Nova_Resource:I-000006ac.eqiad.wmflabs&diff=next&oldid=131471> added a second /etc/fstab entry to mount the same /dev/vd/second-local-disk partition on /srv.
