Last modified: 2014-10-10 19:11:45 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T73741, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 71741 - WMFLabs: New instances with precise image are broken (puppet run fails, no ssh access possible)
Status: RESOLVED WORKSFORME
Product: Wikimedia Labs
Classification: Unclassified
Component: Infrastructure
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2014-10-07 09:26 UTC by Krinkle
Modified: 2014-10-10 19:11 UTC (History)

Description Krinkle 2014-10-07 09:26:17 UTC
Creating a new instance with the precise image fails and leaves the instance inaccessible over ssh.

I wanted to create an additional integration-slave running Precise to scale out our Jenkins pool, but it failed to provision properly.

https://wikitech.wikimedia.org/w/index.php?title=Special:NovaInstance&action=consoleoutput&project=integration&instanceid=b65a604d-40ef-4b16-b527-bfb862ca3904&region=eqiad


Oct  7 08:54:09 integration-slave1004 puppet-agent[981]: Enabling Puppet.
Oct  7 08:54:09 integration-slave1004 puppet-agent[773]: Could not request certificate: getaddrinfo: Name or service not known
Oct  7 08:54:10 integration-slave1004 puppet-agent[932]: Could not request certificate: getaddrinfo: Name or service not known
Oct  7 08:55:11 integration-slave1004 nslcd[901]: [b0dc51] <group/member="root"> ldap_start_tls_s() failed: Can't contact LDAP server: Connection timed out (uri="ldap://virt0.wikimedia.org:389")
Oct  7 08:55:11 integration-slave1004 nslcd[901]: [b0dc51] <group/member="root"> failed to bind to LDAP server ldap://virt0.wikimedia.org:389: Can't contact LDAP server: Connection refused
Oct  7 08:55:11 integration-slave1004 nslcd[901]: [334873] <group/member="root"> ldap_start_tls_s() failed: Can't contact LDAP server: Connection timed out (uri="ldap://virt0.wikimedia.org:389")
Oct  7 08:55:11 integration-slave1004 nslcd[901]: [334873] <group/member="root"> failed to bind to LDAP server ldap://virt0.wikimedia.org:389: Can't contact LDAP server: Connection timed out
Oct  7 08:55:12 integration-slave1004 nslcd[901]: [b0dc51] <group/member="root"> connected to LDAP server ldap://virt1000.wikimedia.org:389
Oct  7 08:55:12 integration-slave1004 nslcd[901]: [b0dc51] <group/member="root"> ldap_result() failed: No such object
Oct  7 08:55:12 integration-slave1004 nslcd[901]: [b0dc51] <group/member="root"> ldap_result() failed: No such object
Oct  7 08:55:13 integration-slave1004 nslcd[901]: [334873] <group/member="root"> connected to LDAP server ldap://virt1000.wikimedia.org:389
Oct  7 08:55:13 integration-slave1004 nslcd[901]: [334873] <group/member="root"> ldap_result() failed: No such object
Oct  7 08:55:13 integration-slave1004 nslcd[901]: [334873] <group/member="root"> ldap_result() failed: No such object
..
Oct  7 08:55:19 integration-slave1004 puppet-agent[1218]: Creating a new SSL key for i-00000670.eqiad.wmflabs
..
Oct  7 08:55:28 integration-slave1004 nslcd[1059]: [3c9869] <group(all)> ldap_start_tls_s() failed: Can't contact LDAP server: Connection timed out (uri="ldap://virt0.wikimedia.org:389")
Oct  7 08:55:28 integration-slave1004 nslcd[1059]: [3c9869] <group(all)> failed to bind to LDAP server ldap://virt0.wikimedia.org:389: Can't contact LDAP server: Connection timed out
Oct  7 08:55:29 integration-slave1004 nslcd[1059]: [3c9869] <group(all)> connected to LDAP server ldap://virt1000.wikimedia.org:389
Oct  7 08:55:29 integration-slave1004 nslcd[1059]: [3c9869] <group(all)> ldap_result() failed: No such object
Oct  7 08:55:29 integration-slave1004 nslcd[1059]: [7b23c6] <group/member="puppet"> ldap_start_tls_s() failed: Can't contact LDAP server: Connection timed out (uri="ldap://virt0.wikimedia.org:389")
Oct  7 08:55:29 integration-slave1004 nslcd[1059]: [7b23c6] <group/member="puppet"> failed to bind to LDAP server ldap://virt0.wikimedia.org:389: Can't contact LDAP server: Connection timed out
Oct  7 08:55:29 integration-slave1004 nslcd[1059]: [7b23c6] <group/member="puppet"> connected to LDAP server ldap://virt1000.wikimedia.org:389
Oct  7 08:55:29 integration-slave1004 nslcd[1059]: [7b23c6] <group/member="puppet"> ldap_result() failed: No such object
Oct  7 08:55:29 integration-slave1004 nslcd[1059]: [7b23c6] <group/member="puppet"> ldap_result() failed: No such object
Oct  7 08:55:30 integration-slave1004 nslcd[1059]: [334873] <group/member="puppet"> ldap_start_tls_s() failed: Can't contact LDAP server: Connection timed out (uri="ldap://virt0.wikimedia.org:389")
Oct  7 08:55:30 integration-slave1004 nslcd[1059]: [334873] <group/member="puppet"> failed to bind to LDAP server ldap://virt0.wikimedia.org:389: Can't contact LDAP server: Connection timed out
Oct  7 08:55:30 integration-slave1004 nslcd[1059]: [334873] <group/member="puppet"> connected to LDAP server ldap://virt1000.wikimedia.org:389
Oct  7 08:55:30 integration-slave1004 nslcd[1059]: [334873] <group/member="puppet"> ldap_result() failed: No such object
Oct  7 08:55:30 integration-slave1004 nslcd[1059]: [334873] <group/member="puppet"> ldap_result() failed: No such object
Oct  7 08:55:33 integration-slave1004 nslcd[1059]: [b0dc51] <group/member="puppet"> ldap_start_tls_s() failed: Can't contact LDAP server: Connection timed out (uri="ldap://virt0.wikimedia.org:389")
Oct  7 08:55:33 integration-slave1004 nslcd[1059]: [b0dc51] <group/member="puppet"> failed to bind to LDAP server ldap://virt0.wikimedia.org:389: Can't contact LDAP server: Connection timed out
Oct  7 08:55:33 integration-slave1004 nslcd[1059]: [b0dc51] <group/member="puppet"> connected to LDAP server ldap://virt1000.wikimedia.org:389
Oct  7 08:55:33 integration-slave1004 nslcd[1059]: [b0dc51] <group/member="puppet"> ldap_result() failed: No such object
Oct  7 08:55:33 integration-slave1004 nslcd[1059]: [b0dc51] <group/member="puppet"> ldap_result() failed: No such object
Oct  7 08:55:33 integration-slave1004 nslcd[1059]: [e8944a] <group/member="root"> ldap_result() failed: No such object
Oct  7 08:55:33 integration-slave1004 nslcd[1059]: [e8944a] <group/member="root"> ldap_result() failed: No such object
..
Oct  7 09:18:48 integration-slave1004 puppet-agent[932]: Could not request certificate: getaddrinfo: Temporary failure in name resolution
Comment 1 Antoine "hashar" Musso (WMF) 2014-10-07 09:30:27 UTC
I suspect the labs image for Ubuntu Precise hasn't been updated to take into account the recent LDAP changes (phasing out pmtpa / LDAP renaming). It seems to me the image needs to be refreshed; for continuous integration purposes we still need Precise instances.
Comment 2 Andrew Bogott 2014-10-07 13:46:33 UTC
I just tested this a moment ago, and it worked fine for me.  I installed a new precise base image on Friday that uses the new ldap settings as well as including an updated bash and a separate /var/log partition.
Comment 3 Andrew Bogott 2014-10-07 15:53:13 UTC
OK -- that last comment was both right and wrong.

New instances /do/ work.  But there's still a smattering of virt0 and virt1000 references in them, which I am cleaning up.
Comment 4 Krinkle 2014-10-09 10:12:49 UTC
I don't think the ldap thing is the problem. The log I pasted in comment 0 shows that it tried both. It's failing for a different reason.
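
The claim above can be checked against the log in comment 0 by tallying messages per program. The following is only an illustrative sketch: the `classify` and `summarise` helpers and the category labels are hypothetical names made up for this example, and the sample contains just a handful of lines copied from the log. It counts how often nslcd failed to reach an LDAP server, how often it connected, and how often puppet-agent failed on name resolution.

```python
import re
from collections import Counter

# A few lines copied verbatim from the console log in comment 0 (abridged).
SAMPLE_LOG = """\
Oct  7 08:54:09 integration-slave1004 puppet-agent[773]: Could not request certificate: getaddrinfo: Name or service not known
Oct  7 08:55:11 integration-slave1004 nslcd[901]: [b0dc51] <group/member="root"> ldap_start_tls_s() failed: Can't contact LDAP server: Connection timed out (uri="ldap://virt0.wikimedia.org:389")
Oct  7 08:55:11 integration-slave1004 nslcd[901]: [b0dc51] <group/member="root"> failed to bind to LDAP server ldap://virt0.wikimedia.org:389: Can't contact LDAP server: Connection refused
Oct  7 08:55:12 integration-slave1004 nslcd[901]: [b0dc51] <group/member="root"> connected to LDAP server ldap://virt1000.wikimedia.org:389
Oct  7 08:55:12 integration-slave1004 nslcd[901]: [b0dc51] <group/member="root"> ldap_result() failed: No such object
Oct  7 09:18:48 integration-slave1004 puppet-agent[932]: Could not request certificate: getaddrinfo: Temporary failure in name resolution
"""

# Syslog layout: "Mon  D HH:MM:SS host program[pid]: message"
LINE_RE = re.compile(
    r"^\w{3}\s+\d+ [\d:]{8} \S+ (?P<prog>[\w-]+)\[\d+\]: (?P<msg>.*)$"
)


def classify(msg):
    """Map a log message to a coarse, made-up category label."""
    if "getaddrinfo" in msg:
        return "dns-failure"
    if "ldap_start_tls_s() failed" in msg or "failed to bind to LDAP server" in msg:
        return "ldap-unreachable"
    if "connected to LDAP server" in msg:
        return "ldap-connected"
    if "ldap_result() failed" in msg:
        return "ldap-query-failed"
    return "other"


def summarise(log_text):
    """Count (program, category) pairs over all parseable syslog lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counts[(match.group("prog"), classify(match.group("msg")))] += 1
    return counts


if __name__ == "__main__":
    for (prog, category), n in sorted(summarise(SAMPLE_LOG).items()):
        print(f"{prog:14} {category:18} {n}")
```

On this sample, the nslcd lines include a successful connection after the failed attempts, while every puppet-agent line is a `getaddrinfo` failure. That pattern is consistent with the observation that the client tried both LDAP servers and that the certificate request fails on name resolution, though the tally by itself does not identify the root cause.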
Comment 5 Andrew Bogott 2014-10-09 13:23:37 UTC
I just created new images last night which seem generally happier.  Try again?
Comment 6 Krinkle 2014-10-10 19:11:45 UTC
The existing instance was never fixed, but new instances do indeed seem to work fine (assuming it's not a race condition). I'll nuke the instance and re-create it for now.
