Last modified: 2014-11-20 16:59:36 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T65296, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 63296 - puppet labsstatus not reported when using role::puppet::self
Status: RESOLVED WONTFIX
Product: Wikimedia Labs
Classification: Unclassified
Component: Infrastructure
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: ops
Depends on:
Blocks: 67333
Reported: 2014-03-31 10:16 UTC by Antoine "hashar" Musso (WMF)
Modified: 2014-11-20 16:59 UTC (History)

Description Antoine "hashar" Musso (WMF) 2014-03-31 10:16:39 UTC
The list of instances in OpenStack Manager has a column showing the status of the Puppet run on each instance.

On the deployment-prep and integration projects, we are now using our own puppetmasters (deployment-salt.eqiad.wmflabs and integration-puppetmaster.eqiad.wmflabs, respectively). All instances end up with a 'stale' status.

I have no idea how the status is generated and collected. It would be very nice to have a way to report the status from independent puppetmasters.
Comment 1 Andrew Bogott 2014-06-30 22:06:55 UTC
The check is easy; it's done by a reporter that's on every puppetmaster. Currently, though, the result is relayed to wikitech via nova instance metadata, which means that it can only be reported properly if the puppetmaster has OpenStack auth, which self-hosted instances generally don't.
Comment 2 Antoine "hashar" Musso (WMF) 2014-06-30 22:28:35 UTC
Can we have a custom reporter that writes the status of each node locally to a flat file? We could then fetch and render it somewhere.
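
To make the flat-file idea concrete, here is a rough sketch of the bookkeeping such a reporter would need. It is written in Python purely for illustration; a real Puppet report processor would be Ruby code living next to the existing reporter. The function name `record_status`, the JSON layout, and the file name are all assumptions, not anything that exists in the puppet repository.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def record_status(path, hostname, status, when=None):
    """Store the latest Puppet run status for one node in a JSON flat file.

    Hypothetical helper: the file layout (hostname -> status and time)
    is an assumption made for this sketch.
    """
    when = when or datetime.now(timezone.utc)

    # Load the existing table, or start empty if the file does not exist yet.
    try:
        with open(path, encoding="utf-8") as fh:
            table = json.load(fh)
    except FileNotFoundError:
        table = {}

    # Keep only the most recent run per node.
    table[hostname] = {"status": status, "time": when.isoformat()}

    # Write to a temporary file in the same directory and rename it over the
    # target, so whatever fetches the file never sees a half-written copy.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(table, fh, indent=2, sort_keys=True)
    os.replace(tmp, path)
    return table
```

Whatever renders the dashboard would then only need read access to that one file, with no OpenStack credentials on the self-hosted puppetmaster.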
Comment 3 Andrew Bogott 2014-06-30 22:30:48 UTC
Yes, be my guest :)  The current reporter is in modules/puppetmaster/lib/puppet/reports, it's pretty straightforward.

Note, though, that we're in the process of adding some better icinga puppet reporting as well, so it might be best to just get that working for labs instances.
Comment 4 Bryan Davis 2014-07-02 15:01:15 UTC
Getting reporting into wikitech working may be as easy as setting `report_server` in /etc/puppet/puppet.conf to point to the labs master server when ::puppet::self::* is applied to a labs node.

A slightly more complex approach would be to add an endpoint on a secure server that can talk to wikitech directly and use it to proxy reports sent by ::puppet::self::master nodes. The reporter used by the labs masters (modules/puppetmaster/lib/puppet/reports/labsstatus.rb) seems to need only the project name (e.g. deployment-prep) and hostname (e.g. i-0000010b.eqiad.wmflabs), along with the puppet run status and timestamp, to update wikitech. The advantage of this method of reporting is that the ::puppet::self::master would still have access to the full report from each host that it serves, allowing additional custom reporting as desired.
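
As a rough illustration of how small the proxied message could be, here is a sketch of building and validating such a payload. It is in Python for illustration only and is not taken from labsstatus.rb; `RunStatus`, `build_payload`, and the JSON field names are hypothetical, and the proxy endpoint itself is not shown.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone

# Statuses a Puppet run report can carry.
VALID_STATUSES = {"changed", "unchanged", "failed"}


@dataclass(frozen=True)
class RunStatus:
    """The four values the proxy would need (field names are assumptions)."""
    project: str    # e.g. "deployment-prep"
    hostname: str   # e.g. "i-0000010b.eqiad.wmflabs"
    status: str     # one of VALID_STATUSES
    timestamp: str  # ISO 8601, UTC


def build_payload(project, hostname, status, when):
    """Return a JSON string for one Puppet run, or raise ValueError."""
    if not project or not hostname:
        raise ValueError("project and hostname are required")
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown puppet status: {status!r}")
    if when.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    record = RunStatus(
        project=project,
        hostname=hostname,
        status=status,
        timestamp=when.astimezone(timezone.utc).isoformat(),
    )
    return json.dumps(asdict(record), sort_keys=True)
```

The self-hosted master would send only this summary to the proxy and keep the full report locally for any extra reporting it wants to do.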
Comment 5 Andrew Bogott 2014-07-02 15:51:20 UTC
> Getting reporting into wikitech working may be as easy as setting
> `report_server` in /etc/puppet/puppet.conf to point to the labs master server 
> when ::puppet::self::* is applied to a labs node.

It would shock me if that worked because, y'know, auth.

But, as I said, best not to get too deep into this as this will (I hope) be handled using proper monitoring tools shortly.
Comment 6 Bryan Davis 2014-07-02 16:06:49 UTC
(In reply to Andrew Bogott from comment #5)
> > Getting reporting into wikitech working may be as easy as setting
> > `report_server` in /etc/puppet/puppet.conf to point to the labs master server 
> > when ::puppet::self::* is applied to a labs node.
> 
> It would shock me if that worked because, y'know, auth.

Confirmed:

Error: Could not send report: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [self signed certificate in certificate chain for /CN=Puppet CA: virt1000.wikimedia.org]

 
> But, as I said, best not to get too deep into this as this will (I hope) be
> handled using proper monitoring tools shortly.

Will the icinga monitoring that you've been working on be usable in labs? Or are you referring to the work that Yuvi is doing towards collecting information on the puppet runs with diamond? I'd be happy with either, but it would be nice to know which gerrit patches to follow. :)
Comment 7 Antoine "hashar" Musso (WMF) 2014-11-20 16:59:36 UTC
The reason I filed this bug was to use OpenStackManager as a dashboard of puppet runs. Over the last few weeks, Yuvi has had puppet runs reported to graphite and monitored using Shinken: http://shinken.wmflabs.org/problems, which fulfills the use case I have.
