Last modified: 2014-04-16 20:38:27 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T64082, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 62082 - udp2log and/or demux.py filename corruption
Status: NEW
Product: Analytics
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on: 64016
Reported: 2014-02-28 22:24 UTC by Bryan Davis
Modified: 2014-04-16 20:38 UTC
CC List: 7 users

Description Bryan Davis 2014-02-28 22:24:21 UTC
Seen on fluorine starting sometime 2014-02-28:

[07:28]  < springle>	 lots of odd things in fluorine:/a/mw-log
[07:28]  <AaronSchulz>	 yeah I saw that
[07:28]  <AaronSchulz>	 happens every blue moon

Numerous log files with names like:

    0180.log
    0.log
    #100206.log
    15):.log
    (278):.log
    bileContext.php(278):.log
    Context.php(278):.log
    eContext.php(278):.log
    ext.php(278):.log

Possibly more interesting are the files whose names are either a fragment of an expected log name or an expected log name with some leading portion of 'fatal' prefixed to it:

    al.log
    atal.log
    faapi.log
    faCirrusSearch-all.log
    fapi.log
    farunJobs.log
    fataapi.log
    fatalapache2.log
    fatalapi.log
    fatalCirrusSearch-all.log
    fatalrunJobs.log
    fatalxff.log
    fatamemcached-serious.log
    fatapi.log
    fatarunJobs.log
    fatatestwiki.log
    fataxff.log
    fatCirrusSearch-all.log
    fatpoolcounter.log
    fatrunJobs.log
    fatxff.log
    faxff.log
    fCirrusSearch-all.log
    fmemcached-serious.log
    frunJobs.log
    fxff.log

Because of the 'f', 'fa', 'fat', ... prefixes and the various logs named after parts of a stack trace from MobileContext, this seems likely to be related to Bug 62078, which is causing stack traces of 39,289 frames to be recorded in the fatal log.
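
One way such names could arise is sketched below. This is an illustration only, not the actual demux.py code: it assumes the demultiplexer takes the first space-delimited token of each line as the log name, and that an oversized 'fatal' message is cut off mid-stream without a trailing newline. The function name `demux_names`, the sample messages, and the method name `someMethod` are hypothetical.

```python
from typing import List


def demux_names(stream: str) -> List[str]:
    """Return the file name a first-token demultiplexer would pick per line.

    Assumption: the channel name is everything before the first space.
    """
    names = []
    for line in stream.split("\n"):
        if not line:
            continue
        channel = line.split(" ", 1)[0]
        names.append(channel + ".log")
    return names


# Hypothetical input stream, built from adjacent string literals.
stream = (
    "api GET /w/api.php\n"                    # well-formed message
    "fat"                                     # truncated 'fatal ...' message, no newline
    "xff 10.0.0.1 example\n"                  # next message is glued onto the fragment
    "Context.php(278): someMethod()\n"        # tail of a split stack-trace line
)

print(demux_names(stream))
# ['api.log', 'fatxff.log', 'Context.php(278):.log']
```

Under these assumptions, a fragment of 'fatal' fuses with the next channel name (as in fatxff.log), and the tail of a split stack-trace line is treated as a channel name of its own (as in Context.php(278):.log). Whether this matches the real failure mode has not been confirmed here.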
Comment 1 Bingle 2014-02-28 22:35:35 UTC
Prioritization and scheduling of this bug is tracked on Mingle card https://wikimedia.mingle.thoughtworks.com/projects/analytics/cards/cards/1464
Comment 2 Toby Negrin 2014-03-02 02:30:42 UTC
Why is this analytics? Do we own this machine?

thanks,

-Toby
Comment 3 Bryan Davis 2014-03-02 04:32:58 UTC
(In reply to Toby Negrin from comment #2)
> Why is this analytics? Do we own this machine?

Greg and I guessed that analytics was the right component to file the bug under because the udp2log application is in the analytics/udplog.git gerrit repository.
Comment 4 Toby Negrin 2014-03-03 15:04:29 UTC
Yes -- makes sense. We'll take a look.

-Toby
Comment 5 Toby Negrin 2014-03-03 15:15:08 UTC
We'll prioritize for next sprint (Thursday 3/6)

-Toby
Comment 6 Toby Negrin 2014-03-06 17:01:49 UTC
Aaron/Bryan -- is this a serious issue? We expect to phase this technology out in the near future and this bug will require some investigation.

thanks,

-Toby
Comment 7 Bryan Davis 2014-03-06 17:47:50 UTC
(In reply to Toby Negrin from comment #6)
> Aaron/Bryan -- is this a serious issue? We expect to phase this technology
> out in the near future and this bug will require some investigation.

It's probably not an urgent problem. It is pretty annoying/disruptive when it occurs, as it makes the logs on fluorine very hard to follow and some monitoring tools untrustworthy. It doesn't seem to happen frequently at this point, however.

Out of curiosity, what is udp2log going to be replaced with? Kafka everywhere?
Comment 8 Greg Grossmeier 2014-03-14 15:50:40 UTC
(In reply to Bryan Davis from comment #7)
> Out of curiosity, what is udp2log going to be replaced with? Kafka
> everywhere?

Ping on that :)


Also, it happened again last night (3/13) due to a huge MobileFrontend backtrace. MaxSem fixed the part in MobileFrontend, but udp2log.py is still vulnerable to these issues. See the short thread on engineering@ "Strange log files on fluorine".
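
A possible guard against stray files is sketched below. It is a minimal sketch, not the current demux.py behaviour: it assumes the demultiplexer picks the first space-delimited token as the channel name, and it checks that token against an allowlist before choosing a file. The names `KNOWN_CHANNELS`, `target_file`, and the `unknown` fallback are hypothetical, and the channel list is only inferred from the file names reported in the description.

```python
# Channel names inferred from the expected log names in this report (assumption).
KNOWN_CHANNELS = frozenset({
    "api", "apache2", "CirrusSearch-all", "fatal", "memcached-serious",
    "poolcounter", "runJobs", "testwiki", "xff",
})


def target_file(line: str, known=KNOWN_CHANNELS, fallback: str = "unknown") -> str:
    """Pick the log file for a line, routing unrecognised channels to a fallback."""
    channel, sep, _ = line.partition(" ")
    # Require a separator so a bare fragment is never treated as a channel.
    if sep and channel in known:
        return channel + ".log"
    return fallback + ".log"


print(target_file("xff 10.0.0.1 example"))           # xff.log
print(target_file("fatxff 10.0.0.1 example"))        # unknown.log
print(target_file("Context.php(278): someMethod()")) # unknown.log
```

The trade-off is that an allowlist has to be kept in sync whenever a new log channel is added, so corrupted lines end up in one fallback file instead of creating many oddly named ones, but legitimate new channels would too until the list is updated.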
Comment 9 Toby Negrin 2014-03-17 21:38:43 UTC
udp2log will be replaced by Kafka at some point -- hopefully we are talking about a few months. All of the logs will be copied back to VA for analysis.

We do have a Kafka-to-udp2log converter for "legacy" apps.
