Last modified: 2013-12-20 21:40:30 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T60764, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 58764 - Some lines in zero, and mobile logs start within other log line (instead of beginning of line)
Some lines in zero, and mobile logs start within other log line (instead of b...
Status: RESOLVED FIXED
Product: Analytics
Classification: Unclassified
General/Unknown (Other open bugs)
unspecified
All All
: Unprioritized normal
: ---
Assigned To: christian
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-12-20 21:40 UTC by christian
Modified: 2013-12-20 21:40 UTC (History)
2 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description christian 2013-12-20 21:40:00 UTC
Since 2013-11-13 we have been seeing lines like

  cp1046.eqiad.wmnet	1008323724	201cp3011.esams.wikimedia.org	6978832710	2013-11-15T19:29:25 [...]

in the mobile and zero logs. There, part of a log line (part past 201
in the third column) got overwritten by a new line
(cp3011.esams[...]). All 16 fields of this new line gets written,
followed by a line feed. The next line is a normal log line again.

For each file we had between 5 to 15 such lines.

We did not see such lines for the sampled-1000 logs.
Comment 1 christian 2013-12-20 21:40:16 UTC
Checking for mangled lines which field got overwritten showed that the
probability of a field to get overwritten directly corresponded to the
field length.

~41% of the collisions happened for minutes 0, and 59.
~54% of the collisions happened for minutes 29, and 30.
So ~95% of the collisions happened at the times puppet is run.
Comment 2 christian 2013-12-20 21:40:30 UTC
It seems puppet injected a new udp2log filter configuration file and
restarted udp2log on oxygen (mobile and zero logs get written there)
upon some puppet runs. The culprit seems to be serializing the hash of
mobile hosts into the filter configuration without sorting them. So
upon each puppet run, the serialized string of mobile hosts stands a
chance of producing the hosts in a different order.

Since, commit 55ee2641ad2d74adf8b9d8353f63b87f6e007348 of the puppet
repo, the list gets sorted before serialization. Hence, the serialized
string should not change between puppet runs, if the mobile hosts do
not change.

After merging the above change, we do not see mangled log lines any
longer. Last mangled log lines are for 2013-12-16 17:30.

All other udp2log generated log files generated on oxygen have been
affected as well (e.g.: edit logs). But the problem is now fixed for
them as well.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links