Last modified: 2014-03-11 11:49:40 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T48197, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 46197 - Dump stats: automated validation of new monthly dump stats
Status: NEW
Product: Analytics
Classification: Unclassified
Component: Wikistats
Version: unspecified
Hardware: All All
Importance: Low normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2013-03-16 12:20 UTC by Erik Zachte
Modified: 2014-03-11 11:49 UTC (History)

Description Erik Zachte 2013-03-16 12:20:23 UTC
21 Sep, 2012 Erik Zachte:
New script to compare CSV output from different months (and halt publishing if differences exceed a threshold).

What I have done since 21 Sep is this:
New reports are first published in a draft folder.
After manual vetting, reports are generated in 26 languages and published in the final folder.

Comment 2 Erik Zachte 2014-03-11 11:49:40 UTC
Would be nice, but it would be incomplete at best and possibly a major task, depending on ambition level: there are many metrics, and the plausible rate of change could differ per project. It could also create lots of false positives, depending on the thresholds chosen.

BTW, manual vetting is mostly a quick comparison of key metrics in old and new reports for some wikis (mostly English Wikipedia) to see if these metrics are within the expected ballpark. Other than that, many eyeballs keep Wikistats under scrutiny.

Background: Several years ago there was a major bug that caused all article counts to be twice as high (redirects were not recognized, or something of that nature). Given that Wikistats regenerates all historic months on every run, that gave the impression of a complete overhaul of Wikistats methodology and caused some turmoil. Given the high stability of the Wikistats scripts (after 10 years of operation, few operational errors have skewed stats), the fact that any software can fail in changing circumstances came as a surprise to some users.
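
For illustration only, a minimal sketch of what such a check could look like. It is not the Wikistats code. Because every run regenerates all historic months, the sketch compares the months that appear in both the previous and the new run and flags any value that moved more than a threshold. The CSV layout (`project,month,value`), the function names, the 10% default and the per-project override table are all assumptions made for this example.

```python
import csv
import io

# Assumed default: flag any historic value that moved by more than 10%.
DEFAULT_THRESHOLD = 0.10


def load_counts(csv_text):
    """Parse rows of a hypothetical 'project,month,value' CSV
    into a dict keyed by (project, month)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {(row["project"], row["month"]): float(row["value"]) for row in reader}


def find_suspect_changes(old, new, thresholds=None, default=DEFAULT_THRESHOLD):
    """Return (project, month, old, new, relative_change) tuples for every
    month present in both runs whose value changed more than allowed.

    `thresholds` optionally maps a project name to its own limit, since the
    plausible rate of change may differ per project.
    """
    thresholds = thresholds or {}
    suspects = []
    # Only months present in both runs are comparable; months that are
    # new in this run have no baseline and are skipped.
    for key in sorted(old.keys() & new.keys()):
        project, month = key
        before, after = old[key], new[key]
        if before == 0:
            # No meaningful ratio: treat any move away from zero as suspect.
            rel = 0.0 if after == 0 else float("inf")
        else:
            rel = abs(after - before) / abs(before)
        if rel > thresholds.get(project, default):
            suspects.append((project, month, before, after, rel))
    return suspects


def may_publish(old, new, thresholds=None):
    """True if nothing was flagged, i.e. the draft could go to the final folder."""
    return not find_suspect_changes(old, new, thresholds)
```

A doubling of article counts for an already published month, as in the incident described above, would be flagged by `find_suspect_changes`, and `may_publish` would return `False`. A single relative threshold is the simplest possible rule. As noted in this comment, it would still cover only some metrics and could produce false positives, so the limits would need tuning per project and per metric.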
