Last modified: 2014-04-14 19:28:06 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the display of bug reports and their history, links might be broken. See T65879, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 63879 - Incomplete monthly aggregated page view files
Status: RESOLVED FIXED
Product: Analytics
Classification: Unclassified
Component: Wikistats
Version: unspecified
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2014-04-13 15:48 UTC by Erik Zachte
Modified: 2014-04-14 19:28 UTC
CC List: 4 users

Description Erik Zachte 2014-04-13 15:48:41 UTC
See folder http://dumps.wikimedia.org/other/pagecounts-ez/merged/

Monthly files for 2014-02 and 2014-03 are 1.7 and 2.1 GB instead of the usual 4.5 GB.

Alex Druk: I compared the Jan and Mar aggregated data files. As you can see from the enclosed data, many projects (eo-ps) are missing in March.
Always ready to help...
Comment 1 Bingle 2014-04-13 15:50:18 UTC
Prioritization and scheduling of this bug is tracked on Mingle card https://wikimedia.mingle.thoughtworks.com/projects/analytics/cards/cards/1540
Comment 2 Erik Zachte 2014-04-13 15:57:40 UTC
Problem analyzed: earlier this week I re-enabled the job dammit_compact_daily.sh, which had not run since the dumps server was migrated. So it had to run a lengthy update cycle, generating some 20 daily files.

After all daily dumps have been generated, the monthly aggregation script dammit_compact_monthly.sh is invoked. This should only find work to do once a month.

But because dammit_compact_daily.sh had so much catching up to do, the last step, dammit_compact_monthly.sh, was still running 24 hours later when the next daily cron job started. That run did not find the monthly files and so also started the monthly aggregation phase. Clearly this monthly step should be protected against multiple instances.
Comment 3 Erik Zachte 2014-04-13 16:54:21 UTC
Protected dammit_compact_daily.sh and dammit_compact_monthly.sh against multiple concurrent invocations with flock.
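
For readers unfamiliar with this kind of guard, here is a minimal sketch of how a script can be shielded with flock(1) from util-linux. It is not the actual change made to the dammit scripts: the function name run_exclusive, the exit code 75 and any lock file path are assumptions chosen for illustration.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; run_exclusive and the exit code 75 are
# hypothetical and not taken from the real dammit_compact_*.sh scripts.

# Run a command only if no other process currently holds the lock file.
# Usage: run_exclusive LOCKFILE COMMAND [ARGS...]
# Returns 75 without running COMMAND when the lock is already held,
# otherwise returns the exit status of COMMAND.
run_exclusive() {
    local lockfile="$1"
    shift
    (
        # File descriptor 9 stays open on the lock file for the life of
        # this subshell. -n makes flock fail at once instead of waiting.
        flock -n 9 || exit 75
        "$@"
    ) 9>"$lockfile"
}
```

A cron entry could then wrap the job, for example run_exclusive /tmp/dammit_compact_monthly.lock ./dammit_compact_monthly.sh (the lock path here is hypothetical). The lock is released automatically when the subshell exits, even if the job crashes, so no stale lock file cleanup is needed.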
Comment 4 Toby Negrin 2014-04-14 19:11:25 UTC
Hi Erik -- how can we confirm this is fixed? IIRC you confirmed that this fix worked for a separate bug.

thanks,

-Toby
Comment 5 Erik Zachte 2014-04-14 19:28:06 UTC
Toby, the files have been regenerated properly. And I tested the new shielding with 'flock' against concurrent runs. So I will close this bug now. Cheers, Erik
