Last modified: 2014-02-25 12:26:49 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T62955, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 60955 - No sampled-1000 tsv file for 2014-02-06 on stat1002
Status: RESOLVED FIXED
Product: Analytics
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: nuria
Depends on:
Blocks:
Reported: 2014-02-06 12:17 UTC by christian
Modified: 2014-02-25 12:26 UTC (History)

Description christian 2014-02-06 12:17:30 UTC
Although new tsv files typically appear some time after 08:00 on stat1002,
today's (2014-02-06) sampled-1000 tsv file is not yet there [1] some 4 hours
afterwards.

Due to the file not being there, daily jobs that rely on the daily files
being in place break. For example, today's wikipedia-zero run just failed.




[1]
___________________________________________________________
qchris@stat1002 // 0 // 12:08:05                                  
cwd: ~
ll /a/squid/archive/sampled/sampled-1000.tsv.log-201402*
-rw-r--r-- 1 stats stats 646898016 Feb  1 06:25 /a/squid/archive/sampled/sampled-1000.tsv.log-20140201.gz
-rw-r--r-- 1 stats stats 646896549 Feb  2 06:25 /a/squid/archive/sampled/sampled-1000.tsv.log-20140202.gz
-rw-r--r-- 1 stats stats 739443262 Feb  3 06:25 /a/squid/archive/sampled/sampled-1000.tsv.log-20140203.gz
-rw-r--r-- 1 stats stats 728781897 Feb  4 06:25 /a/squid/archive/sampled/sampled-1000.tsv.log-20140204.gz
-rw-r--r-- 1 stats stats 723531018 Feb  5 06:25 /a/squid/archive/sampled/sampled-1000.tsv.log-20140205.gz
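A job that depends on the daily file could check for it before starting instead of failing midway. The following is a minimal sketch, not the code any of the jobs actually use: the directory and file name pattern are taken from the listing above, while `expected_name` and `missing_days` are hypothetical helper names.

```python
import datetime
import os

# Directory taken from the listing above.
ARCHIVE_DIR = "/a/squid/archive/sampled"


def expected_name(day):
    """Build the file name used for a given day, e.g. ...log-20140206.gz."""
    return "sampled-1000.tsv.log-{:%Y%m%d}.gz".format(day)


def missing_days(directory, first, last):
    """Return every date from first to last (inclusive) that has no file."""
    missing = []
    day = first
    while day <= last:
        if not os.path.isfile(os.path.join(directory, expected_name(day))):
            missing.append(day)
        day += datetime.timedelta(days=1)
    return missing
```

Calling `missing_days(ARCHIVE_DIR, datetime.date(2014, 2, 1), datetime.date(2014, 2, 6))` at the time of the listing above would have reported 2014-02-06 as the only gap.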
Comment 1 christian 2014-02-06 12:19:32 UTC
Checking on emery shows that the sampled-1000 file is ready on emery.
Let's see if it syncs over tomorrow.
Comment 2 Bingle 2014-02-06 12:20:28 UTC
Prioritization and scheduling of this bug is tracked on Mingle card https://wikimedia.mingle.thoughtworks.com/projects/analytics/cards/cards/1432
Comment 3 christian 2014-02-06 13:03:23 UTC
Going through recent changes in the puppet repo, it seems the merge of
e405e7e622cb1a275b3c2acb65aad0ee4c1d7729 is related:
https://gerrit.wikimedia.org/r/#/c/110382
Comment 4 Toby Negrin 2014-02-06 20:55:36 UTC
Nuria has made some changes in the mingle ticket to fix this; will confirm tomorrow that the issue is resolved.
Comment 5 Gerrit Notification Bot 2014-02-08 13:40:07 UTC
Change 112233 had a related patch set uploaded by QChris:
Mark 2014-02-05, and 2014-02-06 as bad dates

https://gerrit.wikimedia.org/r/112233
Comment 6 nuria 2014-02-10 17:21:43 UTC
Verified that there are sampled logs for the last couple of days:

719726187 Feb  9 06:25 sampled-1000.tsv.log-20140209.gz
811851138 Feb 10 06:25 sampled-1000.tsv.log-20140210.gz


Bug can be closed.
Comment 7 christian 2014-02-10 23:22:13 UTC
Good to see the new files on stat1002 again! Thanks.

But since this was not an outage but a filter change, we should have
good tsvs lying around on emery for stat1002's missing/bad files.
Could you bring the good files over so we can rerun the jobs that
failed due to the missing/bad tsvs?

Also, could you please update the corresponding row in
documentation table on
  https://wikitech.wikimedia.org/wiki/Analytics/Requests_stream
?
Comment 8 nuria 2014-02-13 10:45:20 UTC
I will put in an R2 request to get access to emery
Comment 9 nuria 2014-02-13 10:46:17 UTC
Sorry, "RT" request.
Comment 10 nuria 2014-02-14 15:06:18 UTC
Missing file has been restored.
Comment 11 christian 2014-02-17 10:54:18 UTC
(In reply to nuria from comment #10)
> Missing file has been restored.

No more missing tsvs. Thanks!

But comment 7 mentions two other things, which did not yet
happen. Hence, reopening.

(In reply to christian from comment #7)
> [ bad files ]

Just doing a plain “ll” on stat1002, the file for 2014-02-07 looks wrong: at roughly 1 GB it is much larger than its neighbors.

___________________________________________________________
qchris@stat1002 // 0 // 10:43:43
cwd: ~
cd /a/squid/archive/sampled ; ll sampled-1000.tsv.log-2014020*
-rw-r--r-- 1 stats stats  646898016 Feb  1 06:25 sampled-1000.tsv.log-20140201.gz
-rw-r--r-- 1 stats stats  646896549 Feb  2 06:25 sampled-1000.tsv.log-20140202.gz
-rw-r--r-- 1 stats stats  739443262 Feb  3 06:25 sampled-1000.tsv.log-20140203.gz
-rw-r--r-- 1 stats stats  728781897 Feb  4 06:25 sampled-1000.tsv.log-20140204.gz
-rw-r--r-- 1 stats stats  723531018 Feb  5 06:25 sampled-1000.tsv.log-20140205.gz
-rw-r--r-- 1 stats stats  728198594 Feb  6 06:25 sampled-1000.tsv.log-20140206.gz
-rw-r--r-- 1 stats stats 1022924002 Feb  7 06:25 sampled-1000.tsv.log-20140207.gz
-rw-r--r-- 1 stats stats  753216484 Feb  8 06:25 sampled-1000.tsv.log-20140208.gz
-rw-r--r-- 1 stats stats  719726187 Feb  9 06:25 sampled-1000.tsv.log-20140209.gz
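The same eyeball check can be automated by comparing each file size against the median. This is only a sketch: the sizes are copied from the listing above, `flag_size_outliers` is a hypothetical helper, and the 25% tolerance is an arbitrary assumption rather than an agreed threshold.

```python
from statistics import median

# File sizes in bytes, copied from the listing above.
SIZES = {
    "sampled-1000.tsv.log-20140201.gz": 646898016,
    "sampled-1000.tsv.log-20140202.gz": 646896549,
    "sampled-1000.tsv.log-20140203.gz": 739443262,
    "sampled-1000.tsv.log-20140204.gz": 728781897,
    "sampled-1000.tsv.log-20140205.gz": 723531018,
    "sampled-1000.tsv.log-20140206.gz": 728198594,
    "sampled-1000.tsv.log-20140207.gz": 1022924002,
    "sampled-1000.tsv.log-20140208.gz": 753216484,
    "sampled-1000.tsv.log-20140209.gz": 719726187,
}


def flag_size_outliers(sizes, tolerance=0.25):
    """Return names whose size differs from the median by more than tolerance."""
    mid = median(sizes.values())
    return sorted(
        name for name, size in sizes.items()
        if abs(size - mid) > tolerance * mid
    )


print(flag_size_outliers(SIZES))
# prints ['sampled-1000.tsv.log-20140207.gz']
```

A size check like this only hints at a problem; it cannot tell whether the content of a file is actually bad.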


(In reply to christian from comment #7)
> Also, could you please update the corresponding row in
> documentation table on
>   https://wikitech.wikimedia.org/wiki/Analytics/Requests_stream
> ?

This has not happened either.
Comment 12 christian 2014-02-20 12:57:34 UTC
Nuria pinged me that the 20140207 file is available on stat1002.
Thanks for the file!

The new file seems to be missing a few lines, but since it is a sampled file
anyway, and the drop is way below the usual network drop, it's fine by
me.
Comment 13 christian 2014-02-21 11:52:18 UTC
The good file for 20140207 is gone again, and we're back with the ~1GB
file for 20140207 :-(

@Nuria, are you sure you copied the fixed file to the rsync source as
we said yesterday in IRC:
  Feb 20 12:25:28 <qchris>        Afterwards get it to the rsync source.
?
Comment 14 nuria 2014-02-21 18:15:03 UTC
Sorry, I misunderstood rsync source, I thought you were talking about the directory in stat1002. It is fixed now so the file should not change going forward.
Comment 15 christian 2014-02-22 19:29:42 UTC
(In reply to nuria from comment #14)
> Sorry, I misunderstood rsync source, [...]

No worries.
The file is there again \o/
Thanks!

The file owner changed, but meh. It is world-readable anyway :-)

But could you please clean up the left-behind cruft file, as that
would sooner or later get in the way when doing ad hoc checks?
Comment 16 nuria 2014-02-24 15:14:35 UTC
All deleted and good.
Comment 17 christian 2014-02-25 10:59:18 UTC
@Nuria: The files now look good. Thanks \o/

But again ... could you please also update the corresponding
row in documentation table on
  https://wikitech.wikimedia.org/wiki/Analytics/Requests_stream
?
Comment 18 christian 2014-02-25 12:26:49 UTC
Documentation has been updated by Nuria.
