Last modified: 2014-04-21 18:44:32 UTC
While the number of lines in the zero tsv showed only a very small weekly pattern recently, we now see a drop from 250K-260K lines per file before the 2014-02-08 file to 230K-240K lines per file afterwards [1]. However, it does not seem to affect pageviews much.

Do we know if something occurred, or is this just expected change?

[1]
___________________________________________________________
qchris@stat1002 // 0 // 17:40:00
cwd: ~
for f in /a/squid/archive/zero/zero.tsv.log-201402* ; do echo "$f $(zcat $f | wc -l)" ; done
/a/squid/archive/zero/zero.tsv.log-20140201.gz 2524492
/a/squid/archive/zero/zero.tsv.log-20140202.gz 2576161
/a/squid/archive/zero/zero.tsv.log-20140203.gz 2630024
/a/squid/archive/zero/zero.tsv.log-20140204.gz 2561174
/a/squid/archive/zero/zero.tsv.log-20140205.gz 2598546
/a/squid/archive/zero/zero.tsv.log-20140206.gz 2603902
/a/squid/archive/zero/zero.tsv.log-20140207.gz 2549100
/a/squid/archive/zero/zero.tsv.log-20140208.gz 2365688
/a/squid/archive/zero/zero.tsv.log-20140209.gz 2345631
/a/squid/archive/zero/zero.tsv.log-20140210.gz 2446156
/a/squid/archive/zero/zero.tsv.log-20140211.gz 2389240
/a/squid/archive/zero/zero.tsv.log-20140212.gz 2410005
Hi Dan -- I think you've been corresponding with Christian on this. Ideally I'd like to prioritize this under the dashboards so we can get those done. Please advise. thanks, -Toby
Prioritization and scheduling of this bug is tracked on Mingle card https://wikimedia.mingle.thoughtworks.com/projects/analytics/cards/cards/1446
Meh. Stupid me. I lost a digit :-(

(In reply to christian from comment #0)
> we now see a drop from 250K-260K lines per
> file before the 2014-02-08 file to 230K-240K lines [...]

250K-260K is actually 2.5M-2.6M.
230K-240K is actually 2.3M-2.4M.
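To put a single number on the drop rather than reading a range off the listing, the per-file counts can be averaged on either side of a cutoff date. The following is only a sketch: `mean_before_after` is a hypothetical helper name, and it assumes each input line starts with a file name containing `log-YYYYMMDD` and ends with the count, as in the listing from comment #0.

```shell
#!/bin/sh
# Hypothetical helper (not part of any existing tooling).
# Usage: mean_before_after CUTOFF   (CUTOFF as YYYYMMDD)
# Reads lines of the form "<path> <count>" or "<path> ; <count>" on stdin
# and prints the truncated mean count before the cutoff, the truncated mean
# from the cutoff onwards, and the difference of the two means.
mean_before_after() {
    awk -v cutoff="$1" '
    {
        # Skip lines whose first field has no "log-" followed by 8 digits.
        if (match($1, /log-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/) == 0) next
        day = substr($1, RSTART + 4, 8)
        # Force numeric comparison of the two dates.
        if (day + 0 < cutoff + 0) { before += $NF; nb++ }
        else                      { after  += $NF; na++ }
    }
    END {
        if (nb == 0 || na == 0) {
            print "need data on both sides of the cutoff" > "/dev/stderr"
            exit 1
        }
        printf "before=%d after=%d drop=%d\n", before / nb, after / na, before / nb - after / na
    }'
}
```

Assuming the same archive layout as above, it could be fed directly from the original loop:

    for f in /a/squid/archive/zero/zero.tsv.log-201402* ; do echo "$f $(zcat $f | wc -l)" ; done | mean_before_after 20140208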
(Bringing information from some private emails back to the bug)

Adam had a look and it seems there was a related drop in opensearch (~160K requests/day [1]) [2], but also an increase in pageimage (~60K requests/day [3]). That accounts in total for a drop of ~100K requests/day.

So ~100K requests/day of the ~200K requests/day are still looking for an explanation.

[1]
___________________________________________________________
qchris@stat1002 // 0 // 15:39:31
cwd: ~/tmp/410-01-2014-03-13
for i in /a/squid/archive/zero/zero.tsv.log-201402* ; do echo "$i ; $(zcat $i | cut -f 9 | grep -c opensearch)" ; done
[...]
/a/squid/archive/zero/zero.tsv.log-20140204.gz ; 268693
/a/squid/archive/zero/zero.tsv.log-20140205.gz ; 280478
/a/squid/archive/zero/zero.tsv.log-20140206.gz ; 284839
/a/squid/archive/zero/zero.tsv.log-20140207.gz ; 239042
/a/squid/archive/zero/zero.tsv.log-20140208.gz ; 126370
/a/squid/archive/zero/zero.tsv.log-20140209.gz ; 128800
/a/squid/archive/zero/zero.tsv.log-20140210.gz ; 131217
/a/squid/archive/zero/zero.tsv.log-20140211.gz ; 120413
[...]

[2] The change in search was attributed to:
https://gerrit.wikimedia.org/r/#q,I1c657fd46a2b5be4f27aa508a5cc0d946d6b98a8,n,z

[3]
___________________________________________________________
qchris@stat1002 // 0 // 15:39:18
cwd: ~
for i in /a/squid/archive/zero/zero.tsv.log-201402* ; do echo "$i ; $(zcat $i | cut -f 9 | grep -c pageimage)" ; done
[...]
/a/squid/archive/zero/zero.tsv.log-20140204.gz ; 296
/a/squid/archive/zero/zero.tsv.log-20140205.gz ; 71
/a/squid/archive/zero/zero.tsv.log-20140206.gz ; 69
/a/squid/archive/zero/zero.tsv.log-20140207.gz ; 18410
/a/squid/archive/zero/zero.tsv.log-20140208.gz ; 63953
/a/squid/archive/zero/zero.tsv.log-20140209.gz ; 64382
/a/squid/archive/zero/zero.tsv.log-20140210.gz ; 64972
/a/squid/archive/zero/zero.tsv.log-20140211.gz ; 57304
[...]
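When hunting for the remaining ~100K requests/day, it may help to count several candidate terms in one pass over each file instead of decompressing it once per term. A minimal sketch follows; `count_field9_terms` is a hypothetical helper, and it assumes (as the `cut -f 9` calls above do) that the file is tab-separated with the request URL in field 9. Terms are matched as literal substrings, which for plain words such as opensearch and pageimage gives the same counts as `grep -c`.

```shell
#!/bin/sh
# Hypothetical helper (not part of any existing tooling).
# Usage: count_field9_terms TERM [TERM...]
# Reads tab-separated log lines on stdin and prints "<term> <count>" for
# each TERM, where <count> is the number of lines whose 9th field contains
# TERM as a literal substring. Terms must not contain spaces.
count_field9_terms() {
    awk -F '\t' -v terms="$*" '
    BEGIN { n = split(terms, t, " ") }
    {
        for (i = 1; i <= n; i++)
            if (index($9, t[i]) > 0) c[t[i]]++
    }
    END {
        # Print in the order the terms were given; unseen terms print 0.
        for (i = 1; i <= n; i++) printf "%s %d\n", t[i], c[t[i]]
    }'
}
```

Usage for a single day, assuming the same archive path as above:

    zcat /a/squid/archive/zero/zero.tsv.log-20140208.gz | count_field9_terms opensearch pageimage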