Last modified: 2014-07-16 20:05:54 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T61943, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 59943 - Fix all the Wikia stats
Status: NEW
Product: Wikimedia Labs
Classification: Unclassified
Component: wikistats
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Zahn
Depends on:
Blocks: 36291
Reported: 2014-01-11 13:33 UTC by LWChris
Modified: 2014-07-16 20:05 UTC (History)

Description LWChris 2014-01-11 13:33:15 UTC
All Wikia stats on http://wikistats.wmflabs.org/largest_html.php have failed to refresh since 2012-04-20. After 631 days, it's about time to finally fix this.

Either use api.php?action=query&meta=siteinfo&siprop=statistics&format=xml
or Special:Statistics?action=raw

Examples:
http://lyrics.wikia.com/api.php?action=query&meta=siteinfo&siprop=statistics&format=xml
http://lyrics.wikia.com/Special:Statistics?action=raw
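
For illustration only, here is a minimal Python sketch of reading either endpoint. The function names are made up for this example, the sample responses in the usage note are invented, and the parsers assume the usual response shapes (a `<statistics>` element with numeric attributes for the API, semicolon-separated `key=value` pairs for `action=raw`). It is not the code used by wikistats.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def siteinfo_url(base):
    """Build the api.php statistics URL for a wiki base URL."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "statistics",
        "format": "xml",
    }
    return base.rstrip("/") + "/api.php?" + urlencode(params)


def parse_api_xml(text):
    """Read the numeric attributes of <api><query><statistics .../>."""
    node = ET.fromstring(text).find("./query/statistics")
    if node is None:
        raise ValueError("no <statistics> element in response")
    return {k: int(v) for k, v in node.attrib.items() if v.isdigit()}


def parse_raw(text):
    """Read the assumed 'key=value;key=value' form of action=raw."""
    stats = {}
    for pair in text.strip().split(";"):
        key, sep, value = pair.partition("=")
        if sep and value.isdigit():
            stats[key] = int(value)
    return stats
```

With an invented response such as `<api><query><statistics pages="10" articles="4" /></query></api>`, `parse_api_xml` returns a plain dict of integers; `parse_raw("total=10;good=4")` does the same for the raw form.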

I think it shouldn't be very difficult to update the script.

Thanks in advance.

LWChris, Admin@LyricWiki
Comment 1 Nemo 2014-01-12 09:07:36 UTC
The stats are not broken; they were stopped (as far as I know) because Wikia complained about the number of requests to their API. I suggest you write to community@wikia.com and ask them to comment somewhere (e.g. on this bug) that it's OK to make requests to their API.
Comment 2 Daniel Zahn 2014-01-16 15:22:26 UTC
yea, we just stopped them. and it seemed too extreme to update them even just once in 24hrs because there are just so many and we also did not have a good way to sync the list of current existing wikis. the plan was always to maybe ask Wikia if it's possible to provide all the stats from their DB somehow. no idea if that's easy for them or hard. but it would avoid tens of thousands of requests to each single API
Comment 3 Nemo 2014-01-21 12:07:28 UTC
So, after navigating the obscure repo tree, I finally found the file responsible for updates: http://git.wikimedia.org/blob/operations%2Fdebs%2Fwikistats.git/HEAD/usr%2Flib%2Fwikistats%2Fupdate.php

If I understand correctly, the update is run by a cron job every 24 hours. The simplest change I can think of is:
1) add a sleep time of 1 second between consecutive requests;
2) if a table has 1000 wikis or fewer (or is "mediawikis"), update them all;
3) if a table has more than 1000 wikis, update only 1000-1500 per run, in this way:
  a) start from those whose last update was earlier,
  b) first update up to 500 wikis with more than 100 articles,
  c) then update up to 1000 of the other wikis.
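
The steps above could be sketched roughly as follows, in Python rather than the PHP of update.php. The `Wiki` record, the function names and the `fetch` callback are hypothetical and exist only for this illustration; the thresholds are the ones proposed in the list.

```python
import time
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Wiki:  # hypothetical record, not the real table schema
    name: str
    articles: int
    last_update: datetime


def select_for_update(wikis, update_all=False, small_table=1000,
                      big_quota=500, other_quota=1000, big_threshold=100):
    """Pick the wikis to refresh in one run."""
    # Step 2: small tables (or exempt ones such as "mediawikis") in full.
    if update_all or len(wikis) <= small_table:
        return list(wikis)
    # Step 3a: least recently updated first.
    oldest_first = sorted(wikis, key=lambda w: w.last_update)
    # Step 3b: up to 500 wikis with more than 100 articles.
    big = [w for w in oldest_first if w.articles > big_threshold]
    # Step 3c: then up to 1000 of the remaining wikis.
    others = [w for w in oldest_first if w.articles <= big_threshold]
    return big[:big_quota] + others[:other_quota]


def run_updates(selected, fetch, delay=1.0, sleep=time.sleep):
    """Step 1: call fetch(wiki) for each wiki, pausing between requests."""
    for index, wiki in enumerate(selected):
        if index > 0:
            sleep(delay)
        fetch(wiki)
```

Passing `sleep` as a parameter only makes the sketch easy to try without real waiting; a cron-driven script would simply use the default.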

In this way we would update the whole Wikia table in a month (or one year once it's completely filled) but have data 10 or so days old at most for the bigger wikis. And the cron would always run in a reasonable time.
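
As a rough check of those estimates, a small Python sketch. The table sizes in the usage note are purely hypothetical, picked only because they reproduce the "month" and "10 or so days" figures; the real counts are not stated here.

```python
import math


def days_for_full_pass(wiki_count, per_run):
    """Daily runs needed to touch every wiki once, at per_run wikis a day."""
    return math.ceil(wiki_count / per_run)


def sleep_seconds_per_run(requests, delay_seconds=1):
    """Total time spent sleeping between consecutive requests in one run."""
    return max(requests - 1, 0) * delay_seconds
```

For example, a hypothetical 5000 wikis above 100 articles at 500 per run gives `days_for_full_pass(5000, 500) == 10`, and a hypothetical 30000 other wikis at 1000 per run gives 30 days. A full run of 1500 requests adds `sleep_seconds_per_run(1500) == 1499` seconds of sleeping, i.e. about 25 minutes on top of the request time itself.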
Comment 4 Gerrit Notification Bot 2014-01-21 12:54:59 UTC
Change 108670 had a related patch set uploaded by Nemo bis:
[s23.org wikistats] Throttle updates for big farms, keep updating big wikis' stats

https://gerrit.wikimedia.org/r/108670
Comment 5 Ricordisamoa 2014-02-11 17:23:35 UTC
The patch has been merged, but I'm still seeing outdated statistics.
Comment 6 Nemo 2014-02-11 17:27:27 UTC
Yes, as noted in the commit message, the patch doesn't actually enable updates; it just paves the way for them. I suppose it's a hardcoded crontab, unless there's some other repo I missed.
