Last modified: 2012-03-21 22:35:33 UTC
During a #wikimedia-strategy brainstorming session (regarding a Question of the Week: "What changes to Wikimedia's technology would enable a friendlier and more welcoming environment?") it was suggested that the main issues could be addressed thusly:

[20:20] <jimmyps> we could address the first two (tallest) bars on http://strategy.wikimedia.org/wiki/File:091207_QOTW.png simply by publicizing statistics from http://stats.grok.se
[20:21] <eekim> jimmyps: the key question is, how would you publicize it, and how would you measure if you were being effective?
[20:22] <jimmyps> eekim: for each article find the top 10 articles also in its categories and list them in order on the sidebar after the interwikis with "x,xxx views/month" right-justified on every other line after each of the 10
[20:23] <jimmyps> that would indicate to people the most popular subjects that they are also interested in
[20:23] <jimmyps> this could be done in batch mode
[20:25] <jimmyps> does anyone disagree that listing the most popular "related articles" with their viewership counts on the sidebar after the interwikis would address the largest leftmost two bars on http://strategy.wikimedia.org/wiki/File:091207_QOTW.png ?

(no disagreements were forthcoming)

Would someone who understands what is and is not possible with bots and MediaWiki please comment on the feasibility of this proposal? Thank you. 99.62.186.125 (talk) 04:49, 9 December 2009 (UTC)

It would be possible; the best method would probably be a toolserver account with a JavaScript function that retrieves the data from the toolserver once we set the rules for what is and is not related. βcommand 04:52, 9 December 2009 (UTC)

Even better would be to have a statistics tab next to history, with graphs of metrics like readability, byte size, HTML size, word count, number of references, incoming link count (backlinks), outgoing link count (links), traffic statistics, and possibly something like history flow, and maybe the ability to compare with other pages. If the caching is done right it could be done on the toolserver. — Dispenser 05:41, 9 December 2009 (UTC)

Perhaps 'related' is everything wikilinked and everything in the same categories? 99.62.186.125 (talk) 18:29, 9 December 2009 (UTC)

The same algorithm as related changes, I would say. Rich Farmbrough, 09:20, 15 December 2009 (UTC).

Note: per http://stats.grok.se/about the statistics are not from the toolserver; they are from http://dammit.lt/wikistats/. Per Brion, the upstream data is from a Wikimedia-internal source.
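
The "top 10 articles also in its categories" idea from the discussion above could look roughly like the following Python sketch. It is only an illustration: the function names, the three in-memory dictionaries, and the tie-breaking by title are assumptions made for the example, not anything decided in the thread, and a real tool would have to fetch category membership and view counts from somewhere.

```python
from typing import Dict, Iterable, List, Tuple


def top_related(
    article: str,
    article_categories: Dict[str, Iterable[str]],   # hypothetical: article -> its categories
    category_members: Dict[str, Iterable[str]],     # hypothetical: category -> its articles
    monthly_views: Dict[str, int],                   # hypothetical: article -> views per month
    limit: int = 10,
) -> List[Tuple[str, int]]:
    """Return up to `limit` articles sharing a category with `article`,
    most viewed first (ties broken alphabetically)."""
    related = set()
    for category in article_categories.get(article, ()):
        related.update(category_members.get(category, ()))
    related.discard(article)  # an article is not "related" to itself
    ranked = sorted(related, key=lambda t: (-monthly_views.get(t, 0), t))
    return [(title, monthly_views.get(title, 0)) for title in ranked[:limit]]


def sidebar_lines(entries: Iterable[Tuple[str, int]]) -> List[str]:
    """Alternate title lines with 'x,xxx views/month' lines, as proposed."""
    lines = []
    for title, views in entries:
        lines.append(title)
        lines.append(f"{views:,} views/month")
    return lines
```

Swapping in the "everything wikilinked" or related-changes definition of "related" would only change how the candidate set is built; the ranking step stays the same.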
*** Bug 6689 has been marked as a duplicate of this bug. ***
After some discussion on #mediawiki, this is what I gathered: pageview tracking is disabled on Wikipedia for performance reasons. Erik Zachte already does some analysis of Squid logs, but I am not sure about the accuracy, frequency and technical details, so I am adding Roan and Erik to this thread :)
(#mediawiki) rtmprus: domas: is bug 21921 with a round-robin iteration of article space more appropriate for the toolserver, dammit.lt, or somewhere else?
[4:12pm] Platonides: I don't see any work there for dammit.lt
[4:13pm] rtmprus: I don't want to pound its bandwidth if local copies of popularity logs aren't available on the toolserver
[4:13pm] Platonides: it would be a work for the toolserver
[4:13pm] Platonides: or http://stats.grok.se if he wants to do it
[4:13pm] Platonides: the toolserver already downloads copies
[4:13pm] Platonides: they are at a common folder
...
rtmprus: oh good
Platonides: I don't completely understand the algorithm they propose, but it surely can be done
[4:17pm] rtmprus: someone suggested members of the same categories and wikilinks, and someone else suggested the Special:RecentChangesLinked algorithm
Thanks to mikelifeguard, https://wiki.toolserver.org/view/User-store says squid traffic logs live in /mnt/user-store/ on the toolserver.
[12:57] <mikelifeguard> jps: river said "A user has made this available in raw form at /mnt/user-store/stats"
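
For anyone picking this up, here is a rough Python sketch of how raw hourly counts might be summed per article. The thread does not say what format the files under /mnt/user-store/stats use; the sketch assumes the space-separated "project title count bytes" layout of the hourly pagecounts dumps, with percent-encoded titles, and both that assumption and the function name are illustrative only.

```python
from collections import Counter
from typing import Iterable
from urllib.parse import unquote


def aggregate_views(lines: Iterable[str], project: str = "en") -> Counter:
    """Sum request counts per title for one project.

    Assumes each line looks like: "<project> <title> <count> <bytes>".
    Lines that do not match this assumed layout are skipped.
    """
    totals = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 4 or parts[0] != project:
            continue
        title, count = parts[1], parts[2]
        if not count.isdigit():
            continue
        totals[unquote(title)] += int(count)  # decode %XX escapes in titles
    return totals
```

Feeding it the lines of a month's worth of hourly files, one file after another, would give the per-article monthly totals that the sidebar proposal needs.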