Last modified: 2013-10-21 22:01:50 UTC
Editors get geocoded again and again for each day they are considered active. So if the GeoIP database gets updated, active editors will jump to a different country from one day to the next.

So consider I edit 5 pages on enwiki on 2013-08-10 using an IP address that's associated with Italy at that point in time. On 2013-08-29 the IP->Country database gets updated, and that same IP is now marked as United States. The scripts would count me as active for United States from that day on.

+------------+---------+---------------+
| Date       | Austria | United States |
+------------+---------+---------------+
| 2013-08-11 | *       |               |
| 2013-08-12 | *       |               |
| ...        | *       |               |
| 2013-08-28 | *       |               |
| 2013-08-29 |         | *             |
| 2013-08-30 |         | *             |
| ...        |         | *             |
| 2013-09-10 |         | *             |
+------------+---------+---------------+
Mhmm. That table should have been

+------------+---------+---------------+
| Date       | Italy   | United States |
+------------+---------+---------------+
| 2013-08-11 | *       |               |
| 2013-08-12 | *       |               |
| ...        | *       |               |
| 2013-08-28 | *       |               |
| 2013-08-29 |         | *             |
| 2013-08-30 |         | *             |
| ...        |         | *             |
| 2013-09-10 |         | *             |
+------------+---------+---------------+
That behavior sounds sane enough to me. Do you recommend something alternative?
(In reply to comment #2)
> That behavior sounds sane enough to me. Do you recommend something
> alternative?

I'd expect a table like:

+------------+---------+---------------+
| Date       | Italy   | United States |
+------------+---------+---------------+
| 2013-08-11 | *       |               |
| 2013-08-12 | *       |               |
| ...        | *       |               |
| 2013-08-28 | *       |               |
| 2013-08-29 | *       |               |
| 2013-08-30 | *       |               |
| ...        | *       |               |
| 2013-09-10 | *       |               |
+------------+---------+---------------+

If the 5 edits came from an Italian IP address when they were made, those 5 edits should count for Italy for the whole 30-day period, regardless of where the IP "wanders" afterwards.

Of course, if I make 5 additional edits from the very same address on 2013-09-05 (that's after the Italy->US switch), the table should become:

+------------+---------+---------------+
| Date       | Italy   | United States |
+------------+---------+---------------+
| 2013-08-11 | *       |               |
| 2013-08-12 | *       |               |
| ...        | *       |               |
| 2013-08-28 | *       |               |
| 2013-08-29 | *       |               |
| 2013-08-30 | *       |               |
| ...        | *       |               |
| 2013-09-04 | *       |               |
| 2013-09-05 | *       | *             |
| 2013-09-06 | *       | *             |
| ...        | *       | *             |
| 2013-09-10 | *       | *             |
+------------+---------+---------------+

On 2013-09-10, I'd be considered an active editor in Italy due to the first 5 edits, made when the IP was associated with Italy. Additionally, I'd be considered an active editor for the United States due to the latter edits. (That would match the behaviour we're seeing if I do the first 5 edits on an IP that's always Italy, and the second 5 edits on an IP that's always United States.)
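To make the difference between the two behaviours concrete, here is a minimal Python sketch. It is not the actual geocoding script: `lookup_country`, the IP address, the 5-edit threshold, and the exact window boundaries (an edit counts for the 30 report days following it) are all assumptions chosen to mirror the example above.

```python
from collections import Counter
from datetime import date, timedelta

WINDOW_DAYS = 30                   # assumed length of the activity window
MIN_EDITS = 5                      # assumed threshold for being "active"
GEOIP_UPDATE = date(2013, 8, 29)   # day the example IP flips from Italy to United States


def lookup_country(ip, as_of):
    """Hypothetical GeoIP lookup using the database version in effect on `as_of`."""
    return "Italy" if as_of < GEOIP_UPDATE else "United States"


def edits_in_window(edits, report_day):
    """Return the (day, ip) edits that fall into the window before `report_day`."""
    start = report_day - timedelta(days=WINDOW_DAYS)
    return [(day, ip) for day, ip in edits if start <= day < report_day]


def active_countries_current(edits, report_day):
    """Reported behaviour: geocode every edit with the database as of the report day."""
    counts = Counter(
        lookup_country(ip, report_day) for _, ip in edits_in_window(edits, report_day)
    )
    return {country for country, n in counts.items() if n >= MIN_EDITS}


def active_countries_expected(edits, report_day):
    """Expected behaviour: geocode each edit with the database as of the edit day."""
    counts = Counter(
        lookup_country(ip, day) for day, ip in edits_in_window(edits, report_day)
    )
    return {country for country, n in counts.items() if n >= MIN_EDITS}


IP = "192.0.2.1"  # documentation-only address, stands in for the editor's IP
first_edits = [(date(2013, 8, 10), IP)] * 5
all_edits = first_edits + [(date(2013, 9, 5), IP)] * 5

print(active_countries_current(first_edits, date(2013, 8, 30)))
print(active_countries_expected(first_edits, date(2013, 8, 30)))
print(active_countries_expected(all_edits, date(2013, 9, 6)))
```

Under these assumptions, the first call attributes the editor to the United States after the database update, while the second keeps the attribution with Italy. The third call reports both countries once the later edits enter the window. Storing the country alongside each edit at the time it is made would be one way to get the second behaviour without keeping old database versions around.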
Prioritization and scheduling of this bug is tracked on Mingle card https://mingle.corp.wikimedia.org/projects/analytics/cards/1221