Last modified: 2014-06-14 23:01:27 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T57542, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 55542 - Double-counting of 'hide' logging
Status: RESOLVED WORKSFORME
Product: MediaWiki extensions
Classification: Unclassified
Component: GuidedTour
Version: master
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody
Reported: 2013-10-10 00:28 UTC by Matthew Flaschen
Modified: 2014-06-14 23:01 UTC (History)

Description Matthew Flaschen 2013-10-10 00:28:05 UTC
There seems to be an excessive number of 'hide' events on the firstedit tour.  It's possible they are somehow being double-counted.
Comment 1 Steven Walling 2013-10-10 00:35:01 UTC
This comes from examining the counts of each event action on the last step of the "firstedit" tour, like so:

SELECT COUNT(*), event_action
FROM GuidedTour_5222838
WHERE event_tourname = "firstedit"
AND wiki = "enwiki"
AND timestamp >= 20131009000000
AND event_step = 4
GROUP BY event_action;

This produces the following results:

175	button-click
285	complete
442	hide
286	impression

Some other steps in the tour show the same discrepancy (more 'hide' events than impressions). For example, the step 2 results:

173	button-click
345	hide
242	impression
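
As a rough check on the size of the discrepancy, the counts above can be turned into hide-to-impression ratios. This is only a sketch in Python: the dictionary layout and the helper name hide_ratio are assumptions made for illustration, and the numbers are copied from the two result sets in this comment.

```python
# Counts copied from the query results above (tour "firstedit", enwiki).
# Keys are tour step numbers; the layout is an assumption for illustration.
counts = {
    4: {"button-click": 175, "complete": 285, "hide": 442, "impression": 286},
    2: {"button-click": 173, "hide": 345, "impression": 242},
}


def hide_ratio(step_counts):
    """Return hides per impression for one step (hypothetical helper)."""
    return step_counts["hide"] / step_counts["impression"]


for step in sorted(counts):
    c = counts[step]
    excess = c["hide"] - c["impression"]
    print(f"step {step}: {excess} more hides than impressions, "
          f"ratio {hide_ratio(c):.2f}")
```

A ratio above 1 means each impression was hidden more than once on average, which would be consistent with double-counting but does not prove it by itself.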
Comment 3 Aaron Halfaker 2014-01-09 14:47:09 UTC
We do have some users who have more hide events recorded than impressions.  
(Note that user IDs have been censored for privacy.)

> SELECT event_userId, 
MIN(timestamp) AS first_event, 
SUM(event_action = "hide") as hides, 
SUM(event_action = "impression") as impressions 
FROM GuidedTour_5222838 
WHERE timestamp > "20131009" 
AND wiki = "enwiki"
GROUP BY event_userId 
HAVING SUM(event_action = "hide") > SUM(event_action = "impression") LIMIT 10;
+--------------+----------------+-------+-------------+
| event_userId | first_event    | hides | impressions |
+--------------+----------------+-------+-------------+
|     <snip>   | 20131013014124 |     6 |           5 |
|     <snip>   | 20131009001851 |     1 |           0 |
|     <snip>   | 20131011033720 |     4 |           2 |
|     <snip>   | 20131011163054 |     3 |           2 |
|     <snip>   | 20131012163322 |     2 |           1 |
|     <snip>   | 20131013060206 |     4 |           3 |
|     <snip>   | 20131013151144 |     2 |           1 |
|     <snip>   | 20131015082941 |     1 |           0 |
|     <snip>   | 20131015120432 |     5 |           3 |
|     <snip>   | 20131023143148 |     2 |           1 |
+--------------+----------------+-------+-------------+
10 rows in set (2.87 sec)

I picked out a user whose first event was well after the "20131009"
cutoff.  

> SELECT timestamp, event_action, event_tourName 
FROM GuidedTour_5222838 
WHERE event_userId = <snip> 
AND timestamp >= "20131009";
+----------------+--------------+-----------------------------+
| timestamp      | event_action | event_tourName              |
+----------------+--------------+-----------------------------+
| 20131013014124 | impression   | gettingstartedtasktoolbarve |
| 20131013014126 | hide         | gettingstartedtasktoolbarve |
| 20131013014131 | impression   | gettingstartedtasktoolbarve |
| 20131013014133 | hide         | gettingstartedtasktoolbarve |
| 20131013122419 | impression   | gettingstartedtasktoolbarve |
| 20131013122422 | hide         | gettingstartedtasktoolbarve | <--
| 20131013122427 | hide         | gettingstartedtasktoolbarve | <--
| 20131013122431 | impression   | gettingstartedtasktoolbarve |
| 20131013122433 | hide         | gettingstartedtasktoolbarve |
| 20131013122502 | impression   | gettingstartedtasktoolbarve |
| 20131013122507 | hide         | gettingstartedtasktoolbarve |
+----------------+--------------+-----------------------------+
11 rows in set (1.05 sec)

Note the two hide events occurring 5 seconds apart.  I see this sort of pattern
when I look through other users too.  We'll often have an "impression" followed
by one or more "hide"s that are separated by 5-10 seconds.
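
The pattern described above (a "hide" directly following another "hide", with no "impression" in between) can be flagged mechanically. The following is a minimal Python sketch, not the analysis that was actually run: the function name repeated_hides and the tuple layout are assumptions, and the rows are copied from the per-user result above.

```python
from datetime import datetime

FMT = "%Y%m%d%H%M%S"  # timestamp layout used in the table above

# Rows copied from the per-user query result (timestamp, event_action).
events = [
    ("20131013014124", "impression"),
    ("20131013014126", "hide"),
    ("20131013014131", "impression"),
    ("20131013014133", "hide"),
    ("20131013122419", "impression"),
    ("20131013122422", "hide"),
    ("20131013122427", "hide"),
    ("20131013122431", "impression"),
    ("20131013122433", "hide"),
    ("20131013122502", "impression"),
    ("20131013122507", "hide"),
]


def repeated_hides(rows):
    """Yield (first_ts, second_ts, gap_seconds) for every 'hide' that
    directly follows another 'hide' (hypothetical helper)."""
    prev_ts, prev_action = None, None
    # Fixed-width timestamps sort correctly as strings.
    for ts, action in sorted(rows):
        if action == "hide" and prev_action == "hide":
            gap = datetime.strptime(ts, FMT) - datetime.strptime(prev_ts, FMT)
            yield prev_ts, ts, int(gap.total_seconds())
        prev_ts, prev_action = ts, action


for first, second, gap in repeated_hides(events):
    print(f"hide at {first} followed by hide at {second} ({gap}s later)")
```

For this user it reports only the pair marked with arrows in the table. It assumes events from a single user and a single tour; mixing tours would need a grouping key.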
Comment 4 Matthew Flaschen 2014-04-01 23:23:21 UTC
This doesn't explain why the counts don't match, but I don't think https://git.wikimedia.org/blob/mediawiki%2fextensions%2fGuidedTour.git/HEAD/modules%2fext.guidedTour.lib.js#L230 should use guiders._lastCreatedGuiderID.  I don't know why I didn't notice that before.  I think it could lead to the wrong ID being used, especially with the preloading.
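
To illustrate the general hazard only: the toy model below is written in Python and is not the GuidedTour or guiders.js code. The class ToyGuiders, its attributes, and both logging functions are hypothetical, and it assumes that preloading a later step creates it before the current step is hidden.

```python
class ToyGuiders:
    """Toy stand-in for a guider registry; NOT the real guiders.js API."""

    def __init__(self):
        self.last_created_id = None  # updated on every create()
        self.visible_id = None       # updated only when a guider is shown

    def create(self, guider_id, show=False):
        self.last_created_id = guider_id
        if show:
            self.visible_id = guider_id


def log_hide_using_last_created(g):
    # Hazard: reports whichever guider was created most recently.
    return ("hide", g.last_created_id)


def log_hide_using_visible(g):
    # Reports the guider the user was actually looking at.
    return ("hide", g.visible_id)


g = ToyGuiders()
g.create("step-1", show=True)  # the step the user actually sees
g.create("step-2")             # assumed preload: created but not shown

print(log_hide_using_last_created(g))  # ('hide', 'step-2')
print(log_hide_using_visible(g))       # ('hide', 'step-1')
```

In this toy model, a hide logged from the "last created" value is attributed to the preloaded step rather than the visible one. Whether the real code behaves this way would need to be checked against the linked source.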
Comment 5 Steven Walling 2014-06-14 23:01:27 UTC
This is likely not relevant anymore, since we've switched to a new schema for this.
