Last modified: 2014-01-03 15:21:53 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T61256, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 59256 - DBQ-3 List of English Wikipedia users by number of articles they started
DBQ-3 List of English Wikipedia users by number of articles they started
Status: RESOLVED FIXED
Product: Tool Labs tools
Classification: Unclassified
Database Queries (Other open bugs)
unspecified
All All
: Unprioritized major
: ---
Assigned To: Bugzilla Bug Importer (valhallasw)
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2014-01-03 15:21 UTC by Bugzilla Bug Importer (valhallasw)
Modified: 2014-01-03 15:21 UTC (History)
0 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Bugzilla Bug Importer (valhallasw) 2014-01-03 15:21:45 UTC
This issue was converted from https://jira.toolserver.org/browse/DBQ-3.
Summary: List of English Wikipedia users by number of articles they started
Issue type: Task - A task that needs to be done.
Priority: Major
Status: Done
Assignee: Bryan Tong Minh <bryan@tools.wikimedia.de>

-------------------------------------------------------------------------------
From: Bryan Tong Minh <bryan@tools.wikimedia.de>
Date: Wed, 31 Oct 2007 13:10:20
-------------------------------------------------------------------------------

(see http://en.wikipedia.org/w/index.php?title=User_talk:Bryan&redirect=no#Heavy_duty_SQL_query_request)

I'd like to have a list of English Wikipedia users by number of articles they have started. Someone "started" an article if they make the first non-redirect edit to the article.
Comment 1 Bugzilla Bug Importer (valhallasw) 2014-01-03 15:21:46 UTC
-------------------------------------------------------------------------------
From: Daniel Kinzler <daniel@brightbyte.de>
Date: Wed, 31 Oct 2007 13:13:35
-------------------------------------------------------------------------------

sadly, finding the first revision for a given page is rather expensive. to get what you want, this would have to be done for every page, and the result then grouped by user. I can think of an efficient way to do that.
Comment 2 Bugzilla Bug Importer (valhallasw) 2014-01-03 15:21:48 UTC
-------------------------------------------------------------------------------
From: Bryan Tong Minh <bryan@tools.wikimedia.de>
Date: Thu, 01 Nov 2007 22:16:34
-------------------------------------------------------------------------------

I would come up with something like this:

SELECT lr_user, COUNT(lr_title) FROM (SELECT page_title AS lr_title, MIN(rev_id) AS lr_rev_id, rev_user_text AS lr_user FROM page, revision WHERE page_namespace = 0 AND rev_page = page_id GROUP BY rev_page) AS lowest_revision GROUP BY lr_user;

Unfortunately, that does not look like it will run very efficiently. Even an EXPLAIN takes ages.
Comment 3 Bugzilla Bug Importer (valhallasw) 2014-01-03 15:21:50 UTC
-------------------------------------------------------------------------------
From: Bryan Tong Minh <bryan@tools.wikimedia.de>
Date: Mon, 05 Nov 2007 09:20:49
-------------------------------------------------------------------------------

Final query:

mysql -hsql-s1 enwiki_p -e "SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; SELECT lr_user, COUNT(lr_title) AS lr_creation_count FROM (SELECT page_title AS lr_title, MIN(rev_id) AS lr_rev_id, rev_user_text AS lr_user FROM page, revision WHERE page_namespace = 0 AND page_is_redirect = 0 AND rev_page = page_id GROUP BY rev_page) AS lowest_revision GROUP BY lr_user HAVING lr_creation_count > 5 ORDER BY lr_creation_count DESC;" >enwiki_page_creation.txt

Approximate runtime: 30h

Results: http://tools.wikimedia.de/~bryan/stats/dbquery/enwiki_page_creation.txt
Comment 4 Bugzilla Bug Importer (valhallasw) 2014-01-03 15:21:51 UTC
-------------------------------------------------------------------------------
From: Dutchresearch <mail@basvaneijk.nl>
Date: Tue, 03 Mar 2009 15:20:30
-------------------------------------------------------------------------------

Hi i am new to the TS and i would like to have similar info. 

Could anyone help me get results for

member name | total started articles per user | total edits on articles per user | registration date | Last active date

I will use it for my thesis on UGC which i will publish CC 

i runned the SQL above on the sco.wiki took me 10s. ![][1]

but i want to adjust this SQL to get the info above

Bas

SELECT lr_user, COUNT(lr_title) AS lr_creation_count   
FROM   
(SELECT   
page_title AS lr_title,   
MIN(rev_id) AS lr_rev_id,   
rev_user_text AS lr_user   
FROM page, revision   
WHERE page_namespace = 0   
AND page_is_redirect = 0   
AND rev_page = page_id   
GROUP BY rev_page) AS lowest_revision   
GROUP BY lr_user   
HAVING lr_creation_count > 1  
ORDER BY lr_creation_count DESC

   [1]: https://jira.toolserver.org/images/icons/emoticons/smile.gif
Comment 5 Bugzilla Bug Importer (valhallasw) 2014-01-03 15:21:53 UTC
This bug was imported as RESOLVED. The original assignee has therefore not been
set, and the original reporters/responders have not been added as CC, to
prevent bugspam.

If you re-open this bug, please consider adding these people to the CC list:
Original assignee: Bryan.TongMinh@Gmail.com
CC list: Bryan.TongMinh@Gmail.com, daniel.kinzler@wikimedia.de

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links