Last modified: 2014-01-03 16:04:42 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T61413, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 59413 - DBQ-151 Need the revision histories from editors who made at least one revision or edit to (only) namespaces 0-5 during (only) the time period of April 1st, 2009 to March 31st 2010.
Status: RESOLVED FIXED
Product: Tool Labs tools
Classification: Unclassified
Component: Database Queries
Version: unspecified
Hardware: All All
Importance: Unprioritized major
Target Milestone: ---
Assigned To: Bugzilla Bug Importer (valhallasw)
Reported: 2014-01-03 16:04 UTC by Bugzilla Bug Importer (valhallasw)
Modified: 2014-01-03 16:04 UTC (History)

Description Bugzilla Bug Importer (valhallasw) 2014-01-03 16:04:32 UTC
This issue was converted from https://jira.toolserver.org/browse/DBQ-151.
Summary: Need the revision histories from editors who made at least one revision or edit to (only) namespaces 0-5 during (only) the time period of April 1st, 2009 to March 31st 2010.
Issue type: Task - A task that needs to be done.
Priority: Major
Status: Done
Assignee: Hoo man <hoo@online.de>

-------------------------------------------------------------------------------
From: Jmmalo04 <jmmalo03@gmail.com>
Date: Wed, 17 Aug 2011 01:11:00
-------------------------------------------------------------------------------

It is easiest to describe this request in two steps:

(1) I need a list of the editors who made one or more revisions to namespaces 0-5 during the period of April 1st, 2009 to March 31st 2010. If an editor did not make at least one edit to namespaces 0-5 during the aforementioned time period, they should not be included. 

(2) From this list of editors, please select (as randomly as possible) 100,000 editors. For each of these editors I need a history of all their revisions. For each revision I need the following information:  
(a) Timestamp from each revision made by the editor  
(b) The increase / decrease in number of characters compared with the previous revision of the article  
(c) Was the editor's revision reverted? (yes/no)  
(d) The namespace of the revision  
(e) The current size in number of characters of the revision  
(f) If it is available, the category of the article in the Wikimedia Taxonomy Project. 

I'm new at this, so please forgive me if my request is missing some details. I will be watching this closely, so feel free to ask me questions if anything is unclear; I will answer promptly. Thanks in advance for your help!
Comment 1 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:04:34 UTC
-------------------------------------------------------------------------------
From: Hoo man <hoo@online.de>
Date: Wed, 17 Aug 2011 16:46:25
-------------------------------------------------------------------------------

I don't think that this is feasible the way you request it. The enwiki revision table is really big, so doing this for more than one month at a time might not be doable. Furthermore, selecting user contributions for over 100k users will be too much data as well.
Comment 2 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:04:36 UTC
-------------------------------------------------------------------------------
From: Jmmalo04 <jmmalo03@gmail.com>
Date: Fri, 19 Aug 2011 07:46:29
-------------------------------------------------------------------------------

Thanks for your quick response. I've been learning to use the wiki API and that has helped a lot. However, I still need to get a list of all the users who made one or more edits to namespaces 0-5 during the time period of April 1st, 2009 to March 31st, 2010. Is this request possible? I only need a list of user names now; I should be able to do the rest myself with the API.

Thanks, Jordan
Comment 3 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:04:37 UTC
-------------------------------------------------------------------------------
From: Jmmalo04 <jmmalo03@gmail.com>
Date: Fri, 19 Aug 2011 15:52:59
-------------------------------------------------------------------------------

Also, I do not need all the names. I could accept a random (or as random as possible) subset, preferably of 100,000 names, or even just 50,000 if that's easier.
Comment 4 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:04:39 UTC
-------------------------------------------------------------------------------
From: Hoo man <hoo@online.de>
Date: Mon, 22 Aug 2011 13:13:45
-------------------------------------------------------------------------------

SQL:
    
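    INSERT /* SLOW_OK */ INTO u_hoo.dbq151 (user_name)
    SELECT DISTINCT rev_user_text
    FROM revision
    INNER JOIN page ON rev_page = page_id
    WHERE rev_user != 0                      -- registered users only (rev_user is 0 for anonymous edits)
      AND LEFT(rev_timestamp, 6) = 200904    -- YYYYMM prefix of the month being sampled
      AND page_namespace < 6                 -- namespaces 0-5
    LIMIT 10000;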

This selects up to 10,000 users and saves them to a temporary table (I ran it once for every month). Afterwards I just needed to get the distinct entries of that user list:
    
    SELECT DISTINCT * FROM u_hoo.dbq151;

Many users are in the temporary table multiple times, so of the 120,000 rows only 57,825 unique users have been selected.
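
For readers who want to reproduce the "once per month" runs, here is a minimal sketch of how the twelve statements could be generated. It is not the script that was actually used (the thread does not say how the runs were done); the helper names `month_prefixes`, `monthly_queries` and `unique_users` are made up for illustration, and only the SQL text itself comes from the comment above. Twelve months at up to 10,000 rows each gives the 120,000 rows mentioned, and merging the samples into a set shows why the unique count ends up lower.

```python
from datetime import date

# SQL text taken from the comment above; only the YYYYMM prefix varies per run.
QUERY_TEMPLATE = (
    "INSERT /* SLOW_OK */ INTO u_hoo.dbq151 (user_name) "
    "SELECT DISTINCT rev_user_text FROM revision "
    "INNER JOIN page ON rev_page = page_id "
    "WHERE rev_user != 0 AND LEFT(rev_timestamp, 6) = {prefix} "
    "AND page_namespace < 6 LIMIT 10000;"
)


def month_prefixes(start, end):
    """Return the YYYYMM prefix of every month from start to end, inclusive."""
    prefixes = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        prefixes.append(f"{year:04d}{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return prefixes


def monthly_queries(start, end):
    """Build one INSERT ... SELECT statement per month (hypothetical helper)."""
    return [QUERY_TEMPLATE.format(prefix=p) for p in month_prefixes(start, end)]


def unique_users(monthly_samples):
    """Merge per-month name samples; users active in several months count once."""
    seen = set()
    for sample in monthly_samples:
        seen.update(sample)
    return seen


# The period from the request: April 2009 through March 2010.
queries = monthly_queries(date(2009, 4, 1), date(2010, 3, 31))
```

Note that each monthly `LIMIT 10000` without an `ORDER BY` returns whichever rows the server happens to produce first, so the result is an arbitrary subset rather than a uniformly random one.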

Result:  
http://toolserver.org/~hoo/dbq/dbq-151.txt (plain text)
Comment 5 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:04:40 UTC
-------------------------------------------------------------------------------
From: Jmmalo04 <jmmalo03@gmail.com>
Date: Mon, 22 Aug 2011 16:23:45
-------------------------------------------------------------------------------

This will work! Thanks a lot, Hoo man, I appreciate it!
Comment 6 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:04:42 UTC
This bug was imported as RESOLVED. The original assignee has therefore not been
set, and the original reporters/responders have not been added as CC, to
prevent bugspam.

If you re-open this bug, please consider adding these people to the CC list:
Original assignee: hoo@online.de
CC list: hoo@online.de
