Last modified: 2014-01-03 16:11:22 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T61472, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 59472 - DBQ-189 firsttime contributors to DEWIKI in June
DBQ-189 firsttime contributors to DEWIKI in June
Status: RESOLVED FIXED
Product: Tool Labs tools
Classification: Unclassified
Database Queries (Other open bugs)
unspecified
All All
: Unprioritized major
: ---
Assigned To: Bugzilla Bug Importer (valhallasw)
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2014-01-03 16:10 UTC by Bugzilla Bug Importer (valhallasw)
Modified: 2014-01-03 16:11 UTC (History)
0 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Bugzilla Bug Importer (valhallasw) 2014-01-03 16:10:59 UTC
This issue was converted from https://jira.toolserver.org/browse/DBQ-189.
Summary: firsttime contributors to DEWIKI in June 
Issue type: Task - A task that needs to be done.
Priority: Major
Status: Done
Assignee: Hoo man <hoo@online.de>

-------------------------------------------------------------------------------
From: Gregor Martynus <gregor@martynus.net>
Date: Mon, 02 Jul 2012 19:03:53
-------------------------------------------------------------------------------

I'd need 3 things for a study of a friend of mine:

1. users that manually signed up in 2012 and have not been blocked.  
2. All pages  
3. revisions by the users from 1.

As a Sidenode, could I have a SQL dump of a small wikipedia like simple.wikipedia.org so I can build and test the queries myself, so you don't have to? Tables: user (without sensitive columns of course), ipblocks, logging, revision, page.

Could anybody run these queries against DE wiki and send me the results? I think the SQL queries should look like this:
    
    -- Users that manually signed up in 2012, flagged if they have been blockd
    
    SELECT
      user_id,
      user_name,
      log_timestamp AS signed_up_at
      ipb_timestamp AS blocked_at
    FROM
      user
    LEFT JOIN logging  ON user_id = log_user
    LEFT JOIN ipblocks ON user_id = ipb_user
    WHERE type = 'new_users' 
      AND action = 'create'
      AND log_timestamp > "2012";
    
    -- all pages
    SELECT
      page_id,
      page_name,
      page_namespace,
      page_is_redirect,
    FROM
      page;
    
    -- revisions from 2012
    SELECT
      rev_id,
      rev_page,
      rev_comment,
      rev_deleted,
      rev_user,
      rev_timestamp
    FROM
      revision
    WHERE
      rev_timestamp > "2012";
    
    
    
    
    -- THANKS A LOT
Comment 1 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:01 UTC
-------------------------------------------------------------------------------
From: Hoo man <hoo@online.de>
Date: Mon, 02 Jul 2012 19:25:43
-------------------------------------------------------------------------------

Ok, I got a few questions regarding this ticket:  
First of all, is it ok, if I exclude users which are blocked? There might be privacy issues with the names etc. Same goes for deleted revisions in 3.  
Then for the third, is it ok, if it only includes the users from 1? As your query doesn't do this atm.

Finally, no I can't dump simplewiki_p (alone the revision database will have several GiB, I guess), I'll just write the queries myself, as they are rather trivial and that should be faster for me than dumping a wiki.
Comment 2 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:03 UTC
-------------------------------------------------------------------------------
From: Gregor Martynus <gregor@martynus.net>
Date: Mon, 02 Jul 2012 19:56:44
-------------------------------------------------------------------------------

thanks a lot for your quick response!

> First of all, is it ok, if I exclude users which are blocked? There might be privacy issues with the names etc. Same goes for deleted revisions in 3.

Yes, sure, you can exclude users that are blocked and revisions that have been deleted.

> Then for the third, is it ok, if it only includes the users from 1? As your query doesn't do this atm.

Absolutely, actually I'd prefer that, I just didn't want to make the query to complex.

Thanks a lot for your help!
Comment 3 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:04 UTC
-------------------------------------------------------------------------------
From: Hoo man <hoo@online.de>
Date: Mon, 02 Jul 2012 23:06:43
-------------------------------------------------------------------------------

As told, this was an easy one, so here we go:  
SQL:
    
    --1
    SELECT user_id, user_name, user_registration FROM user INNER JOIN logging ON log_user = user_id WHERE LEFT(user_registration, 4) = 2012 AND user_id NOT IN (SELECT ipb_user FROM ipblocks) AND log_type = 'newusers' AND log_action = 'create';
    --2
    SELECT page_id, page_title, page_namespace, page_is_redirect FROM page;
    --3
    INSERT INTO u_hoo.dbq189 SELECT user_name FROM user INNER JOIN logging ON log_user = user_id WHERE LEFT(user_registration, 4) = 2012 AND user_id NOT IN (SELECT ipb_user FROM ipblocks) AND log_type = 'newusers' AND log_action = 'create';
    SELECT rev_id, rev_page, rev_comment, rev_user_text, rev_user, rev_timestamp FROM revision INNER JOIN u_hoo.dbq189 ON rev_user_text = dbq189.user_name WHERE rev_deleted = 0 AND rev_user != 0;

Result:  
http://toolserver.org/~hoo/dbq/dbq-189_0.txt (plain text)  
http://toolserver.org/~hoo/dbq/dbq-189_1.txt (plain text)  
http://toolserver.org/~hoo/dbq/dbq-189_2.txt (plain text)
Comment 4 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:06 UTC
-------------------------------------------------------------------------------
From: Gregor Martynus <gregor@martynus.net>
Date: Tue, 03 Jul 2012 23:15:07
-------------------------------------------------------------------------------

thanks Hoo man, you saved me so much work, really appreciate your help!!
Comment 5 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:07 UTC
-------------------------------------------------------------------------------
From: Gregor Martynus <gregor@martynus.net>
Date: Wed, 04 Jul 2012 16:57:05
-------------------------------------------------------------------------------

sorry to bother you once more,  
but there seem data to be missing.

Example:
    
    SELECT * FROM revisions WHERE rev_user = 1426164;

Returns 5 results. But when I open http://de.wikipedia.org/wiki/Spezial:Beitr%C3%A4ge/Johannes_Michels I see way more contributions by this user. Does that make any sense to you?
Comment 6 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:09 UTC
-------------------------------------------------------------------------------
From: Hoo man <hoo@online.de>
Date: Wed, 04 Jul 2012 19:42:00
-------------------------------------------------------------------------------

hoo@yarrow:~/public_html/dbq$ cat dbq-189_2.txt | grep -P '\t1426164\t' | wc -l
    32
    

So the problem seems to be on your side, maybe you're not importing the data properly?
Comment 7 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:11 UTC
-------------------------------------------------------------------------------
From: Gregor Martynus <gregor@martynus.net>
Date: Thu, 05 Jul 2012 00:02:40
-------------------------------------------------------------------------------

you are right, sorry. I used a MAC app, I didn't get any errors, but data was indeed missing. Now I used mysqlimport and it works. Thanks again!
Comment 8 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:12 UTC
-------------------------------------------------------------------------------
From: Gregor Martynus <gregor@martynus.net>
Date: Fri, 06 Jul 2012 00:00:44
-------------------------------------------------------------------------------

Quick question regarding reverted revisions:

I asked initially for the rev_deleted column because I thought it would be the flag for reverted revisions, but I guess that's not the case, right?

Is there any chance to get an additional flag if a revision has been reverted? If yes, could you add it to the dbq-189_2.txt dump? It would really help a lot for my friend's study
Comment 9 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:14 UTC
-------------------------------------------------------------------------------
From: Hoo man <hoo@online.de>
Date: Fri, 06 Jul 2012 00:04:14
-------------------------------------------------------------------------------

Sadly, that's not possible, rev_deleted indeed stores whether the revision has been deleted and not if it has been reverted.
Comment 10 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:15 UTC
-------------------------------------------------------------------------------
From: Gregor Martynus <gregor@martynus.net>
Date: Fri, 06 Jul 2012 00:07:10
-------------------------------------------------------------------------------

Thanks for the quick response!

I'm just wondering: how can a revision be highlighted as reverted, when there is now such flag in the database? Is there a trick I could use myself to identify revisions that have been reverted?
Comment 11 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:17 UTC
-------------------------------------------------------------------------------
From: Hoo man <hoo@online.de>
Date: Fri, 06 Jul 2012 00:11:25
-------------------------------------------------------------------------------

The only way I see at the moment would be to get follow-up revisions and check whether they're reverts, getting those shouldn't be to complicated, but detecting reverts will be more hard. And the data gathered that way wont be fully reliable either, as there are various cases in those you wont be able to detect reverts that way.
Comment 12 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:18 UTC
-------------------------------------------------------------------------------
From: Gregor Martynus <gregor@martynus.net>
Date: Fri, 06 Jul 2012 00:13:55
-------------------------------------------------------------------------------

okay, thanks a lot for your instant help!
Comment 13 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:20 UTC
-------------------------------------------------------------------------------
From: Gregor Martynus <gregor@martynus.net>
Date: Thu, 06 Sep 2012 23:22:31
-------------------------------------------------------------------------------

May I ask you to run the same 3 queries from the comment above and provide me the results? Thanks for your help!
Comment 14 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:22 UTC
This bug was imported as RESOLVED. The original assignee has therefore not been
set, and the original reporters/responders have not been added as CC, to
prevent bugspam.

If you re-open this bug, please consider adding these people to the CC list:
Original assignee: hoo@online.de
CC list: hoo@online.de

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links