Last modified: 2014-01-03 16:11:45 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T61474, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 59474 - DBQ-191 Number of articles with files on Wikiquote
DBQ-191 Number of articles with files on Wikiquote
Status: RESOLVED FIXED
Product: Tool Labs tools
Classification: Unclassified
Database Queries (Other open bugs)
unspecified
All All
: Unprioritized major
: ---
Assigned To: Bugzilla Bug Importer (valhallasw)
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2014-01-03 16:11 UTC by Bugzilla Bug Importer (valhallasw)
Modified: 2014-01-03 16:11 UTC (History)
0 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:34 UTC
This issue was converted from https://jira.toolserver.org/browse/DBQ-191.
Summary: Number of articles with files on Wikiquote
Issue type: Task - A task that needs to be done.
Priority: Major
Status: Done
Assignee: Hoo man <hoo@online.de>

-------------------------------------------------------------------------------
From: Federico Leva <federicoleva@tiscali.it>
Date: Mon, 16 Jul 2012 13:43:49
-------------------------------------------------------------------------------

I'd like to know how many articles have at least an embedded image on each Wikiquote, or at least it and en.quote. WikiStats gives only the total number of image links. Thanks.
Comment 1 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:36 UTC
-------------------------------------------------------------------------------
From: Liangent <liangent@gmail.com>
Date: Sat, 28 Jul 2012 22:06:56
-------------------------------------------------------------------------------

How is "article" defined?
Comment 2 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:37 UTC
-------------------------------------------------------------------------------
From: Federico Leva <federicoleva@tiscali.it>
Date: Sat, 28 Jul 2012 22:12:39
-------------------------------------------------------------------------------

Pages in namespace 0.  
There's no need to worry about redirects and definitions because pages with an image are "good articles" according to every definition, and I'm not asking the total count of articles.  
Thank you!
Comment 3 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:39 UTC
-------------------------------------------------------------------------------
From: Liangent <liangent@gmail.com>
Date: Sat, 28 Jul 2012 22:58:59
-------------------------------------------------------------------------------

http://toolserver.org/~liangent/-/dbq/191

Main pages are often counted as articles.
Comment 4 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:40 UTC
-------------------------------------------------------------------------------
From: Liangent <liangent@gmail.com>
Date: Sat, 28 Jul 2012 22:59:52
-------------------------------------------------------------------------------

$ for wiki in `mysql -BN -h sql -e "SELECT dbname FROM toolserver.wiki WHERE domain IS NOT NULL AND family = 'wikiquote';"`  
> do  
> echo -n $wiki ' ' | tee -a 191  
> mysql -BcN -h ${wiki/_/-}.rrdb $wiki -e 'SELECT COUNT(DISTINCT page_id) FROM page JOIN imagelinks ON il_from = page_id WHERE page_namespace = 0' | tee -a 191  
> done
Comment 5 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:42 UTC
-------------------------------------------------------------------------------
From: Federico Leva <federicoleva@tiscali.it>
Date: Sun, 29 Jul 2012 11:10:26
-------------------------------------------------------------------------------

Thank you very much.  
However, there's an important thing we've forgotten, that is that only images embedded directly on the page must be considered, not images embedded by templates (icons and such): can those be filtered? An idea would be to exclude images which have imagelinks from the template page itself orr, if this is not reliable enough, all images which have more than e.g. 5 imagelinks in the wiki.  
(The reopen button is not working in Firefox for me.)
Comment 6 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:43 UTC
-------------------------------------------------------------------------------
From: Hoo man <hoo@online.de>
Date: Tue, 02 Oct 2012 00:05:01
-------------------------------------------------------------------------------

I've did it as suggested now, so that only images that are linked less than 5 times are taken into account.  
Script:
    
    #!/bin/bash
    wikilist=$(mysql -wBN -h sql -e "SELECT dbname FROM toolserver.wiki WHERE domain IS NOT NULL AND family = 'wikiquote';");
    SQL="SELECT DATABASE(), COUNT(DISTINCT page_id) FROM page JOIN (SELECT il_from FROM imagelinks GROUP BY il_from HAVING COUNT(*) < 5) AS img_links ON il_from = page_id WHERE page_namespace = 0";
    echo -e "wiki\tarticle_count" > dbq-191.txt;
    for runwiki in $wikilist
    do
    	echo "querying $runwiki"
    	sql -r $runwiki "$SQL" | tail -n +2 >> dbq-191.txt
    done

Result:  
http://toolserver.org/~hoo/dbq/dbq-191.txt
Comment 7 Bugzilla Bug Importer (valhallasw) 2014-01-03 16:11:45 UTC
This bug was imported as RESOLVED. The original assignee has therefore not been
set, and the original reporters/responders have not been added as CC, to
prevent bugspam.

If you re-open this bug, please consider adding these people to the CC list:
Original assignee: hoo@online.de
CC list: federicoleva@tiscali.it, hoo@online.de, liangent@gmail.com

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links