Last modified: 2014-01-03 16:12:54 UTC
This issue was converted from https://jira.toolserver.org/browse/DBQ-201.

Summary: Find Wikimedia Commons files without license
Issue type: Task - A task that needs to be done.
Priority: Major
Status: Done
Assignee: Tim.Landscheidt <tim@tim-landscheidt.de>

-------------------------------------------------------------------------------
From: Jarek Tuszynski <jaroslaw.w.tuszynski@saic.com>
Date: Tue, 12 Mar 2013 19:41:37
-------------------------------------------------------------------------------

Could someone run a query to find images without any of the following templates:

    License template tag
    PD-Layout
    GNU-Layout
    CC-Layout
    no license
    Delete
    Speedydelete

The resulting files do not transclude any of the standard license templates and are not tagged as missing a license or as nominated for deletion. If there is a large number of them, we can limit the search to 10k files for now. I tested the query on smaller sets of files using CatScan2, which unfortunately has no limit on the number of files and times out.

-------------------------------------------------------------------------------
From: Jarek Tuszynski <jaroslaw.w.tuszynski@saic.com>
Date: Tue, 02 Apr 2013 13:17:20
-------------------------------------------------------------------------------

In the meantime, while waiting for this query, I ran several smaller queries with CatScan2 within medium-sized categories. So far I have identified ~3.5k files and added them to http://commons.wikimedia.org/wiki/Category:Media_without_a_license:_needs_history_check for processing. So ideally the query would look for all files that are missing the templates listed above and are not in Category:Media_without_a_license:_needs_history_check.

-------------------------------------------------------------------------------
From: Jarek Tuszynski <jaroslaw.w.tuszynski@saic.com>
Date: Wed, 22 May 2013 19:44:48
-------------------------------------------------------------------------------

The query would be:

    select /* SLOW_OK */ page_title
    from page
    where page_is_redirect=0
      and page_namespace=6
      and not exists (
        select * from templatelinks
        where tl_from=page_id
          and tl_namespace=10
          and tl_title in ("License_template_tag","PD-Layout","GNU-Layout","CC-Layout","No_license","Delete","Speedydelete")
        limit 1
      )

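
For anyone who wants to try the anti-join pattern locally before running it against the replicas, here is a minimal sketch in Python with an in-memory SQLite database. It is not the query that was actually run. The toy tables contain only the columns the query touches, the sample rows are made up, and the helper names `build_demo_db` and `find_unlicensed` are hypothetical. The second `NOT EXISTS` on `categorylinks` (`cl_from`, `cl_to`) and the `LIMIT` are assumptions added to cover the category exclusion and the 10k cap requested earlier in this thread.

```python
import sqlite3

# Template titles taken from the query in this thread.
LICENSE_TEMPLATES = [
    "License_template_tag", "PD-Layout", "GNU-Layout", "CC-Layout",
    "No_license", "Delete", "Speedydelete",
]
# Category named in the thread; stored without the "Category:" prefix.
REVIEW_CATEGORY = "Media_without_a_license:_needs_history_check"


def build_demo_db():
    """Create a toy database with made-up rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER,
                           page_title TEXT, page_is_redirect INTEGER);
        CREATE TABLE templatelinks (tl_from INTEGER, tl_namespace INTEGER,
                                    tl_title TEXT);
        CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
    """)
    conn.executemany("INSERT INTO page VALUES (?, ?, ?, ?)", [
        (1, 6, "A.jpg", 0),      # has a license layout template
        (2, 6, "B.jpg", 0),      # no templates at all
        (3, 6, "C.jpg", 0),      # only an unrelated template
        (4, 6, "D.jpg", 0),      # already in the review category
        (5, 6, "E.jpg", 1),      # redirect
        (6, 0, "Main_Page", 0),  # not in the File namespace
    ])
    conn.executemany("INSERT INTO templatelinks VALUES (?, ?, ?)", [
        (1, 10, "CC-Layout"),
        (3, 10, "Information"),
    ])
    conn.execute("INSERT INTO categorylinks VALUES (?, ?)", (4, REVIEW_CATEGORY))
    return conn


def find_unlicensed(conn, limit=10000):
    """Return titles of non-redirect files with none of the listed templates
    that are not already in the review category."""
    placeholders = ",".join("?" * len(LICENSE_TEMPLATES))
    sql = f"""
        SELECT page_title
        FROM page
        WHERE page_is_redirect = 0
          AND page_namespace = 6
          AND NOT EXISTS (
              SELECT 1 FROM templatelinks
              WHERE tl_from = page.page_id
                AND tl_namespace = 10
                AND tl_title IN ({placeholders})
          )
          AND NOT EXISTS (
              SELECT 1 FROM categorylinks
              WHERE cl_from = page.page_id
                AND cl_to = ?
          )
        ORDER BY page_title
        LIMIT ?
    """
    params = [*LICENSE_TEMPLATES, REVIEW_CATEGORY, limit]
    return [row[0] for row in conn.execute(sql, params)]
```

Calling `find_unlicensed(build_demo_db())` runs the query on the toy rows. On the real database the same SQL would still need the `/* SLOW_OK */` hint, and the `ORDER BY` is there only to make the toy output deterministic.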
-------------------------------------------------------------------------------
From: Tim.Landscheidt <tim@tim-landscheidt.de>
Date: Wed, 22 May 2013 22:05:45
-------------------------------------------------------------------------------

Run on Tools.

-------------------------------------------------------------------------------
From: Jarek Tuszynski <jaroslaw.w.tuszynski@saic.com>
Date: Thu, 23 May 2013 13:03:24
-------------------------------------------------------------------------------

Thanks a lot. Now I just have to process the results :)

This bug was imported as RESOLVED. The original assignee has therefore not been set, and the original reporters/responders have not been added as CC, to prevent bugspam. If you re-open this bug, please consider adding these people to the CC list:

Original assignee: tim@tim-landscheidt.de
CC list: jaroslaw.w.tuszynski@leidos.com, tim@tim-landscheidt.de