Last modified: 2014-01-03 15:23:20 UTC
This issue was converted from https://jira.toolserver.org/browse/DBQ-11.
Summary: Pages for users that don't exist
Issue type: Task - A task that needs to be done.
Priority: Major
Status: Done
Assignee: Autocracy <jeff@storyinmemo.com>
-------------------------------------------------------------------------------
From: MZMcBride <mzmcbride@gmail.com>
Date: Wed, 06 Feb 2008 22:27:42
-------------------------------------------------------------------------------
If possible, I would like to get a list of all pages in NS:2 and NS:3 on en.wiki that are not redirects and that do not correspond to a registered user. For example, if a page existed for [[User talk:MZMcbride]] that someone accidentally (or intentionally) created, it would be listed in the results of the query because User:MZMcbride does not exist on en.wiki.

Also, if possible, it would list subpages if their roots do not exist. So, if a page like [[User:MZMcbride/sandbox]] existed, it would be listed in the results of the query.

Thanks!
-------------------------------------------------------------------------------
From: Kylu <kylu@ts.wikimedia.org>
Date: Thu, 07 Feb 2008 06:19:37
-------------------------------------------------------------------------------
Exceptions:
* IP pages (no registered user) should be excluded, yet "IP" pages that use invalid octets should still be shown.
* Pages of nonregistered users that are redirects are useful, and a list of them should also be generated. Instead of simply creating a redirect page, users should be encouraged to either:
  1) register the account as a doppelganger, or
  2) spell other users' names correctly :)
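The first exception above depends on telling a well-formed IP address apart from an "IP" title with invalid octets. A minimal sketch of that check in Python, assuming only IPv4 dotted-quad titles matter; the function name `classify_ip_title` is hypothetical and does not come from this thread:

```python
import re

# Four dot-separated runs of 1-3 ASCII digits: the title "looks like" IPv4.
_DOTTED_QUAD = re.compile(r"([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})")


def classify_ip_title(title: str) -> str:
    """Classify a root page title as 'valid_ip', 'invalid_ip', or 'not_ip'.

    Hypothetical helper: 'invalid_ip' means the title has the shape of an
    IPv4 address but at least one octet is greater than 255.
    """
    match = _DOTTED_QUAD.fullmatch(title)
    if match is None:
        return "not_ip"
    if all(int(octet) <= 255 for octet in match.groups()):
        return "valid_ip"
    return "invalid_ip"
```

Under this sketch, titles classified as `valid_ip` would be dropped from the report, while `invalid_ip` and `not_ip` titles would still be checked against the list of registered users.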
-------------------------------------------------------------------------------
From: Bryan Tong Minh <bryan@tools.wikimedia.de>
Date: Thu, 07 Feb 2008 19:47:16
-------------------------------------------------------------------------------
I executed the following query:

    SELECT CONCAT('* [[{{subst:ns:', page_namespace, '}}:', page_title, ']]')
    FROM page
    LEFT JOIN user ON page_title = user_name
    WHERE page_namespace IN (2, 3)
      AND user_name IS NULL
      AND INSTR(page_title, '/') = 0;

It returned more results than I expected, however; uncompressed, the file is almost 70 MB. Anyway, see the results here: http://tools.wikimedia.de/~bryan/stats/dbquery/no_user_name.txt.bz2
-------------------------------------------------------------------------------
From: MZMcBride <mzmcbride@gmail.com>
Date: Fri, 08 Feb 2008 00:30:04
-------------------------------------------------------------------------------
It seems that the list generated includes both redirects and user pages of users that exist. For example, it includes pages like [[User:JJ the Crusader]] and http://en.wikipedia.org/w/index.php?title=User%3A%D0%A1%D0%B0%D1%81%D1%83%D1%81l%D0%B5&redirect=no.
-------------------------------------------------------------------------------
From: Bryan Tong Minh <bryan@tools.wikimedia.de>
Date: Fri, 08 Feb 2008 13:06:04
-------------------------------------------------------------------------------
Oh yes, I forgot... page_title has spaces replaced by underscores, while user_name doesn't. Running a new query and filtering out redirects.
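The mismatch described above (underscores in page_title, spaces in user_name) can be illustrated with a small, self-contained sketch. It is not the query that was actually run on the Toolserver: it uses an in-memory SQLite database with toy rows, and it assumes the MediaWiki column `page_is_redirect` for the redirect filter. The helper names `build_demo_db` and `orphaned_user_pages` are hypothetical. The join compares the root of each title (the part before the first "/") with underscores turned back into spaces, so subpages are judged by their root user.

```python
import sqlite3

# Sketch only: normalize the page title's root before joining on user_name.
QUERY = """
SELECT page_namespace, page_title
FROM page
LEFT JOIN user
  ON user_name = REPLACE(
       CASE WHEN INSTR(page_title, '/') > 0
            THEN SUBSTR(page_title, 1, INSTR(page_title, '/') - 1)
            ELSE page_title
       END, '_', ' ')
WHERE page_namespace IN (2, 3)
  AND page_is_redirect = 0
  AND user_name IS NULL
ORDER BY page_namespace, page_title
"""


def build_demo_db() -> sqlite3.Connection:
    """Create a toy database with made-up rows (not real en.wiki data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE user (user_name TEXT PRIMARY KEY);
        CREATE TABLE page (
            page_namespace   INTEGER,
            page_title       TEXT,
            page_is_redirect INTEGER
        );
    """)
    conn.executemany("INSERT INTO user VALUES (?)",
                     [("JJ the Crusader",), ("MZMcBride",)])
    conn.executemany("INSERT INTO page VALUES (?, ?, ?)", [
        (2, "JJ_the_Crusader", 0),      # registered user: not listed
        (2, "MZMcbride/sandbox", 0),    # root is not a registered user: listed
        (3, "MZMcbride", 0),            # not a registered user: listed
        (2, "Old_name", 1),             # redirect: not listed
        (0, "Main_Page", 0),            # outside namespaces 2 and 3
        (3, "MZMcBride/Archive_1", 0),  # root is registered: not listed
    ])
    return conn


def orphaned_user_pages(conn: sqlite3.Connection) -> list:
    """Return (namespace, title) rows whose root user is not registered."""
    return conn.execute(QUERY).fetchall()
```

With the toy rows above, `orphaned_user_pages(build_demo_db())` returns `[(2, 'MZMcbride/sandbox'), (3, 'MZMcbride')]`. Note that wrapping page_title in functions inside the join condition probably prevents the database from using an index for the join, which may be one reason a query of this shape is expensive on a table the size of en.wiki's.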
-------------------------------------------------------------------------------
From: Bryan Tong Minh <bryan@tools.wikimedia.de>
Date: Sun, 10 Feb 2008 19:14:39
-------------------------------------------------------------------------------
Query is too heavy. I will consider doing this query again once everything is stable again.
-------------------------------------------------------------------------------
From: MZMcBride <mzmcbride@gmail.com>
Date: Wed, 13 Feb 2008 00:29:51
-------------------------------------------------------------------------------
Hmm... seems this query is simply too heavy. Could I get a list of all pages (subpages included) in NS:2 and NS:3 on en.wiki instead? It can be in any format, though it may be preferable to split the files. Whichever way works.
-------------------------------------------------------------------------------
From: Autocracy <jeff@storyinmemo.com>
Date: Mon, 10 Mar 2008 13:09:34
-------------------------------------------------------------------------------
Spoke with the requester; he says he got what he needed in the end. Moving to "Resolved."
-------------------------------------------------------------------------------
From: Autocracy <jeff@storyinmemo.com>
Date: Mon, 10 Mar 2008 13:10:09
-------------------------------------------------------------------------------
User got the list he requested in the end. This is resolved.
This bug was imported as RESOLVED. The original assignee has therefore not been set, and the original reporters/responders have not been added as CC, to prevent bugspam. If you re-open this bug, please consider adding these people to the CC list: Original assignee: bugzilla@storyinmemo.com CC list: b@mzmcbride.com, Bryan.TongMinh@Gmail.com, bugzilla@storyinmemo.com