Last modified: 2012-11-06 11:40:25 UTC
In kshwiki, we seem to have an issue with the article count:

  10635 non-redirect pages in namespace 0 of the `page` table
  10596 pages according to the query in the maintenance script [1]
   9972 shown in Special:Statistics

[1] http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/maintenance/updateArticleCount.inc?revision=50942&view=markup

Here are the queries made on the toolserver database, with results:

mysql> use kshwiki_p ;

mysql> SELECT count( `page_id` ) FROM `page` \
       WHERE ( `page_namespace` ) = 0 \
       AND ( `page_is_redirect` = 0 ) ;
+--------------------+
| count( `page_id` ) |
+--------------------+
|              10635 |
+--------------------+
1 row in set (10.90 sec)

mysql> SELECT COUNT(DISTINCT page_namespace, page_title) \
       AS pagecount FROM `page` , `pagelinks` \
       WHERE `pl_from` = `page_id` AND `page_namespace` = 0 \
       AND `page_is_redirect` = 0 AND `page_len` > 0 ;
+-----------+
| pagecount |
+-----------+
|     10596 |
+-----------+
1 row in set (2.93 sec)

mysql> SELECT `ss_good_articles` FROM `site_stats` ;
+------------------+
| ss_good_articles |
+------------------+
|             9972 |
+------------------+
1 row in set (0.00 sec)

Imho, the difference is likely caused, at least in part, by a software update. The old parser accepted some comments inside redirect pages, which the new parser does not. Thus, some existing pages of this kind were not included in the good pages count with the old parser. Now, when we detect them, we correct them, the new parser sees a non-redirect becoming a redirect, and it decrements the count of good pages. This may well have happened ~620 times; at least that appears to be a very reasonable figure.

Another issue which I observed several months ago and failed to report: in order to duplicate two pages, including their edit history, for a page split, either I exported and re-imported them with new page titles, or I exported, renamed, and re-imported them with the original page titles. This did not increase the page count.
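For reference, the gaps between the three figures reported above can be tallied with a few lines of Python. This is only arithmetic on the numbers quoted in this comment; the variable names are made up for illustration.

```python
# Figures quoted in this report (kshwiki).
non_redirect_pages = 10635   # namespace-0 non-redirects in the page table
maintenance_query = 10596    # result of the maintenance script's query
special_statistics = 9972    # ss_good_articles, as shown in Special:Statistics

# Differences between each pair of counts.
gap_script_vs_stats = maintenance_query - special_statistics
gap_table_vs_script = non_redirect_pages - maintenance_query
gap_table_vs_stats = non_redirect_pages - special_statistics

print("maintenance query vs. Special:Statistics:", gap_script_vs_stats)
print("page table vs. maintenance query:        ", gap_table_vs_script)
print("page table vs. Special:Statistics:       ", gap_table_vs_stats)
```

The first gap comes to 624, which is presumably the "~620" mentioned above. The second gap (39) is pages that are non-redirects in namespace 0 but either have no outgoing links or have a length of 0, since those are the extra conditions in the maintenance query.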
Now, the "active users" count in the ksh Wikipedia became -1, while the "good articles" count was 106xx, at least for a short time. Indeed, the above diagnosis of the parser difference and its results is correct. With the problem diagnosed, and a newly made "redirect" pywikipediabot page generator available, we are currently running a bot to fix all those problem redirects. It made the "good articles" counter fall below zero some time around the transition from July 31st to August 1st (UTC), +/- an hour, I believe. This seems to have caused the "good articles" counter to be re-evaluated, and the "active users" count to be set to -1 at the same time. (It is neither useful nor necessary to correct the statistics manually while the bot is still running; I'll file an extra bug when it is done.)
See also bug 20017
See also bug 10834
The practical impact on kshwiki is addressed in Bug 20143, now that the bot has fixed the problematic pages. This may mean that this bug should be closed, but a possibly better solution would have been to add the fix to the site update script when switching to the new parser.

Here is the bot's command line:

python pywikipedia/replace.py -v -pt:6 -log -regex -nocase -always \
    -redirectonly:! -query:500 -summary:"Fix redirects for new parser" \
    '^(#redirect[^]|]+)\|' '\1]]\n\n|'

Note: This command line does not remove comments from the 1st argument of the redirect. We did not have any, but they may cause trouble, too. The same holds for comments between "redirect" and the opening "[[" of the redirect target.

Note also: This needs to be adapted to each localized version of the magic word "redirect", plus the generic one, for wikis that have any.
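To illustrate what the substitution in that command line does to a page, here is a minimal sketch using Python's `re` module with the same pattern and replacement. It assumes replace.py applies the pattern roughly like `re.sub` with case-insensitive matching (because of -nocase); the function name `fix_redirect` and the sample page texts are made up for this example and are not taken from kshwiki.

```python
import re

# Same pattern and replacement as on the bot's command line.
# IGNORECASE stands in for the -nocase option (assumption).
PATTERN = re.compile(r'^(#redirect[^]|]+)\|', re.IGNORECASE)
REPLACEMENT = r'\1]]\n\n|'

def fix_redirect(text):
    """Close the redirect link at the first pipe and push the rest below it."""
    return PATTERN.sub(REPLACEMENT, text)

# Hypothetical redirect with a comment after a pipe inside the link.
broken = "#REDIRECT [[Target|<!-- note -->]]"
print(repr(fix_redirect(broken)))

# A plain redirect contains no pipe, so it is left untouched.
plain = "#REDIRECT [[Target]]"
print(repr(fix_redirect(plain)))
```

The pattern captures everything from "#redirect" up to the first "|", then re-emits it followed by "]]" and a blank line, so the redirect target becomes a clean link and whatever followed the pipe ends up as ordinary text further down the page. As noted above, a comment inside the target itself, or between "redirect" and "[[", is not touched by this.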
r88250 changes Special:Undelete to "Only increment the page count if the page has been created; also simplified a bit the code". Might help fixing the issue :)
It now says 2,626 articles vs. 2,681 found by a Toolserver query; I'm calling this fixed by the several recent counting method fixes. Please open new, more specific bugs if other issues arise or remain.