Last modified: 2014-02-19 05:31:12 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T40085, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 38085 - Statistic page not entirely correct
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Special pages
Version: 1.23.0
Hardware: All All
Importance: Low normal with 1 vote
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2012-06-29 15:57 UTC by Andrij
Modified: 2014-02-19 05:31 UTC (History)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Andrij 2012-06-29 15:57:31 UTC
Could anybody explain to me whether Ukrainian Wikipedia has reached 10 M edits or not? According to http://uk.wikipedia.org/wiki/%D0%A1%D0%BF%D0%B5%D1%86%D1%96%D0%B0%D0%BB%D1%8C%D0%BD%D0%B0:%D0%A1%D1%82%D0%B0%D1%82%D0%B8%D1%81%D1%82%D0%B8%D0%BA%D0%B0 - yes. But according to https://uk.wikipedia.org/w/index.php?oldid=10000000 - not yet (only 9.82 M). Which statistic is true and which is false? I think there is no reason to have a true and a false statistic at the same time. The true one is preferable.
Comment 1 Andrij 2012-06-29 16:57:53 UTC
*** This bug has been confirmed by popular vote. ***
Comment 2 Chad H. 2012-07-03 11:29:50 UTC
(In reply to comment #0)
> Could anybody explain to me whether Ukrainian Wikipedia has reached 10 M edits
> or not? According to
> http://uk.wikipedia.org/wiki/%D0%A1%D0%BF%D0%B5%D1%86%D1%96%D0%B0%D0%BB%D1%8C%D0%BD%D0%B0:%D0%A1%D1%82%D0%B0%D1%82%D0%B8%D1%81%D1%82%D0%B8%D0%BA%D0%B0
> - yes. But according to https://uk.wikipedia.org/w/index.php?oldid=10000000 -
> not yet (only 9.82 M). Which statistic is true and which is false? I think
> there is no reason to have a true and a false statistic at the same time. The
> true one is preferable.

This is a long-known issue with Special:Statistics. Its numbers are not perfectly accurate, and never have been. The lower number is the correct one here.

This is probably a duplicate bug.
Comment 3 Andrij 2012-07-03 11:51:06 UTC
Thank you for the reply. Then what's the reason for displaying incorrect numbers on the Statistics page? The correct one would be preferable, wouldn't it?
Comment 4 Chad H. 2012-07-03 11:55:55 UTC
(In reply to comment #3)
> Thank you for the reply. Then what's the reason for displaying incorrect
> numbers on the Statistics page? The correct one would be preferable, wouldn't it?

Yes, but the number isn't necessarily easy to get.

For large tables (like revision), it's inefficient to do COUNT(*) to get that number. So we estimate.

Other counts like "number of pages" are handled via the site_stats table, which is subject to inaccuracies for other reasons.
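
A minimal sketch of the trade-off described above, not MediaWiki's actual code: a cached counter is cheap to read but drifts from the exact COUNT(*) whenever some write path forgets to update it. The schema is a simplified stand-in (only `ss_total_edits` and `rev_id` mirror real column names), and the helper functions and the `bump_counter` flag are hypothetical.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a simplified revision/site_stats pair."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_text TEXT);
        CREATE TABLE site_stats (ss_total_edits INTEGER NOT NULL);
        INSERT INTO site_stats VALUES (0);
        """
    )
    return conn


def save_edit(conn, text, bump_counter=True):
    """Store a revision; bump_counter=False imitates a code path that skips the counter."""
    conn.execute("INSERT INTO revision (rev_text) VALUES (?)", (text,))
    if bump_counter:
        conn.execute("UPDATE site_stats SET ss_total_edits = ss_total_edits + 1")


def cached_total(conn):
    """Cheap read of the stored counter (one row)."""
    return conn.execute("SELECT ss_total_edits FROM site_stats").fetchone()[0]


def exact_total(conn):
    """Exact count; on a very large table this scan is the expensive part."""
    return conn.execute("SELECT COUNT(*) FROM revision").fetchone()[0]


def recount(conn):
    """Reset the cached counter from the exact count."""
    conn.execute(
        "UPDATE site_stats SET ss_total_edits = (SELECT COUNT(*) FROM revision)"
    )


if __name__ == "__main__":
    conn = build_demo()
    for i in range(5):
        save_edit(conn, f"edit {i}")
    # Two revisions arrive through a path that does not touch the counter.
    save_edit(conn, "imported 1", bump_counter=False)
    save_edit(conn, "imported 2", bump_counter=False)
    print(cached_total(conn), exact_total(conn))
    recount(conn)
    print(cached_total(conn), exact_total(conn))
```

In this toy setup the two numbers disagree until `recount` is run, which is the kind of repair a full recount script would perform.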
Comment 5 Anatoliy Goncharov 2012-07-03 12:06:50 UTC
Well, but why can't we calculate it a different way?

For example, as SELECT MAX(cur_id) FROM cur. It would be a more accurate value than COUNT(*) FROM revision.
Comment 6 Antoine "hashar" Musso (WMF) 2012-07-03 12:10:21 UTC
Maybe we could run `initStats.php --active --noviews --update` on that wiki?
Comment 7 Andrij 2012-07-24 11:11:59 UTC
If this could fix the bug, maybe.
Comment 8 Dan Jacobson 2014-02-01 02:54:21 UTC
On http://abj.jidanni.org/index.php?title=Special:Statistics
which uses a super fresh, up-to-date MediaWiki,
we note
Content pages: 14
Pages (All pages in the wiki, including talk pages, redirects, etc.): 21

Now we click
http://abj.jidanni.org/index.php?title=Special:AllPages
which is linked to the words "Content pages",
and lo and behold, there are three columns of 22 items each.
Actually the first column is 23 items.
That makes 23+22+22 = 67, which is very much more than 14.
This is a blatant bug, no?
Comment 9 Bawolff (Brian Wolff) 2014-02-01 03:02:00 UTC
(In reply to comment #5)
> Well, but why can't we calculate it a different way?
> 
> For example, as SELECT MAX(cur_id) FROM cur. It would be a more accurate value
> than COUNT(*) FROM revision.

cur is no longer the name of that table. As for your general point, that number would include pages that have since been deleted, because id numbers do not generally get reused.
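
A small sketch of why the highest id overstates the count, assuming an auto-incrementing id that is never reused. The `page` table here is a simplified stand-in, and `max_id_vs_count` is a hypothetical helper, not anything from MediaWiki.

```python
import sqlite3


def max_id_vs_count(num_pages, deleted_ids):
    """Return (MAX(page_id), COUNT(*)) after creating, deleting, and adding pages."""
    conn = sqlite3.connect(":memory:")
    # AUTOINCREMENT makes SQLite hand out strictly increasing ids, never reusing old ones.
    conn.execute(
        "CREATE TABLE page (page_id INTEGER PRIMARY KEY AUTOINCREMENT, page_title TEXT)"
    )
    conn.executemany(
        "INSERT INTO page (page_title) VALUES (?)",
        [(f"Page {i}",) for i in range(num_pages)],
    )
    conn.executemany(
        "DELETE FROM page WHERE page_id = ?", [(i,) for i in deleted_ids]
    )
    # A page created after the deletions still receives a fresh, higher id.
    conn.execute("INSERT INTO page (page_title) VALUES ('New page')")
    max_id = conn.execute("SELECT MAX(page_id) FROM page").fetchone()[0]
    count = conn.execute("SELECT COUNT(*) FROM page").fetchone()[0]
    return max_id, count


if __name__ == "__main__":
    print(max_id_vs_count(10, [3, 7, 10]))
```

With ten pages, three deletions, and one new page, the highest id is 11 while only 8 rows remain, so MAX(id) counts every row ever created rather than the rows that exist now.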

(In reply to comment #8)
> On http://abj.jidanni.org/index.php?title=Special:Statistics
> which uses a super fresh, up-to-date MediaWiki,
> we note
> Content pages: 14
> Pages (All pages in the wiki, including talk pages, redirects, etc.): 21
> 
> Now we click
> http://abj.jidanni.org/index.php?title=Special:AllPages
> which is linked to the words "Content pages",
> and lo and behold, there are three columns of 22 items each.
> Actually the first column is 23 items.
> That makes 23+22+22 = 67, which is very much more than 14.
> This is a blatant bug, no?

Content pages are only pages in the main namespace that contain links (see [[Manual:$wgArticleCountMethod]]). Perhaps some of your pages do not meet that definition of a content page.
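
As a rough illustration of that definition, here is a sketch of a "link"-style counting rule next to an "any"-style rule. It is an approximation under stated assumptions: the `Page` class and `is_content_page` are hypothetical, a page "has a link" here if its text contains `[[`, redirects are assumed not to count, and the real behaviour depends on the wiki's configuration.

```python
from dataclasses import dataclass

MAIN_NAMESPACE = 0  # assumed to be the only content namespace in this sketch


@dataclass
class Page:
    title: str
    namespace: int
    text: str
    is_redirect: bool = False


def is_content_page(page, method="link"):
    """Approximate the counting rule: main namespace, not a redirect, and
    (for method="link") at least one wikilink in the text."""
    if page.namespace != MAIN_NAMESPACE or page.is_redirect:
        return False
    if method == "any":
        return True
    if method == "link":
        return "[[" in page.text
    raise ValueError(f"unknown method: {method}")


def count_content_pages(pages, method="link"):
    """Count the pages that qualify as content pages under the given method."""
    return sum(1 for page in pages if is_content_page(page, method))


if __name__ == "__main__":
    pages = [
        Page("Plain essay", 0, "A long text without a single wikilink."),
        Page("Linked article", 0, "See [[Another page]] for details."),
        Page("Talk:Linked article", 1, "Discussion mentioning [[Another page]]."),
        Page("Old name", 0, "#REDIRECT [[Linked article]]", is_redirect=True),
    ]
    print(len(pages), count_content_pages(pages, "link"), count_content_pages(pages, "any"))
```

Under the "link" rule only one of the four sample pages counts, and under "any" two do, which shows how a listing of all pages can be much longer than the "Content pages" figure.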
Comment 10 Dan Jacobson 2014-02-01 03:21:05 UTC
Well,

1) No user could have ever guessed such a definition,
therefore please add a mouseover, explaining such definition.

2) A speech by Abraham Lincoln might not contain any links, but will
still contain content, whereas a spamfarm page might be totally links,
but most would agree devoid of content. Therefore the wording is
misleading.

3) Nowhere else on the statistics page do we find anything close to 67!
Therefore the most important statistic is not presented.
Comment 11 Dan Jacobson 2014-02-01 03:24:37 UTC
P.S., I clicked [[Manual:$wgArticleCountMethod]] and it said no such page, and suggested "Did you mean: Manual:$articlecountupdate" which upon clicking doesn't exist either. (Bug: it shouldn't suggest things that it knows don't exist.)
Comment 12 Bawolff (Brian Wolff) 2014-02-01 03:44:42 UTC
Sorry, I meant [[mw:Manual:$wgArticleCountMethod]].

The definition is meant to exclude stubs on Wikipedia (like most things in MediaWiki, especially features that go way back, it's a bit Wikipedia-centric).

I would be fine with having explanatory text (I would suggest just having it small under the words Content pages). File a separate bug for that.

--------


> 
> 3) Nowhere else on the statistics page do we find anything close to 67!
> Therefore the most important statistic is not presented.

You're right, and that's obviously wrong. I notice some of those pages date back to 2005 - hard to know if it's some old bug or what. It should also be noted that some import scripts, like the Java dump importer, (probably) don't update the count properly. Or perhaps there are other bugs with the statistics counter.
Comment 13 Dan Jacobson 2014-02-01 03:56:21 UTC
Instead of me reporting more bugs, the whole statistics page needs to be rethought from the point of view of the man in the street.
Comment 14 Andre Klapper 2014-02-03 13:55:27 UTC
Likely nobody will ever work on that part if it's hidden in a comment in some bug report, instead of a clear and separate bug report.
Comment 15 Dan Jacobson 2014-02-19 05:31:12 UTC
Indeed.
