Last modified: 2012-07-14 03:44:29 UTC
On the special page Special:UnreviewedPages there is no way to select all namespaces. In combination with a category filter this is sometimes necessary, because a category can contain reviewable pages from different namespaces. Please add an "(all)" option to the namespace selector, like the one on Special:PendingChanges. Thanks.
This option was left out for performance reasons. There is no (page_title) index, only (page_namespace,page_title). We only want to query pages where page_namespace refers to a reviewable namespace, so we want to use the (page_namespace,page_title) index. If we want "all reviewable namespaces", we have to sort/page on (namespace,title) and not just (title) anymore, so that the index order can be used and no separate sort is needed. If we sorted/paged on page_id, an "(all)" option would be doable, though slower given that the "page_namespace is reviewable" condition would not use an index.
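As a rough illustration of paging on (namespace,title), here is a minimal sketch using Python's sqlite3 module. The table is a heavily simplified stand-in for the real page table, the sample rows and the helper names `make_page_db` and `next_page` are hypothetical, and the choice of 0 and 10 as the reviewable namespaces is only an assumption for the example. It shows the paging order, not the actual FlaggedRevs query.

```python
import sqlite3


def make_page_db():
    """Build a tiny in-memory stand-in for the page table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page ("
        " page_id INTEGER PRIMARY KEY,"
        " page_namespace INTEGER NOT NULL,"
        " page_title TEXT NOT NULL)"
    )
    # Composite index, as described in the comment above: no index on title alone.
    conn.execute(
        "CREATE UNIQUE INDEX name_title ON page (page_namespace, page_title)"
    )
    conn.executemany(
        "INSERT INTO page VALUES (?, ?, ?)",
        [
            (1, 0, "Zebra"),
            (2, 10, "Infobox"),
            (3, 0, "Apple"),
            (4, 2, "Sandbox"),  # namespace 2 is treated as not reviewable here
            (5, 10, "Cite"),
            (6, 0, "Mango"),
        ],
    )
    return conn


def next_page(conn, namespaces, after, limit):
    """Return up to `limit` rows ordered by (namespace, title).

    `after` is the (namespace, title) pair of the last row already shown,
    or None for the first page.
    """
    marks = ",".join("?" * len(namespaces))
    sql = (
        "SELECT page_namespace, page_title FROM page"
        f" WHERE page_namespace IN ({marks})"
    )
    params = list(namespaces)
    if after is not None:
        # Continue strictly after the last row in (namespace, title) order.
        sql += (
            " AND (page_namespace > ?"
            " OR (page_namespace = ? AND page_title > ?))"
        )
        params += [after[0], after[0], after[1]]
    sql += " ORDER BY page_namespace, page_title LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()


conn = make_page_db()
first = next_page(conn, [0, 10], None, 2)
second = next_page(conn, [0, 10], first[-1], 2)
```

The visible consequence is that results come out grouped by namespace first (all of namespace 0, then namespace 10), which is the ordering change an "(all)" option would impose on the page.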
But PendingChanges uses this condition too, and it causes no problem there because there is no GROUP BY. Maybe a parent_id = 0 condition could help to get the first revision of a page, and then the MIN(rev_timestamp) would be cheaper or not needed, because in most cases there is only one revision with parent_id = 0 for a page_id.
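To make the suggestion concrete, the following sketch compares the two ways of finding a page's creation time on a toy table. The schema is a simplified assumption (the "parent_id" above is modelled as a `rev_parent_id` column), the rows are invented, and the helper names are hypothetical. It only shows that both queries agree when each page has exactly one revision with parent 0; it says nothing about real query cost.

```python
import sqlite3


def make_revision_db():
    """Build a tiny in-memory stand-in for the revision table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revision ("
        " rev_id INTEGER PRIMARY KEY,"
        " rev_page INTEGER NOT NULL,"
        " rev_parent_id INTEGER NOT NULL,"
        " rev_timestamp TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO revision VALUES (?, ?, ?, ?)",
        [
            (1, 1, 0, "20100101000000"),  # first revision of page 1
            (2, 1, 1, "20110101000000"),
            (3, 2, 0, "20100505000000"),  # first revision of page 2
            (4, 2, 3, "20100606000000"),
            (5, 3, 0, "20120101000000"),  # page 3 has a single revision
        ],
    )
    return conn


def creation_by_min(conn):
    """Creation time per page via GROUP BY and MIN(rev_timestamp)."""
    sql = "SELECT rev_page, MIN(rev_timestamp) FROM revision GROUP BY rev_page"
    return dict(conn.execute(sql))


def creation_by_parent(conn):
    """Creation time per page via the rev_parent_id = 0 condition, no grouping."""
    sql = "SELECT rev_page, rev_timestamp FROM revision WHERE rev_parent_id = 0"
    return dict(conn.execute(sql))


conn = make_revision_db()
same = creation_by_min(conn) == creation_by_parent(conn)
```

If a page had more than one revision with parent 0, the second query would return several rows for it, so the grouping could only be dropped where that case is known not to occur.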
PendingChanges uses either the flaggedpages table (where pending items are indexed) or the flaggedpages_pending table (which is a denormalization). Both work well for this purpose and don't require scanning massive portions of the page table. The number of pending pages tends to range from zero to a few tens of thousands, so filtering by namespace is easy. For UnreviewedPages, the only way to do this well is to page on page_id rather than page_title. It would be confusing to page differently depending on whether a namespace is provided, so we would have to always page by page_id, which would be slower if I specify a namespace that only gets a small portion of edits. UnreviewedPages is really only useful for finding really old pages that still haven't been reviewed. It seems like NewPages (filtered for unpatrolled) would be more useful. Maybe that can have a category selector?
(In reply to comment #3)
> PendingChanges uses either the flaggedpages table (where pending items are
> indexed) or the flaggedpages_pending table (which is a denormalization). Both
> work well for this purpose and don't require scanning massive portions of the
> page table. The number of pending pages tends to range from zero to a few
> tens of thousands, so filtering by namespace is easy.
>
> The only way to do this well is to page on page_id rather than page_title. It
> would be confusing to page differently depending on whether a namespace is
> provided, so we would have to always page by page_id, which would be slower if
> I specify a namespace that only gets a small portion of edits.
>
> UnreviewedPages is really only useful for finding really old pages that still
> haven't been reviewed. It seems like NewPages (filtered for unpatrolled) would
> be more useful. Maybe that can have a category selector?

Another problem with paging on page_id is that it increases the average number of rows to be scanned when "oldest" is selected, since older pages are more likely to be reviewed (whereas paging on page_title is more random with respect to review status). Another trick would be to still page on page_title but do a UNION query (like recentchanges) on all reviewable namespaces. Given how many rows this special page already has to scan, I'd be a bit hesitant to do that either, though it seems to still be fairly fast on dewiki.
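The UNION idea could look roughly like the sketch below, again using Python's sqlite3 module against a simplified stand-in for the page table. Everything here is an assumption for illustration: the schema, the sample rows, the helper names `make_page_db` and `union_page`, and the namespace tie-breaker for identical titles. Each branch reads one namespace in title order, and the outer query merges the branches.

```python
import sqlite3


def make_page_db():
    """Build a tiny in-memory stand-in for the page table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page ("
        " page_id INTEGER PRIMARY KEY,"
        " page_namespace INTEGER NOT NULL,"
        " page_title TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX name_title ON page (page_namespace, page_title)"
    )
    conn.executemany(
        "INSERT INTO page VALUES (?, ?, ?)",
        [
            (1, 0, "Zebra"),
            (2, 10, "Infobox"),
            (3, 0, "Apple"),
            (4, 2, "Sandbox"),  # namespace 2 is treated as not reviewable here
            (5, 10, "Cite"),
            (6, 0, "Mango"),
            (7, 10, "Mango"),  # same title in a second namespace
        ],
    )
    return conn


def union_page(conn, namespaces, after, limit):
    """Page on title across namespaces with one UNION ALL branch per namespace.

    `after` is the (title, namespace) pair of the last row already shown,
    or None for the first page.
    """
    parts, params = [], []
    for ns in namespaces:
        branch = "SELECT page_title, page_namespace FROM page WHERE page_namespace = ?"
        params.append(ns)
        if after is not None:
            last_title, last_ns = after
            # Namespaces sorting after the last row's namespace may still
            # hold a page with the same title; the others must move past it.
            op = ">=" if ns > last_ns else ">"
            branch += f" AND page_title {op} ?"
            params.append(last_title)
        # No branch can contribute more than `limit` rows to the final page.
        branch += " ORDER BY page_title LIMIT ?"
        params.append(limit)
        parts.append(f"SELECT * FROM ({branch})")
    sql = (
        "SELECT page_title, page_namespace FROM ("
        + " UNION ALL ".join(parts)
        + ") ORDER BY page_title, page_namespace LIMIT ?"
    )
    params.append(limit)
    return conn.execute(sql, params).fetchall()


conn = make_page_db()
first = union_page(conn, [0, 10], None, 4)
second = union_page(conn, [0, 10], first[-1], 4)
```

The cost concern raised above still applies: every branch may have to scan many rows before it finds enough unreviewed pages, and the sketch does not model that filter at all.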
Closing until there is more demand for this.