Last modified: 2013-03-18 16:09:51 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T35304, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 33304 - Inconsistencies in how lists of protected pages are found via API
Status: RESOLVED WORKSFORME
Product: MediaWiki
Classification: Unclassified
Component: API
Version: 1.18.x
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Sam Reed (reedy)
Keywords: platformeng
Depends on:
Blocks:
Reported: 2011-12-21 20:35 UTC by Bergi
Modified: 2013-03-18 16:09 UTC (History)
CC List: 12 users

Description Bergi 2011-12-21 20:35:39 UTC
I was asked to build a list of protected pages (at de:WP), sorted by protection date [1]. As the protection property of prop=info doesn't provide that data, I had to query the log for each page. OK, adding this information to the info module might be another bug, but while doing the query I found:

* pages listed by api.php?action=query&list=allpages&apprtype=move|edit (and having protection properties in prop=info), but having no log entries. Examples are [[de:Amerika]], [[de:Neonazismus]] and others. They can be found in the list [1] with "undefined" user/timestamp/comment.
* pages having more than two protection properties in prop=info. This happened over 300 times; if it is intended, please explain. For example, [[de:Nationalsozialistische Deutsche Arbeiterpartei]] was protected only once according to the log, but has 3 edit and 3 move protections (api.php?action=query&prop=info&inprop=protection&titles=Nationalsozialistische%20Deutsche%20Arbeiterpartei).
* pages listed at [[de:Spezial:Geschützte Seiten]], but not in the api query. For example [[de:Flatulenz]], which was protected, unprotected and re-protected according to the log and has two protection properties in prop=info, but is not listed in list=allpages: api.php?action=query&list=allpages&apprtype=move|edit&apprefix=Fl&prop=info&inprop=protection&titles=Flatulenz.

Could all these cases be caused by protect actions in previous MW versions (and conversion scripts missed them)? Are they intended? Should I file individual bugs for each?

[1]: The result of my query can be found at [[de:Wikipedia:Liste der am längsten geschützten Artikel]]
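
A minimal sketch of the two per-page lookups described above: the protection properties from prop=info and the protection log entries. The endpoint URL, the function names and the use of list=logevents with letype=protect are assumptions for illustration; the report itself only says that the log was queried for each page.

```python
from urllib.parse import urlencode

# Endpoint assumed for illustration (German Wikipedia).
API = "https://de.wikipedia.org/w/api.php"


def protection_info_url(title):
    """URL for the protection properties of one page (prop=info)."""
    params = {
        "action": "query",
        "prop": "info",
        "inprop": "protection",
        "titles": title,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def protection_log_url(title, limit=10):
    """URL for the protection log entries of one page (assumed: list=logevents)."""
    params = {
        "action": "query",
        "list": "logevents",
        "letype": "protect",
        "letitle": title,
        "lelimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)
```

Comparing the two responses for a title such as Flatulenz or Amerika is one way to spot pages that carry protection properties but have no matching log entry.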
Comment 1 Bergi 2011-12-22 08:46:50 UTC
* There are also pages which are listed as protected by the API's allpages (and which have protection properties in prop=info as well as some matching log entries), but which are a) editable and b) not listed on Special:Protected Pages. Examples for that are http://de.wikipedia.org/w/api.php?action=query&prop=info&inprop=protection&titles=Lansdowne%20Portrait|Schweizerische%20Käseunion|Bahnhof%20Eisenach|Eisenbahn%20in%20Thüringen
What is the difference in the database query between Special:Protected Pages and api.php?action=query&list=allpages&apprtype=move|edit?
Comment 2 db [inactive,noenotif] 2011-12-22 19:19:26 UTC
For the multiple entries see bug 28751

Protection moves have been logged for some versions now. Have a look at the move log to find the protection.

It is unclear why
api.php?action=query&generator=allpages&prop=info&inprop=protection&gapprefix=Flatulenz

lists the page with protection properties, but

api.php?action=query&generator=allpages&prop=info&inprop=protection&gapprefix=Flatulenz&gapprtype=edit|move

does not list the page at all. The page is protected. Maybe there is a problem with the old way in which protections were stored (or with the maintenance script)? This needs a look into the database.

The pages from comment 1 are move-protected only. They are editable but not movable, and they are listed:

api.php?action=query&generator=allpages&prop=info&inprop=protection&gapprefix=Bahnhof Eisenach&gapprtype=move

The difference is that Special:ProtectedPages can only filter for one type.
Comment 3 Siddhartha Ghai 2012-03-27 09:16:58 UTC
There are other problems too. Querying [1] results in a) a lot of duplicates and b) semi-protected pages being listed along with fully protected ones.

So far I haven't been able to check whether it respects the indefinite parameter or not.

Using it as a generator [2] removes the duplicates, but does not change the inclusion of semi-protected pages in the list.

Example: [[hi:अंकोरवाट मंदिर]] which is (as of now) semi-protected indefinitely, is included both times. See its log: [3].

[1]: http://hi.wikipedia.org/w/api.php?action=query&list=allpages&apprlevel=sysop&apprexpiry=indefinite&aplimit=500

[2]: http://hi.wikipedia.org/w/api.php?action=query&generator=allpages&gapprlevel=sysop&gapprexpiry=indefinite&gaplimit=500&prop=revisions&rvprop=timestamp

[3]: http://hi.wikipedia.org/w/index.php?title=%E0%A4%B5%E0%A4%BF%E0%A4%B6%E0%A5%87%E0%A4%B7%3ALog&page=%E0%A4%85%E0%A4%82%E0%A4%95%E0%A5%8B%E0%A4%B0%E0%A4%B5%E0%A4%BE%E0%A4%9F_%E0%A4%AE%E0%A4%82%E0%A4%A6%E0%A4%BF%E0%A4%B0
Comment 4 Sam Reed (reedy) 2012-06-19 21:28:44 UTC
https://de.wikipedia.org/w/api.php?action=query&prop=info&inprop=protection&titles=Nationalsozialistische%20Deutsche%20Arbeiterpartei&format=jsonfm

mysql> SELECT pr_page, pr_type, pr_level, pr_expiry, pr_cascade, page_namespace, page_title FROM page_restrictions, page WHERE page_id=pr_page AND pr_page=3627;
+---------+---------+---------------+-----------+------------+----------------+------------------------------------------------+
| pr_page | pr_type | pr_level      | pr_expiry | pr_cascade | page_namespace | page_title                                     |
+---------+---------+---------------+-----------+------------+----------------+------------------------------------------------+
|    3627 | edit    | autoconfirmed | infinity  |          0 |              0 | Nationalsozialistische_Deutsche_Arbeiterpartei |
|    3627 | move    | autoconfirmed | infinity  |          0 |              0 | Nationalsozialistische_Deutsche_Arbeiterpartei |
+---------+---------+---------------+-----------+------------+----------------+------------------------------------------------+
2 rows in set (0.00 sec)
Comment 5 Sam Reed (reedy) 2012-08-09 16:35:48 UTC
This bug is somewhat of a mess.

Multiple different points have been raised, and it makes it somewhat difficult to work out what's what.

Pages appearing as protected in one place but not in another is one issue.

Pages having multiple protection entries is something different entirely. A quick look at the ApiQueryInfo code suggests it's likely caused by protections being checked via different methods. It can probably be fixed simply: rather than just "blindly" adding another protection to the array, we key it with something (level maybe?) so duplicates won't be inserted.
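
The keying idea can be sketched roughly as follows. This is not the ApiQueryInfo code; the function names are hypothetical, and the choice of (type, level) as the key is an assumption (keying by level alone would merge an edit and a move protection that share the same level).

```python
def add_protection(protections, entry):
    """Store the entry under a (type, level) key so repeats overwrite each other."""
    key = (entry["type"], entry["level"])  # key choice is an assumption
    protections[key] = entry


def collect_protections(sources):
    """Merge protection entries found via several lookup methods, without duplicates."""
    protections = {}
    for source in sources:
        for entry in source:
            add_protection(protections, entry)
    return list(protections.values())
```

With three lookups that each report the same edit and move protection, the merged result holds two entries instead of six.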


Can we move these to separate bugs (and/or keep some of it here)? People then going "and then there is also this similar issue" and dumping more information onto the same bug isn't exactly helpful either.
Comment 6 Brad Jorsch 2013-01-30 01:43:04 UTC
(In reply to comment #5)
> This bug is somewhat of a mess.

That's an understatement.

(In reply to comment #0)
> * pages listed by api.php?action=query&list=allpages&apprtype=move|edit (and
> having protection properties in prop=info), but having no log entries.
> Examples
> are [[de:Amerika]], [[de:Neonazismus]] and others. They can be found in the
> list [1] with "undefined" user/timestamp/comment.

I note that both are very old pages. According to [[en:Wikipedia:Protection log]], protections were not automatically logged before 23 December 2004. Is it possible those pages were protected before that date (or the corresponding date for dewiki, if it differs)? If so, this is not a bug.

> * pages having more than two protection properties in prop=info. This
> happened
> over 300 times, if it is intended please explain. For example
> [[de:Nationalsozialistische Deutsche Arbeiterpartei]] was only one time
> protected pursuant to the log, but has 3 edit and 3 move protections
> (api.
> php?action=query&prop=info&inprop=protection&titles=Nationalsozialistische%20
> Deutsche%20Arbeiterpartei). 

As mentioned in comment 2, this is bug 28751.

> * pages listed at [[de:Spezial:Geschützte Seiten]], but not in the api query.
> For example [[de:Flatulenz]], which was protected, unprotected and
> re-protected
> according to the log and has two protection properties in prop=info, but is
> not
> listed in list=allpages:
> api.
> php?action=query&list=allpages&apprtype=move|edit&apprefix=Fl&prop=info&inpro
> p=protection&titles=Flatulenz.

That's an actual bug: old protections store indefinite protection with pr_expiry = NULL, but the query used by list=allpages does not take this into account.

Gerrit change #46662 should fix it.
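
A minimal sketch of why a NULL pr_expiry can drop a page from a filtered list. The actual query used by list=allpages is not shown in this report, so the "not yet expired" comparison, the simplified table and the cutoff timestamp below are assumptions; the point is only that a SQL comparison against NULL never matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_restrictions (pr_page INTEGER, pr_type TEXT, pr_expiry TEXT)"
)
conn.executemany(
    "INSERT INTO page_restrictions VALUES (?, ?, ?)",
    [
        (1, "edit", "infinity"),        # newer-style indefinite protection
        (2, "edit", None),              # old-style indefinite protection (NULL)
        (3, "edit", "20110101000000"),  # protection that has already expired
    ],
)

NOW = "20130130000000"  # hypothetical "current" timestamp


def protected_pages(handle_null):
    """Return page ids whose protection has not expired."""
    where = "pr_expiry > ?"
    if handle_null:
        # Treat NULL as "never expires", as old protections stored it.
        where = "(pr_expiry > ? OR pr_expiry IS NULL)"
    sql = f"SELECT pr_page FROM page_restrictions WHERE {where} ORDER BY pr_page"
    return [row[0] for row in conn.execute(sql, (NOW,))]
```

Without the extra IS NULL condition, page 2 is silently left out even though it is protected.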

(In reply to comment #1)
> * There are also pages which are listed as protected by api's allpages (and
> wich have protection properties in prop=info as also some matching log
> entries), but which are a) editable and b) not listed on Special:Protected
> Pages. Examples for that are
> http://de.wikipedia.org/w/api.
> php?action=query&prop=info&inprop=protection&titles=Lansdowne%20Portrait|Schw
> eizerische%20Käseunion|Bahnhof%20Eisenach|Eisenbahn%20in%20Thüringen
> What is the difference in the database query between Special:Protected Pages
> and api.php?action=query&list=allpages&apprtype=move|edit?

As mentioned in comment 2, these are not edit protected, only move protected. Special:ProtectedPages (now) has a dropdown to select the type of protection to search for; these pages will show up if you choose "move" rather than "edit".

(In reply to comment #3)
> There are other problems too. Querying [1] results in a) a lot of duplicates

Yeah, that shouldn't be happening. For some reason the module is only adding DISTINCT to the query when apprtype is used, while your query uses only apprexpiry.

Gerrit change #46665 should fix it.
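
The duplicates can be reproduced in a small sketch, assuming the list is built from a join of page against page_restrictions (simplified tables, not the real schema or the module's real query): a page with both an edit and a move restriction row matches twice unless DISTINCT is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE page (page_id INTEGER, page_title TEXT);
    CREATE TABLE page_restrictions (pr_page INTEGER, pr_type TEXT, pr_expiry TEXT);
    INSERT INTO page VALUES (1, 'A'), (2, 'B');
    INSERT INTO page_restrictions VALUES
        (1, 'edit', 'infinity'), (1, 'move', 'infinity'), (2, 'edit', 'infinity');
    """
)


def indefinitely_protected_titles(distinct):
    """List titles with an indefinite restriction, optionally de-duplicated."""
    select = "SELECT DISTINCT" if distinct else "SELECT"
    sql = (
        f"{select} page_title FROM page "
        "JOIN page_restrictions ON page_id = pr_page "
        "WHERE pr_expiry = 'infinity' ORDER BY page_title"
    )
    return [row[0] for row in conn.execute(sql)]
```

Page A has two restriction rows, so it appears twice in the plain join and once with DISTINCT.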

> and b) semi-protected pages are also listed along with full protected ones.

apprlevel is only effective when combined with apprtype. This is documented.

The error could be detected and reported when apprlevel is used with apprexpiry but not apprtype, but that would break clients that use queries like those in comment 3 and I'm not sure it's worth breaking backwards compatibility.
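
Since the API itself accepts such queries, a client can guard against the mistake on its own side. The helper below is a hypothetical client-side sketch, not part of MediaWiki; it only uses the parameter names that appear in this report (apprtype, apprlevel, apprexpiry, aplimit).

```python
def allpages_params(prtype=None, prlevel=None, prexpiry=None, limit=500):
    """Build list=allpages parameters, rejecting an apprlevel that would be ignored."""
    if prlevel and not prtype:
        # apprlevel is only effective when combined with apprtype.
        raise ValueError("apprlevel is only effective together with apprtype")
    params = {"action": "query", "list": "allpages", "aplimit": str(limit)}
    if prtype:
        params["apprtype"] = "|".join(prtype)
    if prlevel:
        params["apprlevel"] = "|".join(prlevel)
    if prexpiry:
        params["apprexpiry"] = prexpiry
    return params
```

For the queries in comment 3, passing prtype=["edit"] together with prlevel=["sysop"] would restrict the list to fully edit-protected pages.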
Comment 7 Andre Klapper 2013-03-18 16:09:51 UTC
Brad, thanks for the detailed analysis here, much appreciated.

Updating and summarizing based on comment 6 by Brad:

The first issue (before 2004) might not be a software bug.
The second issue is already handled in another bug report.
The two following issues have received patches that have been merged into the codebase, so they should be fixed.
The last issue is not a bug either; it is expected behavior.

So to me this report seems to be 1/5 INVALID, 2/5 FIXED, 1/5 DUPLICATE and 1/5 WONTFIX (in that order).

I'm setting RESOLVED WORKSFORME as that is somewhere in between all of these.

bergi: Please leave a comment here with exact steps to reproduce if any of the described (valid) issues still happens to you. Thanks!
