Last modified: 2013-02-13 12:15:17 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T43003, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 41003 - Pages with FlaggedRevisions protection set should give the latest-flagged version through search engine "API feeds"
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: FlaggedRevs
Version: master
Hardware: All All
Importance: Normal enhancement
Assigned To: Nobody - You can work on this!

Reported: 2012-10-13 14:25 UTC by Derk-Jan Hartman
Modified: 2013-02-13 12:15 UTC
Description Derk-Jan Hartman 2012-10-13 14:25:57 UTC
Per https://en.wikipedia.org/wiki/Wikipedia_talk:PC2012/RfC_2#Comment_-_The_Google_results_will_show_the_.22vandalized.22_versions

This undermines part of the goal of Pending changes: making vandalized pages less visible to the general public.
Comment 1 Brad Jorsch 2012-10-13 14:47:43 UTC
Somewhere in the linked discussions, Jdforrester (WMF) wrote:
> Unfortunately I've confirmed that the PC work did not include modifying the
> output of the API feeds in this way (so yes, GoogleBot et al. will get the
> "latest" version regardless of the page's PC state).

Which "API feeds", specifically, do Googlebot and other search engine bots use?
Comment 2 Cyberpower678 2012-10-14 16:01:16 UTC
(In reply to comment #1)
> Somewhere in the linked discussions, Jdforrester (WMF) wrote:
> > Unfortunately I've confirmed that the PC work did not include modifying the
> > output of the API feeds in this way (so yes, GoogleBot et al. will get the
> > "latest" version regardless of the page's PC state).
> 
> Which "API feeds", specifically, do Googlebot and other search engine bots use?

I do not know which API feeds, but it should be restricted for all feeds unless the account or bot that is requesting the latest revision has the reviewer bit. This is for security.
Comment 3 Brad Jorsch 2012-10-14 16:31:27 UTC
(In reply to comment #2)
> I do not know which API feeds, but it should be restricted for all feeds unless
> the account or bot that is requesting the latest revision has the reviewer bit.
> This is for security.

What "security"? Anyone, even people without accounts, can view the most recent revision of the article. This is by design. The ''only'' thing that pending changes does is change which revision is shown by default (i.e. when someone visits "http://en.wikipedia.org/wiki/Title").
Comment 4 Cyberpower678 2012-10-14 18:02:57 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > I do not know which API feeds, but it should be restricted for all feeds unless
> > the account or bot that is requesting the latest revision has the reviewer bit.
> > This is for security.
> 
> What "security"? Anyone, even people without accounts, can view the most recent
> revision of the article. This is by design. The ''only'' thing that pending
> changes does is change which revision is shown by default (i.e. when someone
> visits "http://en.wikipedia.org/wiki/Title").

That's not according to Wikipedia. It's only supposed to show the most recent approved revision and hide the unstable one until approved. Please elaborate.
Comment 5 Brad Jorsch 2012-10-14 19:58:18 UTC
(In reply to comment #4)
> That's not according to Wikipedia. It's only supposed to show the most recent
> approved revision and hide the unstable one until approved. Please elaborate.

No, it's not. It's supposed to show the most recent approved revision ''by default'', but there is nothing in the specification or the implementation on enwiki that says it's supposed to make the unreviewed revisions unviewable.

Try it out. Go to [[en:Wikipedia:Pending changes/Testing/4]] and either make an edit (if you're not a reviewer) or unapprove the most recent edit (if you are), and then look at the page while logged out. See the "Pending changes" tab between "Read" and "Edit"? Click that, and then click where it tells you there are unreviewed edits. Or click the "Edit" button and see how it tells you there are unreviewed edits and includes them in the edit box. Or click the History tab and just go to the unreviewed revision directly.
Comment 6 James Forrester 2012-10-17 17:04:02 UTC
Clarified the title to turn it into an 'ask'. This is aimed not just at GoogleBot, but clearly that's an important criterion.
Comment 7 Aaron Schulz 2012-10-17 17:06:01 UTC
(In reply to comment #1)
> Somewhere in the linked discussions, Jdforrester (WMF) wrote:
> > Unfortunately I've confirmed that the PC work did not include modifying the
> > output of the API feeds in this way (so yes, GoogleBot et al. will get the
> > "latest" version regardless of the page's PC state).
> 
> Which "API feeds", specifically, do Googlebot and other search engine bots use?

I don't know what feed this is either, but there are some vague, outdated meta pages that alluded to some extension or feed somewhere. Maybe brion knows?
Comment 8 Brad Jorsch 2012-10-17 19:05:41 UTC
(In reply to comment #6)
> Clarified the title to turn it into an 'ask'. This is aimed not just at
> GoogleBot, but clearly that's an important criterion.

Clarified slightly more. It seems unlikely to me that "API feeds" has anything to do with the MediaWiki API, since I doubt Google is constantly hitting and then parsing action=query&list=recentchanges.
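
For reference, a minimal sketch of what a request for action=query&list=recentchanges would look like, to make concrete the kind of polling a crawler would have to do. The endpoint URL, the helper name recent_changes_url, and the choice of rclimit/rcprop values are assumptions for illustration only; nothing in this report confirms that any search engine issues such requests. The sketch only builds the URL and makes no network call.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed endpoint for illustration; the report does not name one.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def recent_changes_url(limit=10):
    """Build a MediaWiki API URL for the recent changes list (hypothetical helper)."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rclimit": limit,                 # number of changes to return
        "rcprop": "title|ids|timestamp",  # fields to include per change
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


url = recent_changes_url(5)
query = parse_qs(urlsplit(url).query)  # decode the query string back into a dict
print(url)
print(query["list"])
```

A client would still have to fetch and parse each changed page separately, which is part of why this seems an unlikely mechanism for a general-purpose crawler.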
