Last modified: 2013-02-13 12:15:17 UTC
Per https://en.wikipedia.org/wiki/Wikipedia_talk:PC2012/RfC_2#Comment_-_The_Google_results_will_show_the_.22vandalized.22_versions: this undermines part of the goal of Pending changes, which is to make vandalized pages less visible to the general public.
Somewhere in the linked discussions, Jdforrester (WMF) wrote:
> Unfortunately I've confirmed that the PC work did not include modifying the
> output of the API feeds in this way (so yes, GoogleBot et al. will get the
> "latest" version regardless of the page's PC state).
Which "API feeds", specifically, do Googlebot and other search engine bots use?
(In reply to comment #1)
> Somewhere in the linked discussions, Jdforrester (WMF) wrote:
> > Unfortunately I've confirmed that the PC work did not include modifying the
> > output of the API feeds in this way (so yes, GoogleBot et al. will get the
> > "latest" version regardless of the page's PC state).
>
> Which "API feeds", specifically, do Googlebot and other search engine bots use?
I do not know which API feeds, but it should be restricted in all feeds unless the account or bot that is requesting the latest revision has the reviewer bit. This is for security.
(In reply to comment #2)
> I do not know which API feeds, but it should be restricted in all feeds unless
> the account or bot that is requesting the latest revision has the reviewer bit.
> This is for security.
What "security"? Anyone, even people without accounts, can view the most recent revision of the article. This is by design. The ''only'' thing that pending changes does is change which revision is shown by default (i.e. when someone visits "http://en.wikipedia.org/wiki/Title").
(In reply to comment #3)
> (In reply to comment #2)
> > I do not know which API feeds, but it should be restricted in all feeds unless
> > the account or bot that is requesting the latest revision has the reviewer bit.
> > This is for security.
>
> What "security"? Anyone, even people without accounts, can view the most recent
> revision of the article. This is by design. The ''only'' thing that pending
> changes does is change which revision is shown by default (i.e. when someone
> visits "http://en.wikipedia.org/wiki/Title").
That's not according to Wikipedia. It's only supposed to show the most recent approved revision and hide the unstable one until approved. Please elaborate.
(In reply to comment #4)
> That's not according to Wikipedia. It's only supposed to show the most recent
> approved revision and hide the unstable one until approved. Please elaborate.
No, it's not. It's supposed to show the most recent approved revision ''by default'', but there is nothing in the specification or the implementation on enwiki that says it's supposed to make the unreviewed revisions unviewable. Try it out. Go to [[en:Wikipedia:Pending changes/Testing/4]] and either make an edit (if you're not a reviewer) or unapprove the most recent edit (if you are), and then look at the page while logged out. See the "Pending changes" tab between "Read" and "Edit"? Click that, and then click where it tells you there are unreviewed edits. Or click the "Edit" button and see how it tells you there are unreviewed edits and includes them in the edit box. Or click the History tab and just go to the unreviewed revision directly.
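For illustration, the "Edit" and "History" routes described above correspond to ordinary index.php URLs that any logged-out client can request. This is only a minimal sketch: the base URL and the helper name `page_url` are assumptions made for the example, and it does not cover the "Pending changes" tab itself.

```python
from urllib.parse import urlencode

# Assumed base URL for English Wikipedia's index.php entry point.
INDEX_PHP = "https://en.wikipedia.org/w/index.php"


def page_url(title, action):
    """Build an index.php URL for a page title and a standard action.

    `page_url` is a hypothetical helper for this example only.
    Spaces are replaced with underscores, as in MediaWiki page URLs.
    """
    params = {"title": title.replace(" ", "_"), "action": action}
    return INDEX_PHP + "?" + urlencode(params)


test_page = "Wikipedia:Pending changes/Testing/4"

# The History tab lists every revision, reviewed or not.
print(page_url(test_page, "history"))

# The Edit view loads the text that would be edited.
print(page_url(test_page, "edit"))
```

Neither URL requires an account, which is the point made above: pending changes affects what is shown by default, not what can be viewed.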
Clarified the title to turn it into an 'ask'. This is aimed not just at GoogleBot, but clearly that's an important criterion.
(In reply to comment #1)
> Somewhere in the linked discussions, Jdforrester (WMF) wrote:
> > Unfortunately I've confirmed that the PC work did not include modifying the
> > output of the API feeds in this way (so yes, GoogleBot et al. will get the
> > "latest" version regardless of the page's PC state).
>
> Which "API feeds", specifically, do Googlebot and other search engine bots use?
I don't know what feed this is either, but there are some vague, outdated meta pages that alluded to some extension or feed somewhere. Maybe brion knows?
(In reply to comment #6)
> Clarified the title to turn it into an 'ask'. This is aimed not just at
> GoogleBot, but clearly that's an important criterion.
Clarified slightly more. It seems unlikely to me that "API feeds" has anything to do with the MediaWiki API, since I doubt Google is constantly hitting and then parsing action=query&list=recentchanges.
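For reference, a request of the kind mentioned above (action=query&list=recentchanges) could be built as follows. This is a minimal sketch, assuming the English Wikipedia endpoint at https://en.wikipedia.org/w/api.php; the helper name `recent_changes_url` and the `rclimit` and `format` values are illustrative choices, not something taken from this discussion.

```python
from urllib.parse import urlencode

# Assumed MediaWiki API endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def recent_changes_url(limit=10):
    """Build a recentchanges query URL (hypothetical helper for this example)."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rclimit": limit,   # number of changes to return (illustrative)
        "format": "json",   # response format (illustrative)
    }
    return API_ENDPOINT + "?" + urlencode(params)


print(recent_changes_url())
```

A crawler relying on this would need to poll the URL repeatedly and parse each response, which is the behaviour doubted above.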