Last modified: 2012-05-31 13:10:34 UTC
Following community discussion (see http://de.wikipedia.org/wiki/Wikipedia:Meinungsbilder/Indizierung_von_Benutzerseiten ) please disallow search engines from indexing the user namespace for the German Wikipedia by adding NS_USER => 'noindex,follow' to $wgNamespaceRobotPolicies accordingly.
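For reference, the requested change would look roughly like this in InitialiseSettings.php (the surrounding array layout here is a sketch, not the exact production file):

```php
// Sketch of the requested entry for $wgNamespaceRobotPolicies.
// The exact nesting in InitialiseSettings.php is assumed here:
'wgNamespaceRobotPolicies' => array(
	// ... existing per-wiki entries ...
	'dewiki' => array(
		NS_USER => 'noindex,follow',
	),
),
```

This tells search engines not to index pages in the user namespace on dewiki while still following links from them.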
You can add entries to the page MediaWiki:Robots.txt. There are already similar lines:

# Benutzerdiskussionsseiten
Disallow: /wiki/Benutzer_Diskussion:
Disallow: /wiki/Benutzer_Diskussion%3A
Disallow: /wiki/User_talk:
Disallow: /wiki/User_talk%3A
Please reopen if you need help with this.
This will still allow individual pages to opt in to indexing by adding the magic word __INDEX__ to the page source, right? Changing $wgNamespaceRobotPolicies seems to be generally favored by the community and was also done in similar cases (for example see bug 16247). (Can't reopen for some reason.)
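To illustrate the opt-in being discussed: a page in a noindexed namespace can request indexing again simply by including the magic word anywhere in its wikitext (the page name below is made up):

```
== Benutzer:Beispiel ==
__INDEX__
Rest of the user page content...
```

Whether this override actually takes effect depends on the wiki's configuration, which is exactly what the following comments try to pin down.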
No, this will alter robots.txt for dewiki.
So altering robots.txt (or MediaWiki:Robots.txt, which is the same thing as far as I know) will still allow individual pages to be indexed by adding __INDEX__?
Right, if you want to whitelist separate pages it won't work. Reopened.
(In reply to comment #6)
> Right, if you want to whitelist separate pages it won't work. Reopened.

Couldn't you add whitelisted pages to MediaWiki:Robots.txt? http://en.wikipedia.org/wiki/Robots_exclusion_standard#Allow_directive I realize this is not as scalable as __INDEX__, but maybe this feature could be added?
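The Allow-directive approach would look something like this in MediaWiki:Robots.txt (the whitelisted page name is hypothetical; Allow must precede the Disallow rules for crawlers that honor the first matching rule):

```
User-agent: *
Allow: /wiki/Benutzer:Beispiel
Disallow: /wiki/Benutzer:
Disallow: /wiki/Benutzer%3A
```

Note that Allow is a non-standard extension to the robots exclusion protocol, though the major crawlers support it.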
(In reply to comment #7)
> I realize this is not as scalable as __INDEX__, but maybe this feature could
> be added?

Actually, that is probably something a bot could do, right? Watch for new __INDEX__ uses and add them to MW:Robots.txt.
Sure, we could (although we would probably need to give a bot admin rights and I'm not sure we want that). But why not use $wgNamespaceRobotPolicies directly? Is there some technical problem I should know of? According to http://noc.wikimedia.org/conf/highlight.php?file=InitialiseSettings.php (not sure if I'm looking at the right file) there are already a lot of namespaces there in addition to the robots.txt one.
Please add NS_USER => 'noindex,follow' to the dewiki part of wgNamespaceRobotPolicies in InitialiseSettings.php. That is the easiest way and is already used for other namespaces on dewiki and on other wikis. There is no reason to use a more complicated technical approach when this easy one exists. Thanks.
Just to make this clear: the community decision has been made under the premise that it is possible to opt in to indexing (e.g. via __INDEX__). It is not acceptable to implement any solution that does not meet this requirement!
The week is over; please give an update on the status. Thanks.
Another week is over. Could a shell user or operator please add a comment or change the status of this bug, either if nobody is available to fix it or if you think it is already fixed? Thanks for a response.
Line added to InitialiseSettings.php with https://gerrit.wikimedia.org/r/#/c/9469/ — now it needs someone to merge and deploy.

(In reply to comment #11)
> Just to make this clear: the community decision has been made under the
> premise that it is possible to opt in indexing (e.g. by __INDEX__)
>
> It is not acceptable to implement any solution that does not provide this
> requirement!

Overriding with __INDEX__ is still possible. Tested on my local wiki.
Deployed by Reedy today. Thanks :)