Last modified: 2010-05-15 15:41:15 UTC
Go ahead: in LocalSettings.php put

    $wgNamespaceRobotPolicies = array( NS_WHATEVER => 'noindex,nofollow' );

Then edit some [[Test page]] and put a link to [[Whatever:Something]] in it. Make sure Whatever:Something exists. Now look at the rendering of [[Test page]].

Does the link to Whatever:Something have rel="nofollow" added? No.
Should it? Yes.
Does Whatever:Something have <meta name="robots" content="noindex,nofollow" /> in its header? Yes.
Is that good enough? Perhaps, but why cause the search engine an extra GET in the first place, only to laugh in its face: "Ha ha ha, fooled ya, no food here!"?

Please don't combine the several related bugs I sent today into one; given the current state of the software, they will certainly be fixed faster piecemeal.

What about http://meta.wikimedia.org/wiki/Robots.txt saying:

    The only way to keep a URL out of Google's index is to let Google crawl the page and see a meta tag specifying robots="noindex". Although this meta tag is already present on the edit page HTML template, Google does not spider the edit pages (because they are forbidden by robots.txt) and therefore does not see the meta tag.

So hmm, not sure.

P.S. I was going to use $wgNamespaceRobotPolicies = array( NS_SPECIAL => 'noindex,nofollow', but as my site depends on search engines crawling Special:Allpages to get its pages indexed, I will back off to just NS_SPECIAL => 'noindex', I suppose!

But wait, SpecialPage.php already has $wgOut->setRobotPolicy( "noindex,nofollow" );

Indeed, I was expecting Special:Allpages to be the main way for all search engines to index my site. Yes, I also have lots of categories, but they are empty pages. I suppose I must write zero bytes into those category pages so that links to them stop being "edit" links and can be followed...

OK, that's enough thinking for one day.
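For anyone trying to reproduce this, the setup from the top of the report might look roughly like the fragment below. It is only a sketch: NS_WHATEVER is not a built-in constant, so the namespace number 100 and the define() line are assumptions made for illustration, and the name 'Whatever' is simply taken from the example link.

```php
<?php
# LocalSettings.php fragment (sketch, not a complete configuration).

# Assumption: NS_WHATEVER is a custom namespace. The number 100 is
# chosen here only for illustration.
define( 'NS_WHATEVER', 100 );
$wgExtraNamespaces[NS_WHATEVER] = 'Whatever';

# The setting this report is about: pages in the namespace get
# <meta name="robots" content="noindex,nofollow" /> in their header.
$wgNamespaceRobotPolicies = array(
    NS_WHATEVER => 'noindex,nofollow',
);
```

With this in place, create Whatever:Something, link to it from [[Test page]], and compare the rendered link on the test page with the meta tag on the target page.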