Last modified: 2014-10-25 13:28:32 UTC
The robots.txt rules are unnecessarily restrictive. As Bugzilla is being deprecated, and only a portion of its content migrated to Phabricator, it's essential that we allow third parties to do their job. All crawlers, or at least ia_archiver (the Wayback Machine), should be allowed to crawl any content that 1) doesn't specifically cause load issues and 2) is not being semantically migrated to Phabricator. Ideally we'd drop requirement (2), but let's start somewhere.

Example URLs which shouldn't be blocked:

* /page.cgi?id=voting/bug.html*
* /duplicates.cgi*
* /report.cgi* (unless it causes load issues)
* /weekly-bug-summary.cgi*
* /describecomponents.cgi*

In fact, is there any reason not to allow everything, minus the following?

* /show_bug.cgi
* /showdependencytree.cgi
* /query.cgi
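A minimal sketch of what the "allow everything minus those three endpoints" proposal could look like. This is an illustration, not the actual Wikimedia robots.txt; note that with no matching Disallow, everything else is crawlable by default, so only the excluded CGI endpoints need listing:

```
# Sketch: permit all crawlers, excluding only the load-sensitive endpoints.
# Paths not matched by any Disallow rule are allowed by default.
User-agent: *
Disallow: /show_bug.cgi
Disallow: /showdependencytree.cgi
Disallow: /query.cgi
```

Prefix matching means these rules also cover query strings such as /show_bug.cgi?id=123, which is where the crawl load on a Bugzilla instance typically comes from.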
Duplicate of bug 13881?