Last modified: 2014-04-07 18:19:50 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T60805, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 58805 - Add robot policy for each namespace to the dumps
Status: NEW
Product: Datasets
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Ariel T. Glenn
Depends on:
Blocks:
Reported: 2013-12-21 17:57 UTC by Matthew Flaschen
Modified: 2014-04-07 18:19 UTC

Description Matthew Flaschen 2013-12-21 17:57:21 UTC
For individual pages, the __NOINDEX__ and __INDEX__ page properties can be used to determine overrides. They are available in page_props.sql, where the property names are lowercase and without the underscores (noindex and index).
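
A minimal sketch of how a dump consumer might apply such a page-level override, assuming the namespace's baseline policy string is already known. The function name apply_page_override is hypothetical. Filling in a missing directive with "index" or "follow", and letting noindex win when both properties are present, are assumptions made for illustration. The sketch also ignores any site configuration that restricts where these properties take effect.

```python
def apply_page_override(baseline, page_props):
    """Combine a baseline policy such as "noindex,nofollow" with the
    page's property names (a set such as {"index"} or {"noindex"})."""
    # Assumption: a directive missing from the baseline falls back to
    # "index" / "follow".
    parts = {"index": "index", "follow": "follow"}
    for token in baseline.split(","):
        token = token.strip()
        if token in ("index", "noindex"):
            parts["index"] = token
        elif token in ("follow", "nofollow"):
            parts["follow"] = token

    # The page properties only touch the index directive.
    # Assumption: noindex takes precedence if both are present.
    if "noindex" in page_props:
        parts["index"] = "noindex"
    elif "index" in page_props:
        parts["index"] = "index"

    return parts["index"] + "," + parts["follow"]


# A page carrying the "index" property in a noindex,nofollow namespace.
overridden = apply_page_override("noindex,nofollow", {"index"})
```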

However, the baseline robot policy for each namespace should also be dumped.  For each namespace, this can be determined by starting with [[mw:Manual:$wgDefaultRobotPolicy]] and then overriding it with [[mw:Manual:$wgNamespaceRobotPolicies]].  For convenience, the dump should state the policy for every namespace, even the ones that simply inherit $wgDefaultRobotPolicy; this is not a significant storage cost, since there are generally not many namespaces.
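
The lookup described above can be sketched as follows. This is only an illustration in Python, not the dump code itself. The function name baseline_robot_policies and the sample values are hypothetical, and the per-namespace setting is assumed to be a mapping from namespace ID to policy string.

```python
def baseline_robot_policies(default_policy, namespace_policies, namespace_ids):
    """Return {namespace_id: policy} for every namespace, including the
    ones that simply inherit the default."""
    return {
        ns_id: namespace_policies.get(ns_id, default_policy)
        for ns_id in namespace_ids
    }


# Hypothetical configuration values, for illustration only.
default_policy = "index,follow"              # stands in for $wgDefaultRobotPolicy
namespace_policies = {2: "noindex,follow"}   # stands in for $wgNamespaceRobotPolicies
namespace_ids = [0, 1, 2, 3]

policies = baseline_robot_policies(default_policy, namespace_policies, namespace_ids)
```

Every namespace ends up with an explicit entry, so a consumer of the dump would not need to know the default separately.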

I think it would be simplest to just use robotpolicy="noindex,nofollow" (or whatever the actual policy is) on each <namespace> element, since that's the format used in the HTML output and the configuration variables.
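
As a rough sketch of what that could look like, the snippet below builds one <namespace> element carrying the proposed attribute and reads it back. The robotpolicy attribute is the proposal from this report, not something the dumps are known to emit. The helper name namespace_element and the sample key, case and name values are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def namespace_element(key, name, case, robot_policy):
    """Build a <namespace> element with the proposed robotpolicy attribute."""
    elem = ET.Element(
        "namespace",
        {"key": str(key), "case": case, "robotpolicy": robot_policy},
    )
    if name:
        # Assumption: a namespace with an empty name gets no text content.
        elem.text = name
    return elem


# Serialize one example element, then parse it the way a dump reader might.
xml_text = ET.tostring(
    namespace_element(1, "Talk", "first-letter", "noindex,nofollow"),
    encoding="unicode",
)
parsed = ET.fromstring(xml_text)
policy = parsed.get("robotpolicy")
```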
