Last modified: 2014-03-25 17:51:58 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T63133, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 61133 - Let Internet Archive's Wayback machine archive tools


Summary:	Let Internet Archive's Wayback machine archive tools

Status:	RESOLVED WONTFIX

Product:	Wikimedia Labs
Classification:	Unclassified
Component:	tools
Version:	unspecified
Hardware:	All All

Importance:	Normal enhancement
Target Milestone:	---
Assigned To:	Marc A. Pelletier

URL:	http://tools.wmflabs.org/robots.txt
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2014-02-10 11:09 UTC by Nemo
Modified:	2014-03-25 17:51 UTC
CC List:	2 users

See Also:	61132
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---

Description Nemo 2014-02-10 11:09:50 UTC

See bug 56893 for instructions.

Comment 1 Tim Landscheidt 2014-02-10 11:24:08 UTC

What does "archive tools" mean?  Tools (for the most part) process input and generate output from that.  A spider doesn't provide input and thus doesn't get output.

For "archiving tools" (i.e. the interesting bit, the processor), their source code needs to be put in a repository.  But neither Internet Archive nor any other spider can access private source code from the web.

Comment 2 Nemo 2014-02-10 11:31:29 UTC

(In reply to comment #1)
> What does "archive tools" mean?  Tools (for the most part) process input and
> generate output from that.  A spider doesn't provide input and thus doesn't
> get output.

Which is why this operation is inexpensive but will allow Wayback to archive URLs referenced from the web or by users.

Comment 3 Marc A. Pelletier 2014-03-25 17:50:11 UTC

Pages with dynamically generated content make no semantic sense to archive, and the cost in resources of allowing spidering of tool URLs is prohibitive.

Tool Labs is not intended for long-lived mostly static content (which is what archiving makes sense for); that data belongs on a wiki -- possibly generated and put there /by/ tools.

Comment 4 Nemo 2014-03-25 17:51:58 UTC

Rubbish
