Last modified: 2013-02-25 19:04:57 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links may be broken. See T47347, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 45347 - Wikimedia sites should provide humans.txt file
Status: RESOLVED WONTFIX
Product: Wikimedia
Classification: Unclassified
Component: Site requests (Other open bugs)
Version: wmf-deployment
Hardware: All
OS: All
Importance: Lowest enhancement (vote)
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://humanstxt.org/Standard.html#mo...
Keywords: easy
Depends on:
Blocks:
Reported: 2013-02-25 08:41 UTC by Mathias Schindler
Modified: 2013-02-25 19:04 UTC (History)
CC List: 10 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Mathias Schindler 2013-02-25 08:41:44 UTC
http://en.wikipedia.org/wiki/Humans.txt is the equivalent of robots.txt; neither is defined by a standard.

The file to be created, en.wikipedia.org/humans.txt, should briefly describe the page and invite contributors.
Comment 1 MZMcBride 2013-02-25 08:50:01 UTC
[[wikitech:robots.txt]] explains how https://en.wikipedia.org/robots.txt and others are currently generated. A similar approach here may make sense.

The relevant script (robots.php) is here: <https://gerrit.wikimedia.org/r/gitweb?p=operations/mediawiki-config.git;a=blob;f=live-1.5/robots.php;hb=HEAD>.
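
The per-wiki portion of the generated robots.txt comes from an on-wiki page, so its raw wikitext can be inspected through MediaWiki's standard action=raw entry point, for example (English Wikipedia shown; a humans.txt equivalent could presumably be fed from a page such as MediaWiki:Humans.txt in the same way):

    https://en.wikipedia.org/w/index.php?title=MediaWiki:Robots.txt&action=raw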
Comment 2 p858snake 2013-02-25 08:58:52 UTC
"Developers are free to include whatever content they wish. The primary goal is to acknowledge contributors to a site."

I don't really see the point of us even having this file. Care to provide some more context?
Comment 3 Dereckson 2013-02-25 09:17:41 UTC
[ humans.txt standard ]

First, humans.txt HAS a proposed standard, which this example follows: http://humanstxt.org/humans.txt

Some sites don't follow it, e.g. http://www.google.com/humans.txt

It's documented on http://humanstxt.org/Standard.html (scroll down).
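
For reference, the format described there is a plain-text file with commented section headers; a minimal example along the lines of the published standard (field values below are placeholders, not actual Wikimedia content) looks roughly like this:

    /* TEAM */
        Your title: Your name
        Site: e-mail or contact link
        Location: City, Country

    /* THANKS */
        Name: name or URL

    /* SITE */
        Last update: YYYY/MM/DD
        Standards: HTML5, CSS3
        Components: jQuery, ...
        Software: MediaWiki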


[ robots.txt standard ]

There is a de facto standard, understood by the crawlers and bots visiting your site. See http://en.wikipedia.org/wiki/Robots_exclusion_standard


[ Is humans.txt an equivalent to robots.txt? ]

The robots.txt file is useful and needed by the community to tell general search engines not to index some sensitive pages (e.g. deletion votes), or to tell specific engines not to crawl the site at all.

See http://en.wikipedia.org/robots.txt for the list.

The humans.txt file would contain arbitrary information that is already offered, and better kept up to date, on project pages. I mainly fear that keeping humans.txt updated would be complicated or neglected.


[ Goal ]

You propose to "describe the page and invite contributors".

First, visitors shouldn't need to read humans.txt to be invited to contribute. It's rather dubious this outreach strategy would work.

Finally, you seem to have missed the goal of the humans.txt specification authors: they don't want so much to provide a text file to humans as to provide a file describing the humans behind the site (like a colophon).


[ Content if we follow the standard ]

Credits (team or thanks) would be too large to list: there are too many active contributors. A generic mention like "See page histories" won't really be useful.

The tools used to build the sites vary widely. Sure, we use MediaWiki. But beyond that, every article editor has their favorite tools, every developer their favorite environment, and there are the bots, the external tools, etc.


[ Technical implementation ]

The humans.txt file should also be considered for localization into every language we support; any proposed solution should include an l10n effort. This localization should remain compatible with the standard, which is currently English only.

Note we already have code that generates robots.txt from the MediaWiki:Robots.txt system message, so technically this is fairly simple to implement.


[ Next steps ]

Given the analysis above and the dubious benefits we would get, I recommend a WONTFIX resolution.

If you really want to go forward with this proposal, please submit some sample humans.txt content, so we'll have a basis for discussion. I will then give you some technical notes about how to generate this content if there are fields to automate. Then, you'll be able to launch a discussion with the community on en. or meta.
Comment 4 MZMcBride 2013-02-25 09:26:10 UTC
(In reply to comment #3)
> Note we already have code that generates robots.txt from the
> MediaWiki:Robots.txt system message, so technically this is fairly simple to
> implement.

Trivial, even.

> If you really want to go forward with this proposal, please submit some
> sample humans.txt content, so we'll have a basis for discussion. I will then
> give you some technical notes about how to generate this content if there are
> fields to automate. Then, you'll be able to launch a discussion with the
> community on en. or meta.

I'm not sure a sample humans.txt file is needed. Project autonomy and sovereignty can probably guide us here. We could easily implement the ability to output a 404 error at /humans.txt unless a domain's [[MediaWiki:humans.txt]] page exists. For example, [[MediaWiki:humans.txt]] would control <https://en.wikipedia.org/humans.txt>. This approach would allow projects to decide for themselves whether to have a file like this (if no MediaWiki page exists --> no file exists) and what the file should contain if it's to exist, based on local community consensus.
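
A minimal sketch of that approach, loosely modeled on how robots.php is wired up (the script name humans.php, the page title MediaWiki:Humans.txt, and the HTTP fetch via action=raw are assumptions for illustration; a production version would read the page through the local MediaWiki installation and validate the requested host):

    <?php
    // Hypothetical humans.php: serve the requested wiki's
    // [[MediaWiki:Humans.txt]] page as /humans.txt, or return a 404 if no
    // such page exists.

    $host = isset( $_SERVER['HTTP_HOST'] ) ? $_SERVER['HTTP_HOST'] : 'en.wikipedia.org';

    // action=raw returns the raw wikitext of a page and responds with an
    // HTTP 404 when the page does not exist, so file_get_contents() yields
    // false in that case.
    $url = 'https://' . $host . '/w/index.php?title=MediaWiki:Humans.txt&action=raw';
    $content = @file_get_contents( $url );

    if ( $content === false || trim( $content ) === '' ) {
        header( 'HTTP/1.1 404 Not Found' );
        header( 'Content-Type: text/plain; charset=utf-8' );
        echo "No humans.txt has been set up for this wiki.\n";
        exit;
    }

    header( 'Content-Type: text/plain; charset=utf-8' );
    echo $content;

Per-language subpages (e.g. a hypothetical MediaWiki:Humans.txt/de) could be resolved the same way if the l10n concern from comment 3 needed addressing.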
Comment 5 Max Semenik 2013-02-25 09:29:08 UTC
Go to http://en.wikipedia.org

You will see:

Welcome to Wikipedia,
the free encyclopedia that anyone can edit.
4,172,968 articles in English

-that's the best description.
Comment 6 Dereckson 2013-02-25 09:30:58 UTC
MaxSem > The goal of humans.txt isn't to describe a site.
Comment 7 MZMcBride 2013-02-25 09:33:07 UTC
Let's please not be so quick to shoot down bugs.
Comment 8 Andre Klapper 2013-02-25 15:01:45 UTC
(In reply to comment #0)
> http://en.wikipedia.org/wiki/Humans.txt is the equivalent of robots.txt;
> neither is defined by a standard.

1) Is there any real use? Who would access that file? robots.txt is at least read by crawlers, but this just sounds like creating yet another file with duplicated data for no good purpose.

2) What is the scope of who to list in that file? It feels extremely vague to me. We wouldn't list all Wikipedia editors and try to keep that updated, would we?
Comment 9 Mathias Schindler 2013-02-25 16:50:09 UTC
Okay, here is why I filed the feature request. I was doing some research in Wikipedia about robots.txt and I came across a see also section in the respective Wikipedia article that linked to the article on humans.txt.

There was little reasoning behind requesting a file for Wikimedia project sites except for "this looks nice and some projects seem to offer these files".

Shooting down the bug report will not hurt my feelings.
Comment 10 Daniel Friesen 2013-02-25 17:03:49 UTC
I really don't like the idea of humans.txt. /robots.txt made sense because a robot needs a fixed location to understand if it's even allowed to fetch any page on the website at all.

humans.txt, however, does not need that; it's just pollution of the website root path. And it doesn't even take into account separate sites on the same domain, etc.

Authorship, credits, acknowledgement about the people behind the site should be some form of metadata or content within the site. NOT something shoved in the web root.
Comment 11 Dereckson 2013-02-25 19:04:57 UTC
This is mainly a web agency thing. And we aren't a web agency.

Here is the Humans TXT rationale:

"Because it's something simple and fast to create. Because it's not intrusive with the code. More often than not, the owners of the site don't like the authors signing it; they claim that doing so may make the site less efficient. By adding a txt file, you can prove your authorship (not your property) in an external, fast, easy and accessible way."

Well... okay, but we have gazillions of people to credit, and the information is already accessible.
