Last modified: 2014-10-29 06:31:24 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T57016, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 55016 - Extended version information in user-agent
Status: NEW
Product: Pywikibot
Classification: Unclassified
Component: General
Version: core-(2.0)
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Pywikipedia bugs
Reported: 2013-10-05 04:04 UTC by Kunal Mehta (Legoktm)
Modified: 2014-10-29 06:31 UTC (History)

Description Kunal Mehta (Legoktm) 2013-10-05 04:04:33 UTC
Originally from: http://sourceforge.net/p/pywikipediabot/feature-requests/330/
Reported by: valhallasw
Created on: 2013-02-04 20:52:53
Subject: Extended version information in user-agent
Original description:
See the discussion at https://www.mediawiki.org/wiki/Special:Code/pywikipedia/11027#c33303

Implementation notes:

Hash of a file:
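>>> import hashlib
>>> m = hashlib.sha1()
>>> m.update(open(filename, 'rb').read())  # assumed step, omitted in the report; filename is a placeholder
>>> m.hexdigest()
'93ae86148e74a7c3a3d63f7810b48c51889fba46'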

Classes used in stack trace:

>>> import inspect
>>> [(x.__module__, x.__name__) for x in (s[0].f_locals.get('self', None).__class__ for s in inspect.stack())]

Example result:
[('wikipedia_family', 'Family'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('wikipedia_family', 'Family'), ('wikipedia', 'Site'), ('wikipedia', 'Site'), ('wikipedia', 'Site'), ('wikipedia', 'Site'), ('wikipedia', 'Site'), ('wikipedia', 'Page'), ('wikipedia', 'Page'), ('__main__', 'Subject'), ('__main__', 'Subject'), ('__main__', 'InterwikiBot'), ('__main__', 'InterwikiBot'), ('__builtin__', 'NoneType'), ('__builtin__', 'NoneType'), ('__builtin__', 'NoneType'), ('pdb', 'Pdb'), ('pdb', 'Pdb'), ('__builtin__', 'NoneType'), ('__builtin__', 'NoneType'), ('__builtin__', 'NoneType'), ('__builtin__', 'NoneType')]
Comment 1 John Mark Vandenberg 2014-08-07 07:08:48 UTC
In July there were three threads of discussion about user-agents for bots.

http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/78356

http://lists.wikimedia.org/pipermail/pywikipedia-l/2014-July/008924.html

http://lists.wikimedia.org/pipermail/pywikipedia-l/2014-July/008932.html

Following that, Amir did some work to allow customisation of the user-agent, specifically adding site-based information (https://gerrit.wikimedia.org/r/#/c/147381/), and I've put up a patch to make the user-agent functionality usable in more circumstances and easier to test: https://gerrit.wikimedia.org/r/#/c/152200/

As it is now a customisable string, it is possible to add email addresses, links to bot approvals, etc.  This is lightly documented at

https://www.mediawiki.org/wiki/Manual:Pywikibot/User-agent

During the discussions I suggested something like what this bug is about: identifying what code is running (whether it is the 'maintainer' version or one customised by the bot operator) and putting that in the user-agent.

http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/78363
http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/78413

In an IRC discussion with valhallasw, I suggested that we include the contact details of the maintainers of the running script, which can be parsed from the script docstring.

So, currently pywikibot _has_ the commit hash, sequential pywikibot revision, and 

It only puts the sequential pywikibot revision into the user-agent, in the variable {version}.  The sequential pywikibot revision is a good reference point, but no more than that: the running code could be different.

The commit hash is (almost) useless for ops staff, as it frequently changes.

The $Rev$ for each file that is checked in is more granular, but that doesn't help if the script file is modified or isn't checked in.

If I understand this enhancement request correctly, it is suggesting that we get a hash of some or all of the modules that are used by the running script, and include that in the user-agent.

IMO, the first step is to get a hash for the script/module executed on the command line.  This hash will change less frequently, and will often be common even for different branches of pywikibot.  If the file is unmodified, I suggest we keep the existing user-agent value for {script}/{version}, which has a version prefix of 'g' and 's' for git or subversion.  If the file is modified, I suggest we put the file hash in {version}, with a different prefix - e.g. 'm69789e1' where 'm' is for 'modified'.
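
A minimal sketch of the rule suggested above, for discussion only. The function names, and the idea of passing in a known hash of the script as shipped, are assumptions made for illustration; none of this is existing pywikibot API.

```python
import hashlib


def file_sha1(path):
    """Return the hex SHA-1 digest of the file at path."""
    digest = hashlib.sha1()
    with open(path, 'rb') as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b''):
            digest.update(chunk)
    return digest.hexdigest()


def version_token(path, released_sha1, vcs_version):
    """Choose the value to substitute for {version}.

    released_sha1 (hash of the script as shipped) and vcs_version (the
    existing 'g...' or 's...' string) are hypothetical inputs.
    """
    current = file_sha1(path)
    if current == released_sha1:
        # Unmodified file: keep the existing git/subversion version string.
        return vcs_version
    # Modified file: 'm' prefix plus a short form of the file hash.
    return 'm' + current[:7]
```

How the hash of the shipped script would be obtained (from the repository, or from a manifest generated at release time) is left open here.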
Comment 2 Gerrit Notification Bot 2014-08-07 22:36:17 UTC
Change 152200 had a related patch set uploaded by John Vandenberg:
User-agent graceful degradation

https://gerrit.wikimedia.org/r/152200
Comment 3 Gerrit Notification Bot 2014-09-03 21:11:09 UTC
Change 152200 merged by jenkins-bot:
User-agent graceful degradation

https://gerrit.wikimedia.org/r/152200
Comment 4 Sorawee Porncharoenwase 2014-10-29 06:16:19 UTC
@John Mark Vandenberg: Is the bug fixed?
Comment 5 John Mark Vandenberg 2014-10-29 06:31:24 UTC
Not yet.  We would like to add maintainer details to the user agent, where the maintainer can be different for each script, and would be obtained via a module variable __maintainer__ or similar.
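
A rough sketch of how that lookup might work, assuming the script sets a module-level __maintainer__ string. The helper names and the "(maintainer: ...)" comment format are illustrative assumptions, not an agreed design or existing pywikibot code.

```python
import sys


def maintainer_of(module_name='__main__'):
    """Return the module's __maintainer__ string, or None if it is not set."""
    module = sys.modules.get(module_name)
    return getattr(module, '__maintainer__', None)


def with_maintainer(user_agent, maintainer):
    """Append maintainer details to a user-agent string as a comment."""
    if not maintainer:
        # No maintainer declared: leave the user-agent unchanged.
        return user_agent
    return '{0} (maintainer: {1})'.format(user_agent, maintainer)
```

Looking up '__main__' by default picks the script executed on the command line, so each script can declare its own maintainer.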
