Last modified: 2014-04-17 17:59:29 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T57109, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 55109 - Multilingual development
Status: NEW
Product: Pywikibot
Classification: Unclassified
Component: General
Version: core-(2.0)
Hardware: All All
Importance: Low enhancement
Target Milestone: ---
Assigned To: Pywikipedia bugs
Reported: 2013-10-05 04:25 UTC by Kunal Mehta (Legoktm)
Modified: 2014-04-17 17:59 UTC (History)

Description Kunal Mehta (Legoktm) 2013-10-05 04:25:17 UTC
Originally from: http://sourceforge.net/p/pywikipediabot/feature-requests/101/
Reported by: Anonymous user
Created on: 2007-08-06 17:47:39
Subject: Multilingual development
Original description:
==English== Translated from the French original

Hello,

I would like to propose a system that makes the bot multilingual. At present, all messages sent to the console are in English, yet a bot should be able to adapt to the many languages its users may speak. I therefore propose the following system:

Create a new directory 'lang'. This directory would contain files named XX.py (XX being the ISO 639 code of the language), so it would hold one file per existing language code.

When the various scripts want to display a message on the console, the command used is very often 'wikipedia.output' or 'wikipedia.input'. The job of this command would be to call the xx.py file, passing the number of the message to return as a parameter. The choice of xx would be given by the mylang variable in user-config.py. The xx.py file would then return the message to display, taking into account any %s (or other) placeholders, of course.

Example:
In user-config.py, I have "mylang = 'fr'".
Replace.py, at line 375, contains the command "wikipedia.input(u'Please enter the new text:')".

The new system would replace "wikipedia.input(u'Please enter the new text:')" with "wikipedia.input.message(284)", which would call lang/fr.py and ask it for message no. 284, namely "S'il vous plaît, entrez le nouveau texte :".

I hope I have made my request clear.

Thank you for your attention.
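
As a rough illustration of the numbered-message lookup proposed above, the sketch below keeps the catalogs in a single dictionary instead of separate lang/xx.py files. The names `CATALOGS` and `message` are hypothetical and not part of Pywikibot; the fallback to English is also an assumption.

```python
# Hypothetical stand-in for the proposed lang/en.py and lang/fr.py files:
# each language maps a message number to a format string.
CATALOGS = {
    'en': {284: u'Please enter the new text:'},
    'fr': {284: u"S'il vous plaît, entrez le nouveau texte :"},
}


def message(number, mylang='en', *args):
    """Return message `number` in `mylang`, falling back to English."""
    english = CATALOGS['en']
    catalog = CATALOGS.get(mylang, english)
    template = catalog.get(number, english[number])
    # Fill in %s-style placeholders only when values were supplied.
    return template % args if args else template
```

With `mylang = 'fr'`, `message(284, 'fr')` would return the French prompt, while an unsupported language code falls back to the English text.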
Comment 1 Kunal Mehta (Legoktm) 2013-10-05 04:25:19 UTC
Logged In: YES 
user_id=880694
Originator: NO

Your approach works for static strings like wikipedia.input(u'Please enter the new text: '), but it doesn't work for dynamic ones like output(u'Page %s moved to %s' % (self.title(), newtitle)).

Instead, we could use gettext: http://docs.python.org/lib/module-gettext.html

What do you think?
Comment 2 Kunal Mehta (Legoktm) 2013-10-05 04:25:21 UTC
- **labels**: 745454 -->
Comment 3 Kunal Mehta (Legoktm) 2013-10-05 04:25:23 UTC
Logged In: YES 
user_id=687283
Originator: NO

I have no idea whether gettext is usable on non-Unix platforms, or whether it can use locally saved translations. If it can, I think we might give it a try. We would need to change all 'text %s %s' % (self.title(), newtitle) commands to their dictionary relatives, 'text %(title)s %(newtitle)s' % {'title': self.title(), 'newtitle': newtitle}. Not a small job, but doable, I suppose.

If gettext is not (easily) available on win32, we might write our own system. I do not like the idea of using integers though ;)
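
To illustrate why the dictionary form matters for translators, the sketch below shows a translated template that mentions the values in a different order. The template `named_translated` is a made-up example, not an actual Pywikibot message.

```python
# Named placeholders, as suggested above.
named = u'Page %(title)s moved to %(newtitle)s'

# Hypothetical translated template that puts the new title first. With
# positional %s placeholders this reordering could not be expressed.
named_translated = u'%(newtitle)s is the new name of page %(title)s'

values = {'title': 'Foo', 'newtitle': 'Bar'}

english = named % values
reordered = named_translated % values
```

Both templates are filled from the same dictionary, so the calling code does not need to know the word order of the target language.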
Comment 4 Kunal Mehta (Legoktm) 2013-10-05 04:25:25 UTC
Logged In: YES 
user_id=855050
Originator: NO

A good idea; however, it should be implemented using the standard Python 'gettext' module, rather than reinventing the wheel.
Comment 5 Kunal Mehta (Legoktm) 2013-10-05 04:25:27 UTC
Logged In: YES 
user_id=880694
Originator: NO

Current status:

I have changed the text colorization system so that it is now possible to internationalize colorized strings.

There is now a branch called i18n which has gettext support. It works fine, and the selflink.py script can be run in German. There is a new config variable to set the UI language; it defaults to mylang and falls back to English if the chosen language is unsupported.

One disadvantage is that source code readability suffers. Before:

choice = wikipedia.inputChoice(u'\nWhat shall be done with this selflink?', ['unlink', 'make bold', 'skip', 'edit', 'more context'], ['U', 'b', 's', 'e', 'm'], 'u')

After:

choice = wikipedia.inputChoice(_(u'\nWhat shall be done with this selflink?'), [_('unlink'), _('make bold'), _('skip'), _('edit'), _('more context')], [_('u [unlink hotkey]'), _('b [make bold hotkey]'), _('s [skip hotkey]'), _('e [edit hotkey]'), _('m [more context hotkey]')], _('u [unlink hotkey]'))

valhallasw and I have discussed how to modify inputChoice() to make it less cluttered, but we didn't find a convincing solution. For example, inputChoice could be changed so that it works like this:

choice = wikipedia.inputChoice(_(u'What shall be done with this selflink?'), _(u'[u]nlink, make [b]old, [s]kip, [e]dit, [m]ore context'), 0)

so that it automatically parses the [brackets] to find out the hotkeys, and returns an integer for the option that was chosen. But integers make the code very hard to read and maintain when there are long if-elif constructions (... elif choice == 12 ...).

So, we currently don't know how to do it better. I think having full i18n is worth cluttering up the code a little bit.
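
The bracket-parsing idea discussed above could look roughly like the sketch below. The helpers `parse_options` and `choice_index` are hypothetical and only show how hotkeys might be read from a single translatable string; this is not the inputChoice implementation.

```python
import re


def parse_options(spec):
    """Split '[u]nlink, make [b]old' into (label, hotkey) pairs."""
    options = []
    for part in spec.split(','):
        part = part.strip()
        match = re.search(r'\[(.)\]', part)
        if match is None:
            raise ValueError('no hotkey marked in %r' % part)
        hotkey = match.group(1).lower()
        # Drop the brackets to recover the plain label.
        label = part.replace('[', '').replace(']', '')
        options.append((label, hotkey))
    return options


def choice_index(spec, answer):
    """Return the position of the option whose hotkey matches `answer`."""
    hotkeys = [hotkey for _label, hotkey in parse_options(spec)]
    return hotkeys.index(answer.lower())
```

Returning an index keeps the caller independent of the translated labels, at the cost of the readability problem with integer comparisons described above.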
Comment 6 Kunal Mehta (Legoktm) 2013-10-05 04:25:28 UTC
- **priority**: 5 --> 7
Comment 7 Kunal Mehta (Legoktm) 2013-10-05 04:25:30 UTC
Logged In: YES 
user_id=687283
Originator: NO

Take a look at my proposal at http://pywiki.pastey.net/71924 . Readability still is not too good; another way would be to create a new input function and let gettext see it as translatable strings. No idea if gettext can handle multiple parameters, but I assume it can handle at least two ;\)
Comment 8 Kunal Mehta (Legoktm) 2013-10-05 04:25:32 UTC
Logged In: YES 
user_id=687283
Originator: NO

Check http://svn.wikimedia.org/viewvc/pywikipedia/branches/pywikipedia/i18n/input_choice_proposal/ and see if you like it. I wrote a new function; the system is as follows:

retval = i18nChoice("Do you want to save?", "[('yes', 'y'), ('no', 'n')]", 'yes')

Instead of a key, the option's *name* is given as a parameter and returned.

With a Dutch translation, this displays:
Wilt u opslaan? ([j]a, [n]ee) n

then retval == 'no'

To use this, we need to use xgettext:
xgettext --keyword=i18nChoice:1,2 ic.py

and we get plural definitions:
msgid "Do you want to save?"
msgid_plural "[('yes', 'y'), ('no', 'n')]"
msgstr[0] "Wilt u opslaan?"
msgstr[1] "[('ja', 'j'), ('nee', 'n')]"

Maybe not the nicest solution, but the best one I could find. The alternative would be
retval = i18nChoice(_("Do you want to save?"), "[('yes', 'y'), ('no', 'n')]", 'yes')
with
xgettext --keyword=i18nChoice:2 ic.py
(and the corresponding changes in the function, of course.)
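
The behaviour described in this comment can be approximated as follows. This is a sketch, not the code from the linked branch: the name `i18n_choice` and the `translate` and `ask` parameters are assumptions, added so the example runs without gettext catalogs or a console.

```python
import ast


def i18n_choice(question, options, default, translate=lambda s: s, ask=input):
    """Ask a translated question and return the English option name."""
    english = ast.literal_eval(options)
    # The option list is translated as one string, then parsed again.
    localized = ast.literal_eval(translate(options))
    shown = ', '.join(name.replace(key, '[%s]' % key, 1)
                      for name, key in localized)
    answer = ask('%s (%s) ' % (translate(question), shown)).strip().lower()
    # Options are matched by position, so the translation must keep the order.
    for (name, _key), (_local_name, local_key) in zip(english, localized):
        if answer == local_key:
            return name
    return default
```

Because the English name is returned, calling code can keep comparing against 'yes' or 'no' regardless of the interface language.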
Comment 9 Kunal Mehta (Legoktm) 2013-10-05 04:25:34 UTC
Logged In: YES 
user_id=687283
Originator: NO

Update: Because of the limitations of the normal xgettext implementation, I am writing my own version, using the Python compiler package. With some luck, it will be possible to maintain the original inputChoice format this way (as there is no reason to use a string anymore; the string-to-translate can be generated from a function parameter that is a list, or a dict, or ...).

When this implementation is done, only three functions (wikipedia.input, wikipedia.output and wikipedia.inputChoice) need to be adapted (and possibly we need to change wikipedia.Error to translate the error).
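
A custom extractor of this kind might be sketched as below. Note the swap: this uses the standard `ast` module rather than the `compiler` package mentioned above, which is not available in Python 3. The function `extract_strings` and the set `TRANSLATABLE_CALLS` are hypothetical names.

```python
import ast

# Names of the calls whose string arguments should be collected (assumed).
TRANSLATABLE_CALLS = {'input', 'output', 'inputChoice'}


def extract_strings(source):
    """Collect string literals passed to the translatable calls."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Attribute):
            name = func.attr          # e.g. wikipedia.output
        else:
            name = getattr(func, 'id', None)  # e.g. plain output(...)
        if name not in TRANSLATABLE_CALLS:
            continue
        for arg in node.args:
            # Descend into lists and '%' expressions as well.
            for sub in ast.walk(arg):
                if isinstance(sub, ast.Constant) and isinstance(sub.value, str):
                    found.append(sub.value)
    return found
```

Because lists are walked too, the original inputChoice signature with separate option and hotkey lists could be kept, which is the benefit hoped for above.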
Comment 10 Kunal Mehta (Legoktm) 2013-10-05 04:25:35 UTC
Any news on this from the dev meeting in Berlin?
Comment 11 Kunal Mehta (Legoktm) 2013-10-05 04:25:37 UTC
Nice to have, but not a high priority. Possibly for the rewrite.
Comment 12 Kunal Mehta (Legoktm) 2013-10-05 04:25:39 UTC
- **labels**:  --> rewrite
- **priority**: 7 --> 2
Comment 13 Ricordisamoa 2014-04-16 07:19:17 UTC
Now we have pywikibot/i18n.
Actually, it is only used for things that appear on wikis themselves (e.g. edit summaries for scripts). Should we use it for log messages too?
Comment 14 xqt 2014-04-16 07:32:04 UTC
Maybe. We have i18n.input() for example and i18n/pywikibot.py for more generic messages. On the other hand, log files are used for debugging, so there are good reasons for them to be readable by developers, i.e. written in English.
Comment 15 pyfisch 2014-04-17 17:59:29 UTC
I think there are far more important things to do in Pywikibot than providing internationalized output and input. Most tools that are not aimed at end users are English-only. To translate Pywikibot into around 10 languages we would need many translators, which we do not have. Even wiki pages set up for translation are not always translated, and many of them are more important.
