Last modified: 2014-11-14 03:44:06 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T58524, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 56524 - pywikibot transliteration should support chinese transliteration
Status: REOPENED
Product: Pywikibot
Classification: Unclassified
Component: General
Version: unspecified
Hardware: All All
Importance: Lowest enhancement
Target Milestone: ---
Assigned To: Pywikipedia bugs
Reported: 2013-11-02 23:28 UTC by [no longer active user]
Modified: 2014-11-14 03:44 UTC (History)


Attachments
Python translation of https://github.com/axgle/pinyin (8.06 KB, text/x-python)
2014-08-31 06:28 UTC, zhuyifei1999

Description [no longer active user] 2013-11-02 23:28:22 UTC
https://github.com/wikimedia/pywikibot-core/blob/master/pywikibot/userinterfaces/transliteration.py should support more scripts, such as Korean, Chinese or Malayalam (ml). The jQuery.ime transliteration keyboard rules at https://github.com/wikimedia/jquery.ime/tree/master/rules can be used as a basis for developing it, for example https://github.com/wikimedia/jquery.ime/blob/master/rules/ml/ml-transliteration.js

I am using the output of that code in my gadget http://www.wikidata.org/wiki/MediaWiki:Gadget-SimpleTransliterate.js (http://commons.wikimedia.org/wiki/File:Wikidata_Transliteration_Gadget.png), which is why I would like it to be developed a little further.
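
For illustration, the kind of table-driven transliteration requested here can be sketched as follows. This is only a minimal sketch in Python 3: `SAMPLE_TABLE` and `transliterate` are hypothetical names, the table holds a handful of hand-picked characters, and none of it is taken from transliteration.py or from the jQuery.ime rules.

```python
# Hypothetical sample table (an assumption for illustration only);
# a real table would need thousands of entries per script.
SAMPLE_TABLE = {
    "中": "zhong",  # Chinese, Hanyu Pinyin without tone marks
    "文": "wen",
    "한": "han",    # Korean, Revised Romanization
    "글": "geul",
    "മ": "ma",      # Malayalam
}


def transliterate(text, table=SAMPLE_TABLE, default="?"):
    """Return text with each non-ASCII character replaced by its table entry.

    ASCII characters pass through unchanged; characters missing from the
    table are replaced by `default`.
    """
    out = []
    for char in text:
        if ord(char) < 128:
            out.append(char)
        else:
            out.append(table.get(char, default))
    return "".join(out)


print(transliterate("中文 wiki"))
```

A per-character lookup like this cannot handle characters with several readings, which is one reason a simple one-to-one table is only an approximation for Chinese.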
Comment 2 [no longer active user] 2013-11-03 21:03:44 UTC
I've added Malayalam, Gurmukhi, Gujarati and Oriya to my gadget: http://www.wikidata.org/w/index.php?title=MediaWiki%3AGadget-SimpleTransliterate.js&diff=83565215&oldid=83539622
Comment 3 [no longer active user] 2013-11-14 11:09:06 UTC
This can be ported for Chinese support: http://cpansearch.perl.org/src/KAWASAKI/Lingua-ZH-Romanize-Pinyin-0.23/
Comment 4 Gerrit Notification Bot 2013-11-22 17:15:52 UTC
Change 97040 had a related patch set uploaded by Ladsgroup:
Improving transliteration support

https://gerrit.wikimedia.org/r/97040
Comment 5 Gerrit Notification Bot 2013-11-22 17:19:53 UTC
Change 97044 had a related patch set uploaded by Ladsgroup:
Improving transliteration support

https://gerrit.wikimedia.org/r/97044
Comment 6 Gerrit Notification Bot 2013-11-25 20:23:51 UTC
Change 97040 merged by jenkins-bot:
Improving transliteration support

https://gerrit.wikimedia.org/r/97040
Comment 7 Gerrit Notification Bot 2013-11-25 20:27:40 UTC
Change 97044 merged by jenkins-bot:
Improving transliteration support

https://gerrit.wikimedia.org/r/97044
Comment 8 Amir Ladsgroup 2013-11-26 06:48:04 UTC
Both patches got merged.
Comment 9 [no longer active user] 2013-11-26 07:23:02 UTC
Reopened for Chinese transliteration
Comment 10 Amir Ladsgroup 2013-11-26 09:40:46 UTC
Can you give me a list of the Chinese characters that need to be added?
Comment 11 Andre Klapper 2013-11-26 10:24:09 UTC
(For future reference, defining the exact scripts to be supported in a bug request is welcome. If it's just about "support more", then a report can easily become unfixable as comments broaden its scope.)
Comment 12 [no longer active user] 2013-11-26 10:43:28 UTC
#c3
Comment 13 Amir Ladsgroup 2013-11-26 10:51:18 UTC
I checked that source but I couldn't find the dictionary file. [1] says there is a file named CTLauBig5.tit, but there isn't one. Can you be more precise about the dictionary?

[1] http://cpansearch.perl.org/src/KAWASAKI/Lingua-ZH-Romanize-Pinyin-0.23/lib/Lingua/ZH/Romanize/DictZH.pm
Comment 14 [no longer active user] 2013-11-26 11:02:12 UTC
There is not a one-to-one "dictionary" there, that is why I CCd original writer of the transliteration. Also have a look at https://github.com/axgle/pinyin
Comment 15 zhuyifei1999 2014-08-31 06:28:27 UTC
Created attachment 16327 [details]
Python translation of https://github.com/axgle/pinyin

(In reply to [no longer active user] from comment #14)
> There is not a one-to-one "dictionary" there, that is why I CCd original
> writer of the transliteration. Also have a look at
> https://github.com/axgle/pinyin

{{done}} Translated the four scripts to Python. See attachment.
Comment 16 zhuyifei1999 2014-08-31 10:31:19 UTC
> Created attachment 16327 [details]
> Python translation of https://github.com/axgle/pinyin

However, there is a bug that causes 6651 Chinese characters to be transliterated as 'zuo'.
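
One way to spot this kind of problem is to count how many characters share each transliteration and flag values that occur suspiciously often. The sketch below is only an illustration in Python 3: `TOY_TABLE` is made-up data with deliberately wrong entries, not the attached table, and `suspicious_values` is a hypothetical helper name.

```python
from collections import Counter

# Made-up toy data (an assumption): two entries are deliberately wrong
# to imitate a bug that assigns the same syllable to unrelated characters.
TOY_TABLE = {
    "做": "zuo",  # correct reading
    "中": "zuo",  # wrong on purpose, should be "zhong"
    "文": "zuo",  # wrong on purpose, should be "wen"
    "人": "ren",
}


def suspicious_values(table, threshold):
    """Return {transliteration: count} for values used at least `threshold` times."""
    counts = Counter(table.values())
    return {value: n for value, n in counts.items() if n >= threshold}


print(suspicious_values(TOY_TABLE, 3))
```

Many characters legitimately share a syllable in Pinyin, so a high count is only a hint; the threshold has to be chosen with the size of the real table in mind.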
Comment 17 Merlijn van Deen (test) 2014-08-31 11:01:34 UTC
I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.
Comment 18 Gerrit Notification Bot 2014-08-31 11:49:45 UTC
Change 157498 had a related patch set uploaded by Zhuyifei1999:
Improving transliteration support for Chinese

https://gerrit.wikimedia.org/r/157498
Comment 19 zhuyifei1999 2014-08-31 12:18:34 UTC
(In reply to Merlijn van Deen from comment #17)
> I suppose we can add this, but what's the intended use case? We support full
> unicode console output (and input, but transliteration is output-only) on
> all systems.

Yes, that is indeed a hard question. Why do we still have transliteration.py?
Comment 20 Gerrit Notification Bot 2014-09-01 10:32:15 UTC
Change 157498 merged by jenkins-bot:
Improving transliteration support for Chinese

https://gerrit.wikimedia.org/r/157498
Comment 21 John Mark Vandenberg 2014-10-07 08:14:39 UTC
Reopen if there is more to be done.
Comment 22 zhuyifei1999 2014-10-07 08:37:57 UTC
(In reply to John Mark Vandenberg from comment #21)
> Reopen if there is more to be done.

zh-hant (Traditional Chinese) needed.
Comment 23 Merlijn van Deen (test) 2014-10-07 09:10:10 UTC
I will repeat my question:

> I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.

/why/ is it needed?
Comment 24 zhuyifei1999 2014-10-18 15:43:59 UTC
(In reply to Merlijn van Deen from comment #23)
> I will repeat my question:
> 
> > I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.
> 
> /why/ is it needed?

I suppose that when the output is somehow ASCII-limited (for some reason, the log files written by the grid engine on Tool Labs are an example of this), transliterated output could be more useful than a pile of question marks or other unreadable codes.
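
The difference can be sketched in Python 3 with a custom codec error handler. This is a minimal sketch under stated assumptions: the handler name `translit_sketch`, the function `to_ascii` and the two-entry `SAMPLE_TABLE` are hypothetical and are not how pywikibot itself is wired up; only `codecs.register_error` and `str.encode` are standard library calls.

```python
import codecs

# Hypothetical two-entry table, for illustration only.
SAMPLE_TABLE = {"中": "zhong", "文": "wen"}


def _transliterate_errors(exc):
    """Codec error handler: replace unencodable characters via SAMPLE_TABLE."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    # Characters missing from the table fall back to a question mark.
    replacement = "".join(SAMPLE_TABLE.get(ch, "?") for ch in bad)
    return replacement, exc.end


# Register the handler under a made-up name so str.encode can use it.
codecs.register_error("translit_sketch", _transliterate_errors)


def to_ascii(text):
    """Return an ASCII-only version of text, transliterating where possible."""
    return text.encode("ascii", errors="translit_sketch").decode("ascii")


print("中文.log".encode("ascii", "replace"))  # built-in fallback
print(to_ascii("中文.log"))                    # transliterating fallback
```

With the built-in `replace` handler every Chinese character collapses into a question mark, while the transliterating handler keeps the text roughly readable in an ASCII-only log.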
Comment 25 zhuyifei1999 2014-11-13 13:14:33 UTC
Some transliteration data exist at https://www.wikidata.org/wiki/MediaWiki:Gadget-SimpleTransliterate.js
