Last modified: 2012-03-24 12:17:57 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in and, apart from displaying bug reports and their history, links might be broken. See T32611, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 30611 - WikiEditor localization for Arabic Wikipedia
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: WikiEditor
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Amir E. Aharoni
Keywords: i18n, patch, patch-reviewed
Depends on:
Blocks:
Reported: 2011-08-29 11:35 UTC by Zack
Modified: 2012-03-24 12:17 UTC (History)
CC List: 7 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
reordered Arabic and added Arabic extended (6.24 KB, patch)
2011-08-30 13:41 UTC, Amir E. Aharoni
Arabic screenshot, English UI (129.87 KB, image/png)
2011-08-30 16:44 UTC, Siebrand Mazeland
Arabic extended screenshot, English UI (138.08 KB, image/png)
2011-08-30 16:44 UTC, Siebrand Mazeland
an idea for a function with a dotted circle (1.27 KB, patch)
2011-08-31 17:01 UTC, Amir E. Aharoni
better implementation of the dottedCircleWithDiacritic function (6.97 KB, patch)
2011-08-31 20:22 UTC, Amir E. Aharoni

Description Zack 2011-08-29 11:35:56 UTC
Hi. I would like to get some information regarding the Editbar and how to customize it for Arabic Wikipedia usage. It's merely translated and we still need to work on localization, namely reorganizing the Special characters section, and replacing icons presented in Latin symbols with Arabic ones - that is B for bold, etc.
Comment 1 Roan Kattouw 2011-08-29 12:03:59 UTC
For a guide to creating localized toolbar icons, see https://secure.wikimedia.org/wikipedia/usability/wiki/Text_format_icons . Once you've created some, post a link here and we'll put them in.

As for reorganizing the special characters section -- why would you want to do that? I don't believe any other language community asked us to.
Comment 2 Zack 2011-08-29 12:34:03 UTC
Thanks..

Apart from that, I'm requesting reprioritizing this section because any ar-WP editor would evidently prefer for the Arabic part to be on top of the section, and not the Latin part. I also wish we could add diacritic symbols (harakat) wish are more likely to be used than the five additional letters, i.e. p, ch, zh, g, and ng respectively.
Comment 3 Zack 2011-08-29 12:35:48 UTC
correct * which...
Comment 4 Roan Kattouw 2011-08-29 12:36:33 UTC
Adding additional characters to the special characters section is easy, just tell me which ones you want added (I'll need you to give me the literal characters). I'm not sure about reprioritization, I'd have to look at the code.
Comment 5 Amir E. Aharoni 2011-08-29 12:57:32 UTC
I suppose that by "p, ch, zh, g, and ng", you refer to letters which are not used in Arabic itself, but in Persian.

Currently, these are the last letters in the Arabic section of jquery.wikiEditor.toolbar.config.js: "\u067e", "\u0686", "\u0698", "\u06af", "\u06ad".

I see no reason to put them in front of the regular Arabic letters. Rather, a new section can be created for them. The precedent for it is that there are three groups for Latin:
* "Latin", which includes the most common special characters of European languages;
* "Latin extended", which includes the more exotic characters for languages such as Vietnamese;
* the super-exotic "IPA".

There are many more special Arabic characters than "p, ch, zh, g, and ng". There are special characters for Urdu, Pashto, Sindhi etc. All of them can go to an "Arabic Extended" section, which would be useful for people who have a regular Arabic keyboard and need to type names in Urdu, Persian etc.
Comment 6 Zack 2011-08-29 13:17:20 UTC
@Amir
I didn't say we should put them in front; I said diacritics are used more, which is why we should add them. I like your idea about the new Arabic extended section, and I hope to see it working sometime soon.

@Roan
Characters with Hex. NCR as follows:
َ - &#x064E;
ُ - &#x064F;
ِ - &#x0650;
ً - &#x064B;
ٌ - &#x064C;
ٍ - &#x064D;
ْ - &#x0652;
ّ - &#x0651;

Please consider this more appropriate ordering:
ابتثجحخدذرزسشصضطظعغفقكلمنهوي ءآأؤإئىة َ ُ ِ ً ٌ ٍ ْ ّ ،؛؟ پچژڤڭگ
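
The literal characters requested above have to end up as "\u…" escapes in the configuration file, in the form quoted in comment 5. As a rough illustration only, a small helper could do the conversion. The function name `toUnicodeEscapes` is hypothetical and not part of WikiEditor, and the sketch only handles characters in the Basic Multilingual Plane, which covers the Arabic block.

```javascript
// Hypothetical helper (not part of WikiEditor): turns literal characters
// into the "\uXXXX" escape text used in the character arrays.
// Assumes every character is in the Basic Multilingual Plane.
function toUnicodeEscapes(text) {
	return Array.from(text).map(function (ch) {
		// Four lowercase hex digits, zero-padded, prefixed with a literal "\u"
		return '\\u' + ch.codePointAt(0).toString(16).padStart(4, '0');
	});
}

// The five additional letters discussed in comment 5
console.log(toUnicodeEscapes('پچژگڭ').join(', '));
```

Spaces in the requested ordering would be converted too, so they need to be removed from the input first if only the letters and signs are wanted.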
Comment 7 Roan Kattouw 2011-08-29 18:55:07 UTC
(In reply to comment #6)
> @Roan
> Characters with Hex. NCR as follows:
> َ - &#x064E;
> ُ - &#x064F;
> ِ - &#x0650;
> ً - &#x064B;
> ٌ - &#x064C;
> ٍ - &#x064D;
> ْ - &#x0652;
> ّ - &#x0651;
> 
> Please consider this more appropriate ordering:
> ابتثجحخدذرزسشصضطظعغفقكلمنهوي ءآأؤإئىة َ ُ ِ ً ٌ ٍ ْ ّ ،؛؟ پچژڤڭگ
Amir, since you've inserted yourself into this discussion (thank you for reading my mind), could you write up a patch for this?
Comment 8 Amir E. Aharoni 2011-08-30 13:41:18 UTC
Created attachment 8987
reordered Arabic and added Arabic extended

Split the Arabic section in jquery.wikiEditor.toolbar.config.js into Arabic and Arabic extended. In Arabic i put the core 28-letter alphabet, special letters for the Arabic language, vowels, punctuation and digits. In "Arabic extended" i put most of the other letters and signs that are used by languages such as Arabic, Urdu, Balochi etc.

I added the message 'wikieditor-toolbar-characters-page-arabicextended'.

In the character arrays i added comments that group characters by similarity to a basic Arabic letter, to make maintenance easier. I hope that it's OK.
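
For readers without access to the attachment, the split described in this comment might look roughly like the sketch below. This is not the patch itself: the object name `characterPages`, the `labelMsg` key, the message key for the plain Arabic page and the abbreviated character lists are assumptions made for illustration. Only the 'wikieditor-toolbar-characters-page-arabicextended' message name and the five extended code points are taken from this thread.

```javascript
// Illustrative sketch only; the real layout of
// jquery.wikiEditor.toolbar.config.js may differ.
const characterPages = {
	arabic: {
		// Assumed message key, modelled on the one named in this comment
		labelMsg: 'wikieditor-toolbar-characters-page-arabic',
		characters: [
			// Core alphabet (first four letters only, for brevity)
			'\u0627', '\u0628', '\u062a', '\u062b',
			// Vowel diacritics (harakat) requested in comment 6
			'\u064e', '\u064f', '\u0650', '\u064b', '\u064c', '\u064d', '\u0652', '\u0651',
			// Punctuation: comma, semicolon, question mark
			'\u060c', '\u061b', '\u061f'
		]
	},
	arabicextended: {
		labelMsg: 'wikieditor-toolbar-characters-page-arabicextended',
		characters: [
			// Based on beh
			'\u067e',
			// Based on jeem
			'\u0686',
			// Based on reh
			'\u0698',
			// Based on kaf
			'\u06af', '\u06ad'
		]
	}
};
```

Grouping the extended letters under the basic letter they resemble is the maintenance convention mentioned above; the grouping comments here are only an approximation of it.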
Comment 9 Zack 2011-08-30 16:06:21 UTC
That's great, Amir. Thank you both very much. I'll see how it goes and keep you posted.
Comment 10 Siebrand Mazeland 2011-08-30 16:29:45 UTC
Attachment 8987 from comment 8 was applied in r95790. You can test it at https://translatewiki.net where it will go live in a few seconds. Please let us know if this is not done as expected. I will tag the revision for backporting to 1.18 and 1.17wmf.
Comment 11 Siebrand Mazeland 2011-08-30 16:44:19 UTC
Created attachment 8988
Arabic screenshot, English UI

I've updated the code and taken a look at the result. What I see are a few empty character cells. If I click them, something *is* inserted in my edit window. Is this a font issue on my side, or a more generic issue?

OS: OSX 10.7, Firefox 6.
Comment 12 Siebrand Mazeland 2011-08-30 16:44:40 UTC
Created attachment 8989
Arabic extended screenshot, English UI
Comment 13 Zack 2011-08-30 17:01:58 UTC
Empty cells must be diacritics. If they aren't visible, try using the Dotted Circle character (◌) as a carrier; it worked for Hebrew niqqud.

List of Ext. Arabic: http://people.w3.org/rishida/scripts/pickers/arabic-block/
Comment 14 Amir E. Aharoni 2011-08-31 17:01:52 UTC
Created attachment 8994
an idea for a function with a dotted circle

Zack, thank you for making me notice the dotted circle use in Hebrew. In fact, it's not implemented correctly for Hebrew. The current code for Hebrew says [ "\u05b0\u25cc", "\u05b0" ], where \u05b0 is a vowel diacritic ("niqqud") and \u25cc is the dotted circle. The dotted circle is supposed to come before the diacritic sign, not after it.

The dotted circle is used a lot in the Hebrew section (incorrectly). It is also used (correctly) in several sections for Indian languages, such as Sinhala and Gujarati. And it will be useful for Arabic and more languages. Instead of repeating it all the time, maybe it can be factored out to a function?

I wrote this proof of concept function and i am attaching it as a patch. It's only for testing, not for committing. I'm not much of a JS guru - i didn't know what would be the best place to put it, so i just put it in the beginning of the file. It works for me, but i've got a hunch that there's a better location for it.

Its logic can also be more clever - for example, it can take an array of characters and return all the needed diacritics at once.
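
A minimal sketch of the idea in this comment, not the attached patch: the helper name `withDottedCircle` and the constant name are hypothetical. It assumes, as in the Hebrew example quoted above, that a two-element entry is a [label, inserted text] pair, and it puts the dotted circle before the diacritic in the label.

```javascript
// U+25CC DOTTED CIRCLE, a visible carrier for combining marks
const DOTTED_CIRCLE = '\u25cc';

// Hypothetical helper: builds a [label, inserted text] pair.
// The label shows the diacritic on a dotted circle;
// the inserted text is the bare diacritic.
function withDottedCircle(diacritic) {
	return [DOTTED_CIRCLE + diacritic, diacritic];
}

// Hebrew sheva (U+05B0) and Arabic fatha (U+064E)
const sheva = withDottedCircle('\u05b0');
const fatha = withDottedCircle('\u064e');
```

With such a helper the ordering of circle and diacritic is decided in one place instead of being repeated in every entry.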
Comment 15 Siebrand Mazeland 2011-08-31 17:16:43 UTC
Re comment 14: Amir, the Arabic Extended option has already been added. I think your patch may be based on another version than trunk?
Comment 16 Siebrand Mazeland 2011-08-31 17:24:06 UTC
Some quick notes of review I requested on IRC:
RoanKattouw:
+                // The core 28-letter alphabet, special letters for the Arabic language,
[5:15p] RoanKattouw: Use tabs not spaces for indentation
[5:15p] RoanKattouw: "\u0627", "\u0628",	"\u062a",
[5:15p] Krinkle: I'd recommend using [diacritic, dottedCircle(diacritic)]
[5:15p] RoanKattouw: Random tab in the middle of a line
[5:15p] Krinkle: eh, the other way around of course
[5:15p] Krinkle: ie. not let it return an array
[5:16p] Krinkle: To be more flexible. Otherwise rename the function
[5:16p] RoanKattouw: siebrand: Patch looks fine otherwise
Comment 17 Amir E. Aharoni 2011-08-31 20:22:35 UTC
Created attachment 9000
better implementation of the dottedCircleWithDiacritic function

(See comment 14 for the general description.)

Description:
1. Created dottedCircleWithDiacritic function in the closure.

2. Changed arabic, arabic extended and hebrew sections to work with the function.

Other comments:
1. If this is fine, other sections that use \u25cc should use this function, too. Currently it's sinhala and gujarati, and it's useful for more languages.

2. Is there a nice way to program this function to accept an array of characters and return an array of sequences, so it won't have to be repeated so many times?
Comment 18 Zack 2011-09-10 15:06:20 UTC
(In reply to comment #4)
> characters). I'm not sure about reprioritization, I'd have to look at the code.
Any good news? Or else, is it possible to display the Arabic section by default, similarly to the use of # in wikilinks?
Comment 19 Niklas Laxström 2011-09-10 15:26:20 UTC
(In reply to comment #17)
> 2. Is there a nice way to program this function to accept an array of
> characters and return an array of sequences, so it won't have to be repeated so
> many times?

Yes, but as far as I know it is not possible to do [1, 2, bar(3, 4), 5] and have [1, 2, 3x, 4x, 5] as the output instead of [1, 2, [3x, 4x], 5]. Perhaps some kind of special marker and then post processing the table?
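
One way to read the "special marker and post-processing" suggestion is sketched below. The names `diacritics`, `expandWithDottedCircle` and `expandCharacters` are hypothetical and do not come from the WikiEditor code. The helper wraps a list of diacritics in a marker object, and a separate pass over the table replaces each marker with its expanded [label, inserted text] pairs.

```javascript
const DOTTED_CIRCLE = '\u25cc';

// Hypothetical marker wrapper: tags a list of diacritics for later expansion.
function diacritics(chars) {
	return { expandWithDottedCircle: chars };
}

// Post-processing pass: ordinary entries are copied as they are,
// marked entries are expanded in place into [label, inserted text] pairs.
function expandCharacters(table) {
	const result = [];
	for (const entry of table) {
		const isMarker = entry !== null && typeof entry === 'object' &&
			!Array.isArray(entry) && Array.isArray(entry.expandWithDottedCircle);
		if (isMarker) {
			for (const d of entry.expandWithDottedCircle) {
				result.push([DOTTED_CIRCLE + d, d]);
			}
		} else {
			result.push(entry);
		}
	}
	return result;
}

// Two letters, two diacritics and a question mark become five flat entries
const page = expandCharacters(['\u0627', '\u0628', diacritics(['\u064e', '\u064f']), '\u061f']);
```

In JavaScript engines that support spread syntax, writing `...helper(list)` inside the array literal gives the same flat result without a post-processing pass; that syntax was not available when this comment was written.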
Comment 20 Zack 2011-09-10 15:32:12 UTC
(In reply to comment #17)

Some characters appear as some sort of indistinguishable dashed circles (e.g. ؠ) while they should be rendered as hex code rectangular boxes (e.g. ݹ)..
Comment 21 Amir E. Aharoni 2012-03-20 18:24:39 UTC
Hi,

What remains to be done here?

1. The broken characters can currently be fixed by installing a good font (for example http://www.amirifont.org/ ) and using a browser that supports fallback fonts well. We are already working on making this font automatically available as a web font, but until we fix some issues, it must be installed manually.

2. The vowels are inserted correctly according to my tests.

3. If you want to customize the icons such as bold, italic, etc., let us know which icons you want to use, according to Roan's comment #1.

Anything else?
Comment 22 Sumana Harihareswara 2012-03-20 18:31:25 UTC
marking patch reviewed.
Comment 23 Zack 2012-03-20 22:04:40 UTC
Well, in the Arabic extended section, Windows renders characters in different fonts; I have characters appearing in Courier New and Amiri.

Another thing, is it possible to include a zero-width joiner for example, in the '''bold'''/''italic'' notation in order to join Arabic characters in bold/italic with those in normal font weight?
Comment 24 Amir E. Aharoni 2012-03-20 22:08:18 UTC
(In reply to comment #23)
> Well, in the Arabic extended section, Windows renders characters in different
> fonts; I have characters appearing in Courier New and Amiri.

Yes, currently fonts may get mixed up even if you have a good font installed. We are working on it separately.

> Another thing, is it possible to include a zero-width joiner for example, in
> the '''bold'''/''italic'' notation in order to join Arabic characters in
> bold/italic with those in normal font weight?

It's possible to add ZWJ and ZWNJ, although i didn't understand how it is related to bold/italic.
Comment 26 Amir E. Aharoni 2012-03-24 10:57:57 UTC
Added ZWJ and ZWNJ in https://gerrit.wikimedia.org/r/#change,3681 .

Anything else, or can i close this?

(The fonts problem will be solved separately.)
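
Because ZWJ (U+200D) and ZWNJ (U+200C) are invisible, a toolbar entry for them needs a visible label. The sketch below shows one way such entries could be written as [label, inserted text] pairs. It is an assumption for illustration, not the content of the Gerrit change, and the array name `invisibleCharacters` and the label texts are hypothetical.

```javascript
// Hypothetical entries: a visible label paired with the invisible character
const invisibleCharacters = [
	// U+200C ZERO WIDTH NON-JOINER
	['ZWNJ', '\u200c'],
	// U+200D ZERO WIDTH JOINER
	['ZWJ', '\u200d']
];
```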
Comment 27 Zack 2012-03-24 11:36:02 UTC
It's okay now. Many thanks.
