Last modified: 2013-07-06 14:56:51 UTC
Hello, word-ending links do not work on or.wikipedia. They only work with English characters, e.g. [[ଓଡିଆ]]n, but not with Odia characters, e.g. [[ଓଡିଆ]]ପ. The same happens on hn.wikipedia with Hindi characters: [[चेन्नई]]क [[चेन्नई]]अ [[चेन्नई]]कि ....
Sorry, it's hi.wikipedia, not hn.wiki.
I was able to fix this on my local MediaWiki installation. I have tried for _hours_ now to get git/gerrit properly set up to make a patch or whatever, but get a cavalcade of various error messages and am now officially giving up.
The fixes I have made are these:
In languages/messages/MessagesHi.php, change line 176 to this:
    $linkTrail = "/^([a-zऀ-ॿ]+)(.*)$/sDu";
In languages/messages/MessagesOr.php, add this below line 33:
    $linkTrail = "/^([a-zଁ-୷]+)(.*)$/sDu";
These fixes may not account for punctuation and/or other symbols that should not be included in the linktrail. For Hindi I used the characters in the standard Devanagari Unicode block, and the same for Oriya. I am not sure which symbols are punctuation, so this may not be correct. But it's better than nothing, I think (hope).
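To illustrate how the two patterns above behave, here is a minimal sketch in Python rather than MediaWiki's PHP. It assumes the link trail is matched against the text that immediately follows the closing `]]`. The names `split_trail`, `ASCII_LINK_TRAIL`, `HI_LINK_TRAIL` and `OR_LINK_TRAIL` are hypothetical, the ranges are written as code point escapes (U+0900 to U+097F and U+0B01 to U+0B77, the same endpoints as in the fixes), and `\Z` stands in for PCRE's `D` modifier.

```python
import re

# ASCII-only trail: only a-z after the link is pulled into the link text.
ASCII_LINK_TRAIL = re.compile(r"^([a-z]+)(.*)\Z", re.DOTALL)

# Same shape as the proposed fixes, with the ranges spelled as escapes.
# U+0900..U+097F is the Devanagari block; U+0B01..U+0B77 lies in the Oriya block.
HI_LINK_TRAIL = re.compile(r"^([a-z\u0900-\u097F]+)(.*)\Z", re.DOTALL)
OR_LINK_TRAIL = re.compile(r"^([a-z\u0B01-\u0B77]+)(.*)\Z", re.DOTALL)


def split_trail(pattern, text_after_link):
    """Return (trail, rest): the part absorbed into the link and what is left."""
    match = pattern.match(text_after_link)
    if match is None:
        # No trail characters directly after the link.
        return "", text_after_link
    return match.group(1), match.group(2)


# With the ASCII-only pattern the Odia suffix is not part of the link.
print(split_trail(ASCII_LINK_TRAIL, "ପ"))     # ('', 'ପ')
# With the extended ranges the suffix becomes the trail.
print(split_trail(OR_LINK_TRAIL, "ପ"))        # ('ପ', '')
print(split_trail(HI_LINK_TRAIL, "कि है"))    # ('कि', ' है')
# Caveat from the comment above: punctuation inside the block is matched too.
print(split_trail(HI_LINK_TRAIL, "।"))        # ('।', '')
```

Python's `re` treats `str` patterns as Unicode by default, which plays the role of the trailing `u` in the PHP patterns. The last call shows why a whole-block range is a rough approximation: the danda (U+0964) falls inside the Devanagari block and is swallowed into the link.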
Clarified summary. As a coincidence, I've just updated the docs on linktrail at [[m:Help:Links]] and [[mw:Help:Links#linktrail]]. Where did you look, or where would you look, for this sort of info to understand how it works and how it can be localised? This feature is very poorly documented...
(In reply to comment #3)
> Clarified summary.
>
> As a coincidence, I've just updated docs on linktrail at [[m:Help:Links]] and
> [[mw:Help:Links#linktrail]], where did you/would you look for this sort of info
> to understand how it works and how it can be localised? This feature is very
> poorly documented...
It is a line in every languages/messages/MessagesXx.php file, starting with $linkTrail. It is defined as a simple regex, like in the fixes I posted above. For Unicode languages, you have to add the minuscule "u" at the end of the regex.
(In reply to comment #4)
> It is a line in every languages/messages/MessagesXx.php file, starting with
> $linkTrail. It is defined as a simple regex, like in the fixes I posted above.
> For unicode languages, you have to add the minuscule "u" at the end of the
> regex.
Yes, I know this, but I'd like ansuman to give us some suggestions as to where to put such info so that users find it.
(In reply to comment #2)
> I was able to fix this on my local MediaWiki installation. I have tried for
> _hours_ now to get git/gerrit properly set up to make a patch or whatever, but
> get a cavalcade of various error messages and am now officially giving up.
>
> The fixes I have made are these:
>
> In languages/messages/MessagesHi.php, change Line 176 to this:
>
> $linkTrail = "/^([a-zऀ-ॿ]+)(.*)$/sDu";
>
> In languages/messages/MessagesOr.php, add this below Line 33:
>
> $linkTrail = "/^([a-zଁ-୷]+)(.*)$/sDu";
>
>
> These fixes may not account for punctuation and/or other symbols that should
> not be included in the linktrail. For Hindi I used the characters in the
> standard Devanagari Unicode block, and the same for Oriya. I am not sure which
> symbols are punctuation, so this may not be correct. But it's better than
> nothing, I think (hope).
These fixes did not work for me. Is there anything else that needs to be done?
(In reply to comment #5)
> (In reply to comment #4)
> > It is a line in every languages/messages/MessagesXx.php file, starting with
> > $linkTrail. It is defined as a simple regex, like in the fixes I posted above.
> > For unicode languages, you have to add the minuscule "u" at the end of the
> > regex.
>
> Yes, I know this, but I'd like ansuman to give us some suggestions as where to
> put such info so that users find it.
Niklas says it is intentionally kept this way, so that people have to request changes and we don't break things.
So I guess this needs a volunteer to take the fix from comment 2 and put it into Gerrit. See http://www.mediawiki.org/wiki/Developer_access for anybody interested.
(In reply to comment #5)
> Yes, I know this, but I'd like ansuman to give us some suggestions as where
> to
> put such info so that users find it.
I am not sure how you want me to give suggestions, so let me elaborate on what I think you mean. It's simple: in Odia we attach a "verb", "adverb" or "preposition" to nouns, e.g. [[ଓଡ଼ିଆ]]ରେ, [[ଭାଷା]]ଗୁଡିକୁ, and often write them together as one word. The latter part does not get linked along with the first part, so the word looks odd because it appears in two different colors. Is this what you wanted to know, Nemo?
Related URL: https://gerrit.wikimedia.org/r/65653 (Gerrit Change Ib1b233d227f33e77c212e67eee2aea64357e55ba)
(In reply to comment #9)
> Related URL: https://gerrit.wikimedia.org/r/65653 (Gerrit Change
> Ib1b233d227f33e77c212e67eee2aea64357e55ba)
The patch adds to linktrail *all* the characters listed in:
http://www.unicode.org/charts/PDF/U0900.pdf
http://www.unicode.org/charts/PDF/UA8E0.pdf
This includes, for instance, "। DEVANAGARI DANDA" and "॥ DEVANAGARI DOUBLE DANDA" ("Generic punctuation for scripts of India").
ansuman, is it ok? Please check those two PDFs, otherwise we'll assume it's fine.
(In reply to comment #10)
> This includes, for instance, "। DEVANAGARI DANDA" and "॥ DEVANAGARI DOUBLE
> DANDA" ("Generic punctuation for scripts of India").
Excluded danda characters in latest patchset.
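One way to exclude the two danda characters is to split the Devanagari range around U+0964 and U+0965. The following is only a sketch of that idea in Python, not the character class from the Gerrit patchset. The names `HI_LINK_TRAIL_NO_DANDA` and `split_trail` are hypothetical, and the inclusion of the Devanagari Extended range U+A8E0 to U+A8FF is an assumption based on the two charts mentioned above.

```python
import re

# Devanagari block with a gap for U+0964 (danda) and U+0965 (double danda),
# plus Devanagari Extended (U+A8E0..U+A8FF). This class is an illustration only.
HI_LINK_TRAIL_NO_DANDA = re.compile(
    r"^([a-z\u0900-\u0963\u0966-\u097F\uA8E0-\uA8FF]+)(.*)\Z",
    re.DOTALL,
)


def split_trail(text_after_link):
    """Return (trail, rest) for the text that follows a closing ]]."""
    match = HI_LINK_TRAIL_NO_DANDA.match(text_after_link)
    if match is None:
        return "", text_after_link
    return match.group(1), match.group(2)


# The suffix is absorbed into the link, but the danda ends the trail.
print(split_trail("कि। आगे"))   # ('कि', '। आगे')
# A danda directly after the link is left outside it.
print(split_trail("॥"))          # ('', '॥')
```

Splitting the range keeps the pattern a single character class, at the cost of having to list every further excluded code point by hand.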
Hi, apologies for the late response. Yes, it's working on Odia Wikipedia now, though I haven't checked all the characters or other languages. Thanks a lot Nemo, Santhosh T., Srikanth L., Jon, Andre. :)
(In reply to comment #10)
> (In reply to comment #9)
> > Related URL: https://gerrit.wikimedia.org/r/65653 (Gerrit Change
> > Ib1b233d227f33e77c212e67eee2aea64357e55ba)
>
> The patch adds to linktrail *all* the characters listed in:
> http://www.unicode.org/charts/PDF/U0900.pdf
> http://www.unicode.org/charts/PDF/UA8E0.pdf
> This includes, for instance, "। DEVANAGARI DANDA" and "॥ DEVANAGARI DOUBLE
> DANDA" ("Generic punctuation for scripts of India").
>
> ansuman, is it ok? Please check those two PDF, otherwise we'll assume it's
> fine.
Those PDFs cover only the Devanagari script. Anyway, it's working for Odia Wikipedia. Thanks.