Last modified: 2014-10-08 14:46:49 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T59628, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 57628 - VisualEditor: Get rid of ve.splitClusters
Status: RESOLVED FIXED
Product: VisualEditor
Classification: Unclassified
Component: Technical Debt
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: VE-deploy-2014-10-09
Assigned To: Sucheta Ghoshal
Depends on:
Blocks:
Reported: 2013-11-26 22:59 UTC by Roan Kattouw
Modified: 2014-10-08 14:46 UTC (History)
CC List: 6 users

Description Roan Kattouw 2013-11-26 22:59:13 UTC
ve.splitClusters has a comment that says: TODO: strip out calls to splitClusters then delete this method.
Comment 1 Sucheta Ghoshal 2014-10-07 15:44:46 UTC
Well, if we are going beyond BMP, we need to take surrogate pairs into account, too. JavaScript does not do that natively, AFAIK. Would code points be enough in that case?
Comment 2 Roan Kattouw 2014-10-07 15:46:19 UTC
(In reply to Sucheta Ghoshal from comment #1)
> Well, if we are going beyond BMP, we need to take surrogate pairs into
> account, too. JavaScript does not do that natively, AFAIK. Would code points
> be enough in that case?

I think the idea is that at the DM layer, we won't be combining anything beyond code points, but David (CC) can confirm.
Comment 3 D Chan 2014-10-07 15:57:13 UTC
Hi, yes, we wrote ve.splitClusters because we wanted the document model to be a list of grapheme clusters instead of a list of raw JavaScript characters (i.e. UTF-16 code units, so each surrogate pair like '\uD860\uDEE2' is treated as two separate entities, '\uD860' and '\uDEE2').

However, we subsequently decided against that, because browsers do not always agree on what constitutes a grapheme cluster. Malayalam, where the font can affect the number of clusters, is one example of how problematic it could be to try to match the browser's clustering exactly.

Therefore, the DM is to remain a list of raw JavaScript characters, and clustering-related support is being developed at a level on top of the DM.
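
For readers following the code unit / code point / cluster distinction discussed in this thread, here is a minimal sketch. It is not taken from VisualEditor: the helper names toCodeUnits and toCodePoints are hypothetical, and it assumes an ES2015-capable engine (Array.from, string iteration and codePointAt were not yet widely available when this bug was filed).

```javascript
// Illustrative only: these helpers are hypothetical and not part of VisualEditor.

// Split a string into UTF-16 code units, which is how JavaScript indexes strings.
function toCodeUnits( text ) {
	return text.split( '' );
}

// Split a string into Unicode code points; string iteration keeps
// a surrogate pair together as a single element.
function toCodePoints( text ) {
	return Array.from( text );
}

var astral = '\uD860\uDEE2'; // the surrogate pair quoted in the thread (U+282E2)
var combined = 'e\u0301'; // 'e' followed by U+0301 COMBINING ACUTE ACCENT

console.log( astral.length ); // 2
console.log( toCodeUnits( astral ).length ); // 2
console.log( toCodePoints( astral ).length ); // 1
console.log( astral.codePointAt( 0 ).toString( 16 ) ); // 282e2
console.log( toCodePoints( combined ).length ); // 2
```

The last line shows why code points alone do not give clusters: 'e' plus a combining accent is normally drawn as one user-visible character but is still two code points. Deciding where such clusters begin and end is the part that can vary between browsers and fonts, which fits the decision to keep the DM at the code unit level.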
Comment 4 Gerrit Notification Bot 2014-10-08 08:08:09 UTC
Change 165433 had a related patch set uploaded by SuchetaG:
Getting rid of ve.splitClusters in VE core

https://gerrit.wikimedia.org/r/165433
Comment 5 Gerrit Notification Bot 2014-10-08 08:09:20 UTC
Change 165430 had a related patch set uploaded by SuchetaG:
Getting rid of ve.splitClusters in ve-mw

https://gerrit.wikimedia.org/r/165430
Comment 6 Gerrit Notification Bot 2014-10-08 08:13:48 UTC
Change 165430 merged by jenkins-bot:
Getting rid of ve.splitClusters in ve-mw

https://gerrit.wikimedia.org/r/165430
Comment 7 Gerrit Notification Bot 2014-10-08 08:14:30 UTC
Change 165433 merged by jenkins-bot:
Getting rid of ve.splitClusters in VE core

https://gerrit.wikimedia.org/r/165433
