Last modified: 2013-12-04 02:12:21 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the bug reports and their history, links might be broken. See T59361, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 57361 - Expand TSR array in tokenizer to include start/end-tag widths
Status: NEW
Product: Parsoid
Classification: Unclassified
Component: tokenizer
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Gabriel Wicke
Reported: 2013-11-21 17:46 UTC by ssastry
Modified: 2013-12-04 02:12 UTC (History)

Description ssastry 2013-11-21 17:46:47 UTC
Currently, DSR computation attempts to infer the opening/closing wikitext tag widths of generated HTML nodes from the existing tsr values and knowledge of wikitext. While this works reasonably well, we occasionally run into bugs when changes are made (ex: https://gerrit.wikimedia.org/r/#/c/95696/). We could possibly pass along information from the tokenizer, where the wikitext tag widths are available, and use that in DSR computation.

This will not be a trivial fix because:
* a lot of tokenizer productions will need fixing
* we need to make sure not to break the carefully tuned code in the list handler that
  adjusts tsr values based on how bullets are parsed into lists
* DSR computation might need some tweaking based on this
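
To make the proposal above concrete, here is a minimal sketch of the idea. It is not Parsoid code. The `Token` type, its optional `width` field, the `INFERRED_WIDTHS` table, and the `(start, end, open_width, close_width)` DSR layout are all assumptions made for illustration. The sketch prefers a width reported by the tokenizer and falls back to inference only when none is available.

```python
from typing import NamedTuple, Optional, Tuple

# Hypothetical inference table: tag name -> (open width, close width),
# assuming the tag was written in wikitext syntax ('' for i, ''' for b).
INFERRED_WIDTHS = {"i": (2, 2), "b": (3, 3)}


class Token(NamedTuple):
    name: str
    tsr: Tuple[int, int]          # (start, end) source offsets of the tag
    width: Optional[int] = None   # hypothetical: tag width from the tokenizer


def tag_widths(open_tok: Token, close_tok: Token) -> Tuple[int, int]:
    """Prefer tokenizer-supplied widths; otherwise infer from the tag name."""
    inferred = INFERRED_WIDTHS.get(open_tok.name, (0, 0))
    open_w = open_tok.width if open_tok.width is not None else inferred[0]
    close_w = close_tok.width if close_tok.width is not None else inferred[1]
    return open_w, close_w


def compute_dsr(open_tok: Token, close_tok: Token) -> Tuple[int, int, int, int]:
    """Build (start, end, open_width, close_width) for the generated node."""
    open_w, close_w = tag_widths(open_tok, close_tok)
    return (open_tok.tsr[0], close_tok.tsr[1], open_w, close_w)


# ''foo'' : no widths from the tokenizer, so the table is consulted.
wikitext_dsr = compute_dsr(Token("i", (0, 2)), Token("i", (5, 7)))

# <i>foo</i> : the table would guess (2, 2); explicit widths avoid that.
html_dsr = compute_dsr(Token("i", (0, 3), 3), Token("i", (6, 10), 4))
```

In this sketch the inference table is only a fallback, so tokenizer productions could be converted one at a time rather than all at once.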
