Last modified: 2013-08-26 20:23:02 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T55110, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 53110 - Generalize foster parented content detection in early DOM postprocessor pass
Status: RESOLVED FIXED
Product: Parsoid
Classification: Unclassified
Component: DOM
Version: unspecified
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Arlo Breault
Depends on:
Blocks: 52945
Reported: 2013-08-20 19:29 UTC by Gabriel Wicke
Modified: 2013-08-26 20:23 UTC (History)

Description Gabriel Wicke 2013-08-20 19:29:16 UTC
Many DOM passes depend on accurate identification of foster-parented content (see http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#foster-parent). We have already implemented some detection in dom.markFosteredContent.js, but we still depend on a hack (formerly a convenient bug) in the HTML5 treebuilder that disables fostering for meta tags.

It would be great if we could generalize and improve the existing algorithm so that 

- it can be run as a first pass on the DOM,
- its marking of fostered content can be relied upon by all other DOM passes,
- it properly detects fostering of pure text, and
- we can remove the no-fostering-for-metas hack from the HTML5 treebuilder.

Fostered text detection can probably be addressed with this trick:

For each <table> TagTk, we can prepend a <meta typeof="mw:FosterMarker"> SelfclosingTagTk just before adding a tagId sequence number and feeding those tokens to the treebuilder. This will then create a 'fostering box' in the DOM:

content..
<meta typeof="mw:FosterBox" data-parsoid="{tagId: 3}"/>
potentially fostered content
<table data-parsoid="{tagId: 4}">..</table>

Fostered element content will have higher tagIds than both the meta and the table.
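
A minimal sketch of the token-side half of this trick, to make the tagId ordering concrete. The token shapes, the dataAttribs field and the function name addFosterMarkers are assumptions for illustration, not Parsoid's actual token classes or API. The description above uses both mw:FosterMarker and mw:FosterBox for the marker type; the sketch arbitrarily picks one.

```javascript
// Hypothetical token shapes: tags are plain objects, text is a plain string.
// Parsoid's real TagTk / SelfclosingTagTk classes are not used here.
const FOSTER_MARKER_TYPE = 'mw:FosterBox'; // assumption: the report uses two names

function addFosterMarkers(tokens) {
  let nextTagId = 1;
  const out = [];
  for (const tok of tokens) {
    // Prepend a marker meta right before every <table> start tag.
    if (tok.type === 'TagTk' && tok.name === 'table') {
      out.push({
        type: 'SelfclosingTagTk',
        name: 'meta',
        attribs: { typeof: FOSTER_MARKER_TYPE },
        dataAttribs: { tagId: nextTagId++ },
      });
    }
    if (tok.type === 'TagTk' || tok.type === 'SelfclosingTagTk') {
      // Number every tag in source order; copy instead of mutating the input.
      out.push({ ...tok, dataAttribs: { ...tok.dataAttribs, tagId: nextTagId++ } });
    } else {
      out.push(tok); // text and other tokens pass through unchanged
    }
  }
  return out;
}
```

Because the meta and the table are numbered back to back, any tag that appears later in the source but ends up between them in the DOM necessarily carries a higher tagId than both.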

A complication we should ignore for now is cases like <table><meta><table>... Let's tackle those rare edge cases later.

The goal is to mark all fostered content with data.parsoid.fostered. Fostered text nodes need to be wrapped in a span for this. The extra meta tags for fostering detection should be stripped so that they don't interfere with later passes.
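
The marking step could look roughly like the sketch below. It works on a simplified list of sibling nodes modelled as plain objects rather than on a real DOM, and the node shape, the dataParsoid field and the function names are assumptions for illustration only. It is not the code that was merged for this bug. Nested cases such as <table><meta><table> are deliberately not handled, in line with the note above.

```javascript
const FOSTER_MARKER_TYPE = 'mw:FosterBox'; // assumption: marker type name

const isMarker = (n) =>
  n.type === 'element' && n.name === 'meta' &&
  Boolean(n.attrs) && n.attrs.typeof === FOSTER_MARKER_TYPE;
const isTable = (n) => n.type === 'element' && n.name === 'table';

// Elements get a fostered flag; text nodes are wrapped in a flagged span.
function markAsFostered(node) {
  if (node.type === 'text') {
    return {
      type: 'element', name: 'span', attrs: {},
      dataParsoid: { fostered: true }, children: [node],
    };
  }
  return { ...node, dataParsoid: { ...node.dataParsoid, fostered: true } };
}

function markFosteredSiblings(siblings) {
  const out = [];
  let box = null; // nodes seen since the last marker, or null outside a box
  for (const node of siblings) {
    if (isMarker(node)) {
      if (box) out.push(...box); // earlier marker had no table: leave unmarked
      box = [];                  // the marker itself is dropped from the output
    } else if (box && isTable(node)) {
      // Everything between the marker and its table was fostered out.
      out.push(...box.map(markAsFostered), node);
      box = null;
    } else if (box) {
      box.push(node);
    } else {
      out.push(node);
    }
  }
  if (box) out.push(...box); // trailing marker without a table
  return out;
}
```

A fuller version would probably also compare tagIds as a sanity check and skip whitespace-only text, which the HTML5 algorithm does not foster-parent.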
Comment 1 ssastry 2013-08-20 19:37:12 UTC
A clarification: the hack (convenient bug) in the HTML5 treebuilder that disables fostering of meta tags is used for a different pass (markTreeBuilderFixups) and is independent of the task in this bug, which is accurate detection of fostered tags. Consequently, step 4 (removing the no-fostering-for-metas hack) can be implemented separately from the task here. We can create a new bug for it and outline the problems and requirements there.
Comment 2 Gabriel Wicke 2013-08-20 20:56:41 UTC
The fourth step (re-enabling foster-parenting) will definitely require more work than just implementing fostering detection, but it should be significantly easier once reliable fostering info is available. Let's create a separate bug for that once we get close to tackling it.
Comment 3 Gerrit Notification Bot 2013-08-23 21:36:44 UTC
Change 80675 had a related patch set uploaded by Arlolra:
WIP: Generalize foster parented content detection

https://gerrit.wikimedia.org/r/80675
Comment 4 Gerrit Notification Bot 2013-08-26 20:22:06 UTC
Change 80675 merged by jenkins-bot:
Generalize foster parented content detection

https://gerrit.wikimedia.org/r/80675
