Last modified: 2013-08-26 20:23:02 UTC
Many DOM passes depend on accurate identification of foster-parented content (see http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#foster-parent). We have already implemented some detection in dom.markFosteredContent.js, but we still depend on a hack (formerly a convenient bug) in the HTML5 treebuilder that disables fostering for meta tags.

It would be great if we could generalize and improve the existing algorithm so that
- it can be run as a first pass on the DOM,
- its marking of fostered content can be relied upon by all other DOM passes,
- it properly detects fostering of pure text, and
- we can remove the no-fostering-for-metas hack from the HTML5 treebuilder.

Fostered text detection can probably be addressed with this trick: for each <table> TagTk, we prepend a <meta typeof="mw:FosterMarker"> SelfclosingTagTk just before adding a tagId sequence number and feeding those tokens to the treebuilder. This then creates a 'fostering box' in the DOM:

  content..
  <meta typeof="mw:FosterBox" data-parsoid="{tagId: 3}"/>
  potentially fostered content
  <table data-parsoid="{tagId: 4}">..</table>

Fostered element content will have higher tagIds than both the meta and the table. A complication we should ignore for now is cases like <table><meta><table>..; let's tackle those rare edge cases later.

The goal is to mark all fostered content with data.parsoid.fostered. Fostered text nodes need to be wrapped in a span for this. The extra meta tags for fostering detection should be stripped so that they don't interfere with later passes.
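To make the proposal concrete, here is a rough sketch of the marker trick and the detection pass. It is only an illustration under stated assumptions, not the code in dom.markFosteredContent.js: tokens and DOM nodes are modelled as plain objects, and the names addFosterMarkers, markFostered, isMarker, typeOf and dataParsoid are hypothetical. The description uses both mw:FosterMarker and mw:FosterBox for the marker type; the sketch arbitrarily picks one. Nested cases such as <table><meta><table> are not handled.

```javascript
// Hypothetical shapes, not Parsoid's real TagTk / SelfclosingTagTk classes.
// The report names the marker both mw:FosterMarker and mw:FosterBox; one is picked here.
const MARKER_TYPE = "mw:FosterBox";

function isMarker(node) {
  return node.name === "meta" && node.typeOf === MARKER_TYPE;
}

// Token pass: give every start/self-closing tag a tagId and prepend a
// marker meta (with its own, lower tagId) before each <table> start tag.
function addFosterMarkers(tokens) {
  let nextId = 1;
  const out = [];
  for (const tok of tokens) {
    if (tok.type !== "tag" && tok.type !== "selfclosing") {
      out.push(tok); // text and end tags pass through unchanged
      continue;
    }
    if (tok.type === "tag" && tok.name === "table") {
      out.push({ type: "selfclosing", name: "meta", typeOf: MARKER_TYPE,
                 dataParsoid: { tagId: nextId++ } });
    }
    out.push({ ...tok, dataParsoid: { tagId: nextId++ } });
  }
  return out;
}

// DOM pass over one list of sibling nodes. Anything sitting between a marker
// and the next table sibling was moved there by foster parenting.
// Returns a new list with fostered content marked and markers stripped.
function markFostered(siblings) {
  const out = [];
  for (let i = 0; i < siblings.length; i++) {
    const node = siblings[i];
    if (!isMarker(node)) {
      out.push(node);
      continue;
    }
    const tableIdx = siblings.findIndex((n, j) => j > i && n.name === "table");
    if (tableIdx === -1) continue; // orphaned marker: just strip it
    const tableId = siblings[tableIdx].dataParsoid.tagId;
    for (let j = i + 1; j < tableIdx; j++) {
      const n = siblings[j];
      if (isMarker(n)) continue; // nested-table edge case, ignored for now
      if (n.type === "text") {
        // Text nodes cannot carry attributes, so wrap them in a span.
        out.push({ type: "element", name: "span",
                   dataParsoid: { fostered: true }, children: [n] });
      } else if (n.dataParsoid?.tagId > tableId) {
        out.push({ ...n, dataParsoid: { ...n.dataParsoid, fostered: true } });
      } else {
        out.push(n);
      }
    }
    i = tableIdx - 1; // resume at the table itself
  }
  return out;
}
```

With siblings corresponding to the fostering box above (a marker with tagId 3, an element with tagId 5 and a stray text node, then the table with tagId 4), markFostered would flag the element, wrap the text in a flagged span, and drop the marker.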
A clarification: the hack (convenient bug) in the HTML5 treebuilder that disables fostering of meta tags is used by a different pass (markTreeBuilderFixups) and is independent of the task in this bug, which is accurate detection of fostered tags. Consequently, step 4 (removing the no-fostering-for-metas hack) can be implemented separately from the task here: we can create a new bug for it and outline the problems and requirements there.
The fourth step (re-enabling foster-parenting) will definitely require more work than just implementing fostering detection, but it should be significantly easier once reliable fostering info is available. Let's create a separate bug for that once we get close to tackling it.
Change 80675 had a related patch set uploaded by Arlolra: WIP: Generalize foster parented content detection https://gerrit.wikimedia.org/r/80675
Change 80675 merged by jenkins-bot: Generalize foster parented content detection https://gerrit.wikimedia.org/r/80675