Last modified: 2013-07-04 10:38:28 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T42243, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 40243 - Hook up the libhubbub tree builder and libxml DOM
Status: RESOLVED FIXED
Product: Parsoid
Classification: Unclassified
Component: General
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Adam Wight
Depends on:
Blocks:
Reported: 2012-09-13 23:24 UTC by Gabriel Wicke
Modified: 2013-07-04 10:38 UTC (History)

Description Gabriel Wicke 2012-09-13 23:24:17 UTC
Hack up libhubbub in a way that makes the tree builder callable from the outside and build a libxml2 DOM from it. A class with a method taking a TokenChunkPtr (or a TokenMessage, but that is easy to adapt) would be ideal.
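
A minimal sketch of what such a class could look like. Everything here is an assumption for illustration: the Token, TokenChunk and Node types are simplified stand-ins (the real TokenChunkPtr is not defined in this report, and a real builder would hand tokens to the libhubbub tree builder and create libxml2 nodes instead of the toy Node tree), and the names DomBuilder and addTokens are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token model; the real token classes are not shown in this report.
enum class TokenKind { StartTag, EndTag, Text };

struct Token {
    TokenKind kind;
    std::string data;  // tag name, or character data for Text tokens
};

using TokenChunk = std::vector<Token>;
using TokenChunkPtr = std::shared_ptr<const TokenChunk>;

// Simplified stand-in for a DOM node; a real version would wrap a libxml2 node.
struct Node {
    std::string name;  // element name, "#text" or "#document"
    std::string text;  // only used by text nodes
    Node* parent = nullptr;
    std::vector<std::unique_ptr<Node>> children;
};

class DomBuilder {
public:
    DomBuilder() : root_(std::make_unique<Node>()), current_(root_.get()) {
        root_->name = "#document";
    }

    // Entry point callable from the outside: consume one chunk of tokens.
    void addTokens(const TokenChunkPtr& chunk) {
        assert(chunk && "chunk must not be null");
        for (const Token& tok : *chunk) {
            switch (tok.kind) {
            case TokenKind::StartTag:
                current_ = append(tok.data);  // descend into the new element
                break;
            case TokenKind::Text:
                append("#text")->text = tok.data;
                break;
            case TokenKind::EndTag:
                close(tok.data);
                break;
            }
        }
    }

    const Node& document() const { return *root_; }

private:
    // Append a child to the current insertion point and return it.
    Node* append(const std::string& name) {
        auto node = std::make_unique<Node>();
        node->name = name;
        node->parent = current_;
        current_->children.push_back(std::move(node));
        return current_->children.back().get();
    }

    // Close the nearest open element with this name; stray end tags are ignored.
    void close(const std::string& name) {
        for (Node* n = current_; n->parent != nullptr; n = n->parent) {
            if (n->name == name) {
                current_ = n->parent;
                return;
            }
        }
    }

    std::unique_ptr<Node> root_;
    Node* current_;
};
```

Because the builder keeps its insertion point between calls, chunks can be fed one at a time as they arrive. A TokenMessage variant would presumably only need a thin overload that unpacks the message and forwards to the same method.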

This will likely involve converting tokens to the libhubbub format. Because of libhubbub's arena-like memory management strategy, only a single stack-allocated token is actually needed. There is a libxml2 binding example in the libhubbub source we could adopt. It does some unnecessary strduping, since libxml implicitly copies its input while constructing the DOM.
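
The single-token idea can be sketched roughly as follows. This is an illustration under stated assumptions, not libhubbub code: BorrowedToken is a stand-in for a pointer-plus-length token that borrows its bytes, CopyingSink is a stand-in for a consumer that copies its input the way libxml is described as doing above, and SourceToken and feedAll are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical source token as produced by our tokenizer; it owns its string.
struct SourceToken {
    enum class Kind { StartTag, EndTag, Text };
    Kind kind;
    std::string data;
};

// Stand-in for a tree-builder token that only borrows its bytes
// (pointer plus length), so filling it in allocates nothing.
struct BorrowedToken {
    SourceToken::Kind kind;
    const char* ptr;
    std::size_t len;
};

// Stand-in for a DOM-building consumer that copies whatever it is given.
struct CopyingSink {
    std::vector<std::string> copies;

    void consume(const BorrowedToken& tok) {
        assert(tok.ptr != nullptr || tok.len == 0);
        copies.emplace_back(tok.ptr, tok.len);  // the only copy that is made
    }
};

// Convert and feed every token through one reused stack-allocated token.
void feedAll(const std::vector<SourceToken>& tokens, CopyingSink& sink) {
    BorrowedToken scratch{};  // single token, overwritten on every iteration
    for (const SourceToken& src : tokens) {
        scratch.kind = src.kind;
        scratch.ptr = src.data.data();  // borrow, no strdup
        scratch.len = src.data.size();
        sink.consume(scratch);
    }
}
```

The borrowed pointers only need to stay valid for the duration of each consume call. Since the consumer takes its own copy, duplicating the strings before handing them over would be redundant work.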

We will also likely need to add some features to our version of the tree builder. The main feature planned currently is the propagation of attributes from end tags to the resulting element.
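
As a rough illustration of end-tag attribute propagation, the sketch below merges attributes carried by an end tag into the element that tag closes. The types and names (TagToken, Element, OpenElementStack) are hypothetical and not part of libhubbub. The conflict policy is also an assumption, since the report does not specify one: here a value set by the start tag wins over an end-tag value with the same name.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Attributes = std::map<std::string, std::string>;

// Hypothetical tag token; unlike plain HTML, end tags may carry attributes.
struct TagToken {
    bool isEnd;
    std::string name;
    Attributes attrs;
};

struct Element {
    std::string name;
    Attributes attrs;
};

class OpenElementStack {
public:
    void process(const TagToken& tok) {
        if (!tok.isEnd) {
            open_.push_back(Element{tok.name, tok.attrs});
            return;
        }
        // Find the nearest open element that this end tag closes.
        for (std::size_t i = open_.size(); i-- > 0;) {
            if (open_[i].name != tok.name) continue;
            // map::insert skips keys that already exist, so start-tag
            // values take precedence (assumed policy).
            open_[i].attrs.insert(tok.attrs.begin(), tok.attrs.end());
            // Close the matched element and anything still open inside it.
            while (open_.size() > i) {
                closed_.push_back(std::move(open_.back()));
                open_.pop_back();
            }
            return;
        }
        // No matching open element: the stray end tag is ignored.
    }

    // Elements in the order they were closed (innermost first).
    const std::vector<Element>& closed() const { return closed_; }

private:
    std::vector<Element> open_;
    std::vector<Element> closed_;
};
```

Merging at the moment the element is popped keeps the change local to end-tag handling, so the rest of the tree construction logic would not need to know that end tags can carry attributes.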

We will also need a second HTML parser and DOM builder for the HTML to Wikitext conversion. This could be the default (unpatched) libhubbub parser.
Comment 1 db [inactive,noenotif] 2012-11-24 21:46:19 UTC
Merged Gerrit change #26413 links here, so this bug may be resolved.
Comment 2 Andre Klapper 2013-07-04 10:38:28 UTC
[Parsoid component reorg by merging CPP/* tickets into General. See bug 50685 for more information. Filter bugmail on this comment. parsoidreorg20130704]
