Last modified: 2014-09-26 20:08:10 UTC
http://parsoid.wmflabs.org/dewiki/Englische_Sprache?oldid=134283860 contains <li> tags with no enclosing <ul> in the "Länder der Welt, in denen Englisch gesprochen wird" figure. This also causes a visual diff, presumably because of the missing container tag. From IRC: subbu: "surprised that the html tree builder didn't fix it." cscott-free: "yeah, me too. but maybe it's peculiar to <figure> parsing somehow."
This seems to be happening because the DOM fragment for the caption is <li>..</li><li>..</li>, and when that fragment is unwrapped and inserted into the parent DOM, the <li>s aren't fixed up. This is a problem with our DOM-fragment unpacker, which uses some heuristics to make sure the parent DOM is well-formed. Reproducible with: echo "[[Image:Foobar.jpg|right|this is a caption {{echo|<li>foo</li>}}]]" | node parse
I was mistaken. It looks like the HTML5 parsing spec doesn't require bare <li> nodes to be fixed up by enclosing them in ul/ol nodes. So, while Tidy does fix up these uses in the PHP parser scenario, we can deprecate such uses for now and fix up the source wikitext where possible. If this is deemed to be a problem, we can probably handle it as part of a generic "content-model-fixup" pass that takes care of these and other issues. For now, this is a lower-priority issue, and we can continue to fix up individual instances of problematic wikitext wherever they show up.
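For illustration only, here is a rough sketch of what one step of such a "content-model-fixup" pass might do: wrap each run of adjacent bare <li> children in a synthetic <ul>. This is not Parsoid code. The Node class, wrap_bare_list_items, and serialize are hypothetical names invented for this example, and the tree is a plain Python structure rather than a real DOM.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """Hypothetical minimal element node: a tag name plus child nodes or text."""
    tag: str
    children: List[Union["Node", str]] = field(default_factory=list)


# Elements that may legitimately contain <li> children.
LIST_CONTAINERS = {"ul", "ol", "menu"}


def wrap_bare_list_items(node: Node) -> None:
    """Wrap runs of adjacent <li> children in a new <ul>, in place.

    Children of an existing list container are left untouched. Any
    non-<li> child (including a text node) ends the current run.
    """
    # Fix up descendants first so nested structures are handled too.
    for child in node.children:
        if isinstance(child, Node):
            wrap_bare_list_items(child)

    if node.tag in LIST_CONTAINERS:
        return

    fixed: List[Union[Node, str]] = []
    wrapper = None  # synthetic <ul> collecting the current run of <li>s
    for child in node.children:
        if isinstance(child, Node) and child.tag == "li":
            if wrapper is None:
                wrapper = Node("ul")
                fixed.append(wrapper)
            wrapper.children.append(child)
        else:
            wrapper = None
            fixed.append(child)
    node.children = fixed


def serialize(node: Union[Node, str]) -> str:
    """Serialize the toy tree to an HTML-like string (no escaping)."""
    if isinstance(node, str):
        return node
    inner = "".join(serialize(c) for c in node.children)
    return f"<{node.tag}>{inner}</{node.tag}>"


# Shape of the caption from the reproduction case above.
caption = Node("figcaption", ["this is a caption ", Node("li", ["foo"])])
wrap_bare_list_items(caption)
print(serialize(caption))
```

Whether a real pass should synthesize a <ul>, an <ol>, or simply flag the wikitext for cleanup is a design decision this sketch does not settle. It also ignores attributes and round-tripping metadata, which an actual fixup pass would have to preserve.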