Last modified: 2014-01-02 20:21:20 UTC
The PHP parser does not recognize "|}" as a table closing tag on a non-empty line (which is how we end up with pages on WPs with stray trailing |} wikitext on some lines). However, Parsoid recognizes them as valid closing tags, which then causes us to bomb spectacularly on those pages (Parsoid tries to recover and fix things up, but that doesn't always work). The right fix is to change the tokenizer to require "|}" to be on a new line (leading whitespace and other sol-transparent text should be fine).
This will also require fixing the Parsoid serializer to emit "|}" on new lines.
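To make the proposed rule concrete, here is a rough line-level sketch in Python of what "|}" in start-of-line (SOL) position could mean. It is not the Parsoid tokenizer: the helper name is_table_end_line is made up for illustration, and "sol-transparent text" is approximated as whitespace and HTML comments only, which is an assumption.

```python
import re

# Assumption: sol-transparent content is approximated as whitespace and
# HTML comments. The real tokenizer's notion may be broader.
SOL_TRANSPARENT = re.compile(r'(?:\s|<!--.*?-->)*')


def is_table_end_line(line: str) -> bool:
    """Return True if "|}" appears in start-of-line position on this line.

    Hypothetical helper: skips any leading sol-transparent text, then
    checks that the next two characters are "|}".
    """
    prefix = SOL_TRANSPARENT.match(line)
    return line.startswith('|}', prefix.end())
```

With this rule, a line such as `| cell |}` would be treated as ordinary cell content rather than a table end, which is the PHP parser behaviour described above.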
Change 103572 had a related patch set uploaded by Subramanya Sastry: (Bug 57360) Fix parser/serializer to accept/emit "|}" in SOL posns https://gerrit.wikimedia.org/r/103572
Change 103572 merged by jenkins-bot: (Bug 57360) Fix serializer to emit "|}" in SOL posn https://gerrit.wikimedia.org/r/103572
Followup patch coming from gwicke.
<gwicke> Re the {| |} issue, I re-did my grep search with a better regexp and am now finding quite a few matches that look like {| <some attributes |}
<gwicke> the PHP parser strips the end tag in those cases, so maybe we should just strip it too?
<gwicke> {| class="wikitable"|} is a construct I see repeatedly
<gwicke> also {| class="wikitable"|}" style="text-align:center"
<gwicke> would be interesting to see where that was all copy & pasted from ;)
<gwicke> {|border=1 align=left cellpadding=0 cellspacing=0 style="width: 48%" {{Election city polls FPTP begin|locale = town| title=[[Canadian federal election, 2006]]<br>Hudson's Hope polls in Prince George—Peace River<ref name=06fed/>}}|}
<gwicke> just dropping the end tag token should be good enough I think
<gwicke> and accepting it anywhere in the attribute sequence
<gwicke> can write a patch for that
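As a rough illustration of the "just drop the end tag" idea, the following Python sketch removes stray "|}" sequences from a table start line. It is a string-level approximation only, not the Parsoid patch: the helper name strip_stray_table_end is hypothetical, and a real tokenizer would drop the end tag token rather than do a text replacement, so this version would also mangle a "|}" that legitimately occurs inside template arguments on the same line.

```python
def strip_stray_table_end(line: str) -> str:
    """Drop stray "|}" sequences from a table start line.

    Hypothetical helper. Lines that do not open a table ("{|" after
    optional leading whitespace) are returned unchanged.
    """
    if not line.lstrip().startswith('{|'):
        return line
    # Naive assumption: every "|}" on a table start line is a stray end tag.
    return line.replace('|}', '')
```

Applied to the construct seen repeatedly above, `{| class="wikitable"|}` would become `{| class="wikitable"`, leaving the table open for the rows that follow.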
Change 105019 had a related patch set uploaded by GWicke: Bug 57360: Eat stray table end tags in table start tag attributes https://gerrit.wikimedia.org/r/105019
Change 105019 merged by jenkins-bot: Bug 57360: Eat stray table end tags in table start tag attributes https://gerrit.wikimedia.org/r/105019