Last modified: 2014-11-18 10:19:43 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T69850, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 67850 - Parsoid: Template-generated partial table cell markup incorrectly parsed under specific circumstances
Parsoid: Template-generated partial table cell markup incorrectly parsed unde...
Status: REOPENED
Product: Parsoid
Classification: Unclassified
General (Other open bugs)
unspecified
All All
: Lowest normal
: ---
Assigned To: Parsoid Team
https://nl.wikipedia.org/w/index.php?...
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2014-07-11 12:17 UTC by Ad Huikeshoven
Modified: 2014-11-18 10:19 UTC (History)
7 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Ad Huikeshoven 2014-07-11 12:17:35 UTC
Some articles like https://nl.wikipedia.org/wiki/Justine_Henin?veaction=edit and other articles about tennis players have advanced tables like https://nl.wikipedia.org/wiki/Justine_Henin#Prestatietabel_enkelspel. Those tables use a table cell template heavily https://nl.wikipedia.org/wiki/Sjabloon:Tabelcel_tennis.

In VE edit mode the cells of the table with that table cell template show raw format code like "align="center" style="background:#F9F9F9;"|". Users expect VE edit mode to look almost exactly as in read mode. Now, it doesn't.

Bug 44498 is fixed, which was for a related template table cell problem on en.wp. Somehow that didn't fix the template table cell issue on nl.wp. 

Possible solutions:
* Rewriting the articles without using the template (editors on nl.wp don't like that and prefer to use the template);
* Slight adaptation of the template (and subsequent rewriting of the articles) so it becomes parsable; (requires a lot of consultation with editors on nl.wp)
* Have a tag in the template notifying Parsoid the template contains partial table cell data starting with formatting code to receive a hint what to do with the template;
* Apply solutions method of bug 44498 for this context;
* Your creative inspiration.

(This is one of a few major stumble blocks in turning VE on at nl.wp. The other one is hidden templates and extra carriage returns which might lead to accidental deletion of templates/images when someone removes a blank line.)
Comment 1 Bartosz Dziewoński 2014-07-11 12:50:40 UTC
While Parsoid's rendering is obviously not correct (this isn't a VE-only issue, the raw rendering shows the same problems: http://parsoid.wmflabs.org/nlwiki/Justine_Henin?oldid=41676266), the template could easily be simplified.

There are many templates that generate partial table cell markup on various wikis that work, so it's just some quirk of this one that Parsoid is unable to understand.
Comment 2 Bartosz Dziewoński 2014-07-11 12:55:10 UTC
Here's a version of the template with simpler markup that will probably be parsed correctly (but I haven't tested). It has less # and less {{!}} :).

I didn't want to modify it on the wiki in case it actually broke something (I tried it with some pages and it was okay), feel free to use this code.

(I don't work on Parsoid, I just found this interesting.)

align="center" {{#switch: {{{1|}}}
| -        = style="background: #ffffff;"
| G        = style="background: #def8fe;"
| 1R       = style="background: #def8fe;"
| 2R       = style="background: #bef1fe;"
| 3R       = style="background: #8cecfd;"
| 4R       = style="background: #68dffd;"
| KF       = style="background: #ffebcd;"
| HF       = style="background: #dddd00;"
| F        = style="background: #e4c8e4;"
| W        = style="background: #00ff00;"
| #default = style="background: #f9f9f9;"
}} | {{#if: {{{2|}}}
| {{#ifeq: {{{1}}} | W | '''[[{{{2}}}|{{{1}}}]]''' | [[{{{2}}}|{{{1}}}]] }}
| {{#ifeq: {{{1}}} | W | '''{{{1}}}''' | {{{1}}} }}
}}
Comment 3 Ad Huikeshoven 2014-07-11 13:12:35 UTC
(In reply to Bartosz Dziewoński from comment #1)

> http://parsoid.wmflabs.org/nlwiki/Justine_Henin?oldid=41676266), the
> template could easily be simplified.

Thanks for looking into it. You suggest to consider simplifying the template. Do you have a link to a list of  partial table cell markup templates and/or a link with suggestions how to simplify the template?
Comment 4 Ad Huikeshoven 2014-07-11 13:13:56 UTC
After bwc: you've already coded a possible simplification. I will test that one. Thanks a lot!
Comment 5 Bartosz Dziewoński 2014-07-11 13:19:19 UTC
(In reply to Ad Huikeshoven from comment #3)
> Do you have a link to a list of  partial table cell markup
> templates and/or a link with suggestions how to simplify the template?

Here's a list of ones used on en.wikipedia (probably not a full list): https://en.wikipedia.org/wiki/Template:Yes#Templates – these are in general simpler than the one we're talking about here, but they seem to all render correctly in Parsoid (http://parsoid.wmflabs.org/enwiki/Template%3AYes?oldid=600725279).
Comment 6 Ad Huikeshoven 2014-07-11 13:43:04 UTC
https://nl.wikipedia.org/wiki/Gebruiker:Ad_Huikeshoven/jh?veaction=edit shows that the modified template https://nl.wikipedia.org/wiki/Sjabloon:Tabelcel_tennis/vereenvoudigd resolves the issue for at least one page.
Comment 7 Bartosz Dziewoński 2014-07-11 14:03:21 UTC
(Let's leave this open for a while and let Parsoid team look, maybe the underlying issue in Parsoid parser is easily fixable :) )
Comment 8 Bartosz Dziewoński 2014-07-11 14:53:38 UTC
(VE has some fun issues with these templates, filed as bug 67856 and bug 67857.)
Comment 9 ssastry 2014-07-11 15:37:01 UTC
I think the problem is with the # entity that is showing up in the table-css. Parsoid wraps entities in additional HTML markup which gets in the way of Parsoid's strategy to handle these templates.

[subbu@earth tests] cat /tmp/tbl
{|
| {{Tabelcel tennis|X}}
| {{Tabelcel tennis/vereenvoudigd|X}}
|}

In the output below, the background css gets broken up by the entity that is encountered there. Entities are not a problem in html attributes in general, but only where the table-cell attribute is first parsed as text (in templates like this where the table-cell and its css comes from 2 different places) and later patched up. I think fixing up the template is a better solution rather than add additional logic in Parsoid to undo the entity wrapping in these cases.

[subbu@earth tests] node parse --prefix nlwiki --dump tplsrc --trace peg < /tmp/tbl

....
=================================
Sjabloon:Tabelcel_tennis/vereenvoudigd
---------------------------------
align="center" style="background: #f9f9f9;" | X
---------------------------------
1-[peg]        | ---->   ["align=\"center\" style=\"background: #f9f9f9;\" | X"]
1-[peg]        | ---->   [{"type":"EOFTk"}]
=================================
Sjabloon:Tabelcel_tennis
---------------------------------
align="center" style="background:&#35;F9F9F9;"|X
---------------------------------
2-[peg]        | ---->   ["align=\"center\" style=\"background:",{"type":"TagTk","name":"span","attribs":[{"k":"typeof","v":"mw:Entity"}],"dataAttribs":{"src":"&#35;","srcContent":"#","tsr":[33,33]}},"#",{"type":"EndTagTk","name":"span","attribs":[],"dataAttribs":{"tsr":[38,38]}},"F9F9F9;\"|X"]
2-[peg]        | ---->   [{"type":"EOFTk"}]
....
Comment 10 Bartosz Dziewoński 2014-07-18 16:27:26 UTC
(Should this be closed as WONTFIX or WORKSFORME or something then?)
Comment 11 ssastry 2014-07-18 21:29:06 UTC
(In reply to Bartosz Dziewoński from comment #10)
> (Should this be closed as WONTFIX or WORKSFORME or something then?)

Hmm .. I dont know. Marking this WONTFIX can be seen as an antagonistic stance and hence my hesitation to mark it as such right now. We are not closing the door shut on this in case a cogent argument can be made for why this is a good investment of developer time instead of doing the 1-off update of the template and be done with this. It is better for nlwiki folks or other active editors/admins to close this as WORKSFORME after updating the template on nlwiki.
Comment 12 Kelson [Emmanuel Engelhart] 2014-10-24 17:31:26 UTC
Here what looks like to be a pretty bad looking example on idwiki:
http://parsoid-lb.eqiad.wikimedia.org/idwiki/Nickelodeon_Kids%27_Choice_Awards?oldid=8237629

I'm not sure to really understand/agree with the "lowest" priority for this bug as the impact is not low at all.
Comment 13 ssastry 2014-10-24 18:02:50 UTC
Kelson: your example is a bit different from the nlwiki scenario. Looks like we have an unhandled edge case. In the example below, the first row is handled, but the second isn't. Should be easy to fix.

{|
|-
!nowrap|Favorite Movie
| {{yes}}
|<!-- 1989 --> {{yes}}
|}
Comment 14 ssastry 2014-11-18 06:48:50 UTC
(In reply to ssastry from comment #13)
> Kelson: your example is a bit different from the nlwiki scenario. Looks like
> we have an unhandled edge case. In the example below, the first row is
> handled, but the second isn't. Should be easy to fix.
> 
> {|
> |-
> !nowrap|Favorite Movie
> | {{yes}}
> |<!-- 1989 --> {{yes}}
> |}

Fixed as bug 72487
Comment 15 Kelson [Emmanuel Engelhart] 2014-11-18 10:19:43 UTC
@ssastry
Yes 100% fixed. Thank you!

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links