Last modified: 2014-07-24 21:57:23 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T53954, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 51954 - empty attribute should be discarded
Status: UNCONFIRMED
Product: Parsoid
Classification: Unclassified
Component: General
Version: unspecified
Hardware: All All
Priority: Normal
Severity: normal
Target Milestone: ---
Assigned To: Gabriel Wicke
Reported: 2013-07-24 10:52 UTC by John Mark Vandenberg
Modified: 2014-07-24 21:57 UTC (History)

Description John Mark Vandenberg 2013-07-24 10:52:24 UTC
http://parsoid.wmflabs.org/_rt/de/Selbstbildnis_%28Leonardo_da_Vinci%29

gallery caption

becomes

gallery caption=""
Comment 1 John Mark Vandenberg 2013-07-24 11:17:48 UTC
http://parsoid.wmflabs.org/_rt/de/Iraklis_Thessaloniki

bgcolor=

becomes

bgcolor

(which _might_ mean another edit causes the 'bgcolor'-> 'bgcolor=""' in the next parse)
Comment 2 Chris McKenna 2013-07-30 19:54:03 UTC
bgcolor="" should either be left alone or deleted, not rewritten as a bare bgcolor, because the two are not equivalent.

In a table, bgcolor="" produces no background color but bgcolor produces a dark red background because it produces the html <td bgcolor="bgcolor"> (is this a bug in the default parser?)

See https://en.wikipedia.org/w/index.php?title=User:Thryduulf/sandbox&oldid=566470353#Table

This causes problems on the live wiki, see https://en.wikipedia.org/w/index.php?title=Kyle_Busch&diff=566076453&oldid=566076367 (the relevant change is the lines before the Line 709 diff block). Accordingly I've upgraded the severity from "trivial" to "normal".
Comment 3 Chris McKenna 2013-07-31 18:04:41 UTC
I've reported that parsing error as bug 52330 in the MediaWiki parsing component, because the example table in my sandbox was generated in the source editor and so, as I understand it, had no involvement from Parsoid.
Comment 4 Gabriel Wicke 2013-08-14 00:28:30 UTC
I guess that bgcolor="" and bgcolor= should round-trip to bgcolor="" rather than just 'bgcolor'. Stripping the attribute completely does not seem to be a good solution in general.
Comment 5 Chris McKenna 2013-08-14 08:01:36 UTC
Yes. As noted at bug 52330, a bare bgcolor generates the HTML bgcolor="bgcolor", which renders (at least in Firefox) the same as bgcolor="#b00000" rather than the expected #f9f9f9 that is the default for tables of class "wikitable".
Comment 6 Gabriel Wicke 2014-01-16 02:42:43 UTC
This is what a modern browser (HTML5 parsing spec) does:

document.body.innerHTML = '<div bgcolor>foo</div>';
"<div bgcolor>foo</div>"
document.body.innerHTML 
"<div bgcolor="">foo</div>"

We do the same in parsoid as we are also using the HTML5 parsing algorithm.
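
To make the normalization concrete, here is a minimal sketch that can be run in Node without a DOM. It is a simplified model, not the HTML5 tokenizer and not Parsoid's code: parseAttributes, serializeAttributes and ATTR_RE are hypothetical names, and the regular expression only covers the simple attribute forms discussed in this bug.

```javascript
// Simplified model only: real HTML5 tokenization handles many more cases.
// ATTR_RE, parseAttributes and serializeAttributes are hypothetical names.

// Matches name, name=, name="value", name='value' or name=value.
const ATTR_RE = /([^\s"'=<>\/]+)(?:=(?:"([^"]*)"|'([^']*)'|([^\s"'=<>]*)))?/g;

function parseAttributes(source) {
  const attrs = [];
  for (const m of source.matchAll(ATTR_RE)) {
    // A missing value and an empty value both end up as the empty string.
    const value = m[2] ?? m[3] ?? m[4] ?? '';
    attrs.push([m[1].toLowerCase(), value]);
  }
  return attrs;
}

function serializeAttributes(attrs) {
  // Always write the quoted form, so an empty value becomes name="".
  return attrs
    .map(([name, value]) =>
      `${name}="${value.replace(/&/g, '&amp;').replace(/"/g, '&quot;')}"`)
    .join(' ');
}

for (const src of ['bgcolor', 'bgcolor=', 'bgcolor=""']) {
  console.log(serializeAttributes(parseAttributes(src)));
}
// each line prints: bgcolor=""
```

In this model all three spellings collapse to the same empty-string value once parsed, which is why they come back out as bgcolor="" when serialized.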

So I think bug 52330 is really the issue here.

We should already be round-tripping any kind of attribute perfectly in untouched content. The normalization to bgcolor="" should only happen when something nearby was edited. Can you verify using the visual editor?

PS: When trying http://parsoid.wmflabs.org/_rtselser/dewiki/Selbstbildnis_%28Leonardo_da_Vinci%29 I noticed that there is a diff in ref tags which should not be there. This is reported in bug 60120.
Comment 7 ssastry 2014-07-24 21:57:23 UTC
So, Parsoid treats HTML and extension attributes slightly differently. See snippets below:

[subbu@earth lib] echo "A<ref name=>a</ref> B<ref name="">b</ref> C<ref name>c</ref> D<ref name='d'>d</ref>" | node parse --wt2wt
A<ref>a</ref> B<ref>b</ref> C<ref>c</ref> D<ref name="d">d</ref>

[subbu@earth lib] echo "<span title=>a</span><span title="">b</span><span title>c</span><span title="d">d</span>" | node parse --wt2wt
<span title="">a</span><span title="">b</span><span title="">c</span><span title="d">d</span>

For extensions, it drops empty attributes (in whatever form they show up), and for HTML tags, it normalizes empty attributes.
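
The two policies can be modelled roughly as follows. This is only an illustrative sketch that mirrors the outcomes in the transcripts above. serializeTag and its isExtension option are hypothetical and are not taken from Parsoid's serializer, and attribute escaping is left out for brevity.

```javascript
// Hypothetical helper, not Parsoid's actual serializer. It only mirrors the
// two outcomes shown in the transcripts above. Escaping is omitted.
function serializeTag(name, attrs, { isExtension }) {
  const kept = isExtension
    ? attrs.filter(([, value]) => value !== '') // extension tags: drop empty attributes
    : attrs;                                    // HTML tags: keep them as name=""
  const attrText = kept.map(([k, v]) => ` ${k}="${v}"`).join('');
  return `<${name}${attrText}>`;
}

console.log(serializeTag('ref', [['name', '']], { isExtension: true }));    // <ref>
console.log(serializeTag('ref', [['name', 'd']], { isExtension: true }));   // <ref name="d">
console.log(serializeTag('span', [['title', '']], { isExtension: false })); // <span title="">
```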

Our HTML attribute behavior conforms with what browsers do. Anything to change / fix here for extensions?
