Last modified: 2014-06-13 16:54:24 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T68487, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 66487 - img tags are not well formed on main page
Status: RESOLVED FIXED
Product: MobileFrontend
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2014-06-11 16:26 UTC by Bernd Sitzmann
Modified: 2014-06-13 16:54 UTC (History)
CC List: 6 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Description Bernd Sitzmann 2014-06-11 16:26:06 UTC
This breaks our XML parsing when we try to detect images for the saved pages feature in the Android Wikipedia app.

Example from yesterday's main page:

<div style="float:right;margin-left:0.5em;"><a href="/wiki/File:Maria_Sharapova,_December_2008.jpg" class="image" title="Maria Sharapova">
<img alt="Maria Sharapova in 2008" src="//upload.wikimedia.org/wikipedia/en/thumb/c/c6/Maria_Sharapova%2C_December_2008.jpg/61px-Maria_Sharapova%2C_December_2008.jpg" width="61" height="100" class="thumbborder" srcset="//upload.wikimedia.org/wikipedia/en/thumb/c/c6/Maria_Sharapova%2C_December_2008.jpg/91px-Maria_Sharapova%2C_December_2008.jpg 1.5x, //upload.wikimedia.org/wikipedia/en/thumb/c/c6/Maria_Sharapova%2C_December_2008.jpg/121px-Maria_Sharapova%2C_December_2008.jpg 2x" data-file-width="405" 
data-file-height="667"></a></div>

The img tag should end in 
data-file-height="667"/>
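The failure described above can be reproduced with any strict XML parser. The following is a minimal sketch in Python (not the Android app's actual code, which is not shown in this report); the helper name `is_well_formed_xml` and the shortened markup with a placeholder image URL are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # An unclosed <img> makes the following </a> a mismatched tag.
        return False


# Shortened stand-in for the main page snippet (placeholder URL).
unclosed = (
    '<div><a href="/wiki/File:Example.jpg" class="image">'
    '<img alt="Example" src="//upload.wikimedia.org/example.jpg" '
    'width="61" height="100"></a></div>'
)
# Same markup with the img tag written in self-closing form.
closed = unclosed.replace('height="100">', 'height="100"/>')

print(is_well_formed_xml(unclosed))
print(is_well_formed_xml(closed))
```

Only the self-closing variant is accepted, which matches the report: the markup is acceptable as HTML but is not well-formed XML.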
Comment 1 Bingle 2014-06-11 16:30:15 UTC
Prioritization and scheduling of this bug is tracked on Trello card https://trello.com/c/XeXwx61H
Comment 2 Max Semenik 2014-06-11 16:36:56 UTC
We output HTML5 and thus it's not supposed to be valid XML. Any alternative parsers around?
Comment 3 Yuvi Panda 2014-06-11 16:42:09 UTC
Does adding closing tags prevent it from being HTML5 compliant?
Comment 4 Bernd Sitzmann 2014-06-11 17:12:34 UTC
Other pages have closed tags, only the main page is different.
Comment 5 Jon 2014-06-11 17:33:00 UTC
Isn't this an issue with the parser? I think assuming it is XML is not very future-proof. Hopefully one day all of Wikipedia will be HTML5, which doesn't need closed tags.
Comment 6 Ryan Kaldari 2014-06-11 17:51:33 UTC
Just to clarify, self-closing img tags are not necessary in HTML5, but they are allowed and considered valid:
http://dev.w3.org/html5/spec-author-view/syntax.html#syntax-start-tag
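As a small illustration of this point, an HTML parser reports the same element whether or not the img tag carries the trailing slash. This is a sketch using Python's standard `html.parser` module; the class name `ImgCollector` and the helper `img_sources` are hypothetical names introduced here.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collects the src attribute of every img element encountered."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Called for <img ...>; the default handling of <img ... /> also
        # routes through this method, so both forms end up here.
        if tag == "img":
            self.sources.append(dict(attrs).get("src"))


def img_sources(markup: str) -> list:
    parser = ImgCollector()
    parser.feed(markup)
    parser.close()
    return parser.sources


plain = '<a href="x"><img src="a.jpg"></a>'
self_closing = '<a href="x"><img src="a.jpg"/></a>'
print(img_sources(plain) == img_sources(self_closing))
```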
Comment 7 Brion Vibber 2014-06-11 18:06:01 UTC
Are you using libxml2 here? I think there should be an HTML parsing mode which groks the implied-closed img elements.
Comment 8 Max Semenik 2014-06-11 18:24:02 UTC
The main page is different because we rewrite it for mobile, but there's no guarantee that other pages will not be processed too, e.g. for image removal or Zero.
Comment 9 Bernd Sitzmann 2014-06-13 16:54:24 UTC
I'll retract that issue from an Android app point of view since we switched away from using an XML parser. We now use Html.fromHtml() with a custom Html.ImageGetter.
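The app's Android code is not shown in this report. As a rough analogue of the approach described in this comment, the sketch below uses Python's standard HTML parser to invoke a callback for every image source it meets, loosely mirroring how an image getter is called per img element. The names `ImageVisitor` and `find_images` and the base URL are assumptions for illustration only.

```python
from html.parser import HTMLParser
from typing import Callable, List
from urllib.parse import urljoin


class ImageVisitor(HTMLParser):
    """Calls image_getter(src) for each img element, closed or not."""

    def __init__(self, image_getter: Callable[[str], None]):
        super().__init__()
        self._image_getter = image_getter

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self._image_getter(src)


def find_images(page_html: str, base_url: str) -> List[str]:
    """Return absolute URLs of all images referenced in page_html."""
    found: List[str] = []
    # Protocol-relative sources ("//host/path") take the scheme of base_url.
    visitor = ImageVisitor(lambda src: found.append(urljoin(base_url, src)))
    visitor.feed(page_html)
    visitor.close()
    return found


# Markup shaped like the example in the description (img left unclosed).
sample = (
    '<div style="float:right;margin-left:0.5em;">'
    '<a href="/wiki/File:Maria_Sharapova,_December_2008.jpg" class="image">'
    '<img alt="Maria Sharapova in 2008" '
    'src="//upload.wikimedia.org/wikipedia/en/thumb/c/c6/'
    'Maria_Sharapova%2C_December_2008.jpg/'
    '61px-Maria_Sharapova%2C_December_2008.jpg" '
    'width="61" height="100" class="thumbborder"></a></div>'
)
# Hypothetical base URL, used only to resolve the protocol-relative src.
images = find_images(sample, "https://en.m.wikipedia.org/wiki/Main_Page")
print(images)
```

Because the parser is HTML-aware, the unclosed img tag needs no special handling, which is why the original complaint no longer applies once the XML parser is dropped.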
