Last modified: 2014-05-28 19:31:56 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the bug reports and their history, links might be broken. See T67603, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 65603 - Serializing images without data-parsoid normalizes image names incorrectly
Status: NEW
Product: Parsoid
Classification: Unclassified
Component: serializer
Version: unspecified
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: C. Scott Ananian
Reported: 2014-05-21 21:14 UTC by Roan Kattouw
Modified: 2014-05-28 19:31 UTC (History)

Description Roan Kattouw 2014-05-21 21:14:04 UTC
If you parse wikitext like [[File:Ulmer M?nster-1024409.jpeg|frame]], Parsoid will produce a <figure> structure with href="./File:Ulmer_M%3Fnster-1024409.jpeg" (which is correct; the question mark is urlencoded) and resource="./File:Ulmer_M%3Fnster-1024409.jpeg" (not sure whether this is correct).

When VisualEditor takes this information to build an inline image equivalent (taking the href and the resource, and dropping everything else including data-parsoid), we get [[File:Ulmer_M%3Fnster-1024409.jpeg|link=]] back. Note that the space changed to an underscore and ? changed to %3F. data-parsoid had the information to keep this from happening (the "sa" entry on the <img> records the original "File:Ulmer M?nster-1024409.jpeg"), but VE discards it in this transformation.



$ echo '[[File:Ulmer M?nster-1024409.jpeg|frame]]' | node tests/parse.js --apiURL=http://localhost/w/api.php
<!DOCTYPE html>
<html prefix="dc: http://purl.org/dc/terms/ mw: http://mediawiki.org/rdf/"><head prefix="mwr: http://localhost/wiki/Special:Redirect/"><meta property="mw:parsoidVersion" content="0"/><link rel="dc:isVersionOf" href="//localhost/wiki/Main_Page"/><title></title><base href="//localhost/wiki/Main_Page"/></head><body data-parsoid='{"dsr":[0,42,0,0]}'><figure class="mw-default-size" typeof="mw:Image/Frame" data-parsoid='{"optList":[{"ck":"framed","ak":"frame"}],"dsr":[0,41,2,2]}'><a href="./File:Ulmer_M%3Fnster-1024409.jpeg" data-parsoid='{"a":{"href":"./File:Ulmer_M%3Fnster-1024409.jpeg"},"sa":{},"dsr":[2,39,null,null]}'><img resource="./File:Ulmer_M%3Fnster-1024409.jpeg" src="//localhost/w/images/f/f1/Ulmer_M%3Fnster-1024409.jpeg" height="1024" width="764" data-parsoid='{"a":{"resource":"./File:Ulmer_M%3Fnster-1024409.jpeg","height":"1024","width":"764"},"sa":{"resource":"File:Ulmer M?nster-1024409.jpeg"}}'/></a></figure>
</body></html>

$ echo '<p><span typeof="mw:Image" class="mw-default-size"><span><img src="//localhost/w/images/f/f1/Ulmer_M%3Fnster-1024409.jpeg" resource="./File:Ulmer_M%3Fnster-1024409.jpeg" width="764" height="1024"></span></span></p>' | node tests/parse.js --html2wt
[[File:Ulmer_M%3Fnster-1024409.jpeg|link=]]
Comment 1 Gabriel Wicke 2014-05-28 19:31:49 UTC
It looks like the serializer isn't properly percent-decoding the resource or converting its underscores back to spaces. This also seems to be the case for block images.
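
For illustration only, here is a minimal sketch of the normalization described above: percent-decode the resource, then turn underscores back into spaces. The helper name resourceToWikitextTarget is hypothetical and is not taken from Parsoid's serializer; the sketch also assumes the resource always uses the "./" prefix seen in the output above.

```javascript
// Hypothetical helper, NOT Parsoid's actual serializer code.
// Converts a resource attribute such as "./File:Ulmer_M%3Fnster-1024409.jpeg"
// into a wikitext link target when no data-parsoid is available.
function resourceToWikitextTarget(resource) {
  // Strip the leading "./" used for relative resource paths.
  const path = resource.replace(/^\.\//, '');

  let decoded;
  try {
    // Percent-decode, e.g. "%3F" -> "?".
    decoded = decodeURIComponent(path);
  } catch (e) {
    // Malformed percent-escape: fall back to the undecoded path.
    decoded = path;
  }

  // Convert underscores back to spaces.
  return decoded.replace(/_/g, ' ');
}

console.log(resourceToWikitextTarget('./File:Ulmer_M%3Fnster-1024409.jpeg'));
// prints "File:Ulmer M?nster-1024409.jpeg"
```

Converting every underscore to a space is lossy when the original wikitext itself used underscores, so a sketch like this could only serve as a fallback for the case where data-parsoid is missing.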
