Last modified: 2014-04-25 06:44:40 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T51400, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 49400 - Images: Handle templated image options containing pipes
Images: Handle templated image options containing pipes
Status: NEW
Product: Parsoid
Classification: Unclassified
token-stream transforms (Other open bugs)
unspecified
All All
: Normal normal
: ---
Assigned To: ssastry
:
Depends on:
Blocks: ve-nonenglish
  Show dependency treegraph
 
Reported: 2013-06-10 18:40 UTC by Gabriel Wicke
Modified: 2014-04-25 06:44 UTC (History)
5 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Gabriel Wicke 2013-06-10 18:40:29 UTC
Some perverse templates such as {{Largethumb}} return a string with several options separated by pipes ("thumb|<number>px"). In the PHP parser those pipes are then picked up by the link regexp, so result in the regular image options.

In our attribute handling we expand attributes separately, so can end up with strings like the above in a single attribute value. In [[:de:Portal:Thüringen]] for example that is the case for the href attribute, which then caused a crash after retrieving image properties.

We will likely need to handle expanded attributes containing a pipe properly by re-parsing their string representation. This will be difficult to map onto our current template-affected attribute handling.
Comment 1 Gerrit Notification Bot 2013-06-10 18:41:38 UTC
Related URL: https://gerrit.wikimedia.org/r/67853 (Gerrit Change Ic7652979ec3d5fa35995aac9362c812ab4ede173)
Comment 2 Gerrit Notification Bot 2013-06-10 18:52:32 UTC
https://gerrit.wikimedia.org/r/67853 (Gerrit Change Ic7652979ec3d5fa35995aac9362c812ab4ede173) | change APPROVED and MERGED [by jenkins-bot]
Comment 3 John Mark Vandenberg 2013-07-23 09:29:51 UTC
There are around 110 uses on enwp
https://en.wikipedia.org/w/index.php?title=Special:WhatLinksHere/Template:Largethumb&limit=110
They are all in the article, and most of them are [[491]]-[[635]] (years).  I'm guessing they can be subst'd without much concern, to converted to normal thumbs.

On nlwp, it is way over 5000 uses
https://nl.wikipedia.org/w/index.php?title=Speciaal:VerwijzingenNaarHier/Sjabloon:Largethumb&limit=5000

They are in the pages; not in a template, and nlwp doesnt appear keen to subst: them.  Has anyone proposed that their default thumbnail be set to this value?

only four uses here:
https://nds-nl.wikipedia.org/wiki/Spesiaal:Verwiezingen_naor_disse_pagina/Mal:Largethumb
Comment 4 ssastry 2013-07-23 16:47:36 UTC
Ugh, I guess we have to get to this sooner than later then and find a reasonable solution.
Comment 5 John Mark Vandenberg 2013-07-24 00:58:26 UTC
Except for eight cases, the parser fails for nl.wp pages at http://parsoid.wmflabs.org:8001/topfails all contain '{{largethumb}}'.
Comment 6 John Mark Vandenberg 2013-07-24 03:18:09 UTC
~ 1 in 225 nlwp articles (0.45%) have this bug.  This bug affects the whole page.
Comment 7 ssastry 2013-07-24 03:29:52 UTC
There are two independent issues here, it appears.

1. Image attributes that are template-affected are not properly marked up during parse and hence doesn't rt back to its original form.  This is an issue independent of {{largethumb}} usage.  Once we fix this, even if the {{largethumb}} template doesn't parse correctly, it will RT correctly. This may be easier to fix.  Will create a separate bug report for it.

[subbu@earth tests] echo '[[File:Telephone exchange 1.svg|thumb|{{echo|261px}}]]' | node parse --wt2wt
[[File:Telephone exchange 1.svg|thumb|261px]]

2. Proper rendering of the image in VE depends on us correctly handling the largethumb template (and other such templated image options containing pipes).
Comment 8 Gerrit Notification Bot 2013-07-30 20:21:20 UTC
Change 76821 had a related patch set uploaded by Cscott:
Add new parserTests for image attributes coming from templates.

https://gerrit.wikimedia.org/r/76821
Comment 9 Gerrit Notification Bot 2013-07-30 21:05:14 UTC
Change 76821 merged by jenkins-bot:
Add new parserTests for image attributes coming from templates.

https://gerrit.wikimedia.org/r/76821
Comment 10 Gerrit Notification Bot 2014-02-05 22:31:27 UTC
Change 110095 had a related patch set uploaded by Subramanya Sastry:
WIP: Set up data-mw for all templated image attributes.

https://gerrit.wikimedia.org/r/110095
Comment 11 ssastry 2014-02-06 17:34:16 UTC
Here is the latest on this. 

https://gerrit.wikimedia.org/r/#/c/110095/ provides support for templated image options. It identifies individual image options that came from a template and provides VE sufficient information to make them editable. Parsoid can then serialize the edited attributes back to transclusions. See https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec#Transclusion-affected_attributes for how Parsoid conveys this information to clients (VE).

However, {{largethumb}} is still a special case. That gerrit patch also adds support for parsing largethumb as two image options. However, Parsoid cannot provide editing support for it, i.e. we cannot mark up both "thumb" and "261px" as coming from templates, that they are connected, and that VE cannot edit them individually. As you can see from the DOM spec (linked above), the spec is already fairly complex and we do not want to complicate the spec further to support this edge case. This is not just a complication for Parsoid, but for VE, and any other clients that need to deal with this spec.

That said, there are two options for how we can move forward.

1. Parsoid can mark images from {{largethumb}} uneditable so that VE doesn't attempt to deal with that template. This also means that the wikitext for these images will always be preserved and can only be edited in wikitext mode.

2. Parsoid makes the image editable. When the page is edited in VE, Parsoid preserves the image including {{largethumb}} transclusion when the image itself is not edited (even if other parts of the page are edited - which should be the vast majority of edits). But, if the image itself is edited, the {{largethumb}} template will effectively be subst-ed because Parsoid flattens out {{largethumb}} to regular options: "thumb" and "261px".

As of now, gerrit patch 110095 implements solution 1. as an initial step. However, in the long run, we feel solution 2 is a better solution that  provides editing support and also gradually moves towards reduced largethumb usage (by subst-ing largethumb images edited in VE). John Vandenburg has suggested in comment 3 (#c3 above) that if nlwiki set 261px as the default thumbnail size, editors could simply use |thumb instead of |{{largethumb}} and get the same effect. Not sure if that is viable, but collating all suggestions here in one place.

It would be good to get some input on these options.
Comment 12 Gerrit Notification Bot 2014-02-14 21:37:24 UTC
Change 110095 merged by jenkins-bot:
Set up data-mw for all templated image attributes.

https://gerrit.wikimedia.org/r/110095
Comment 13 John Mark Vandenberg 2014-02-14 23:54:30 UTC
This problem exists because nl.wp wants two thumb sizes. Why not add 'largethumb' as a core part of syntax? i.e. Has that approach been explored and rejected?
Comment 14 Andre Klapper 2014-04-25 06:44:40 UTC
All patches merged; resetting bug status

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links