Last modified: 2013-11-22 01:05:58 UTC
While clicking on images to test the viewer, I ran into one that really broke the viewing experience. It turned out that the metadata of this image is unusual.

The image was found on: https://es.wikipedia.org/wiki/Wikipedia:Portada
The image is: https://es.wikipedia.org/wiki/Archivo:Austerlitz-baron-Pascal.jpg

The HTML for the element is below.

<tr valign="top">
<th colspan="4" style="background-color:#e0e0ee; font-weight:bold; border:1px solid #aaa;"><span style="float: right; font-weight: normal; text-align: right; width: 6em;">[<a id="collapseButton0">ocultar</a>]</span><span class="fn" id="creator"><bdi><a href="//es.wikipedia.org/wiki/Fran%C3%A7ois_G%C3%A9rard" class="extiw" title="es:François Gérard">François Gérard</a></bdi></span> (1770–1837) <span class="wpImageAnnotatorControl wpImageAnnotatorOff"><a href="//commons.wikimedia.org/wiki/Creator:Fran%C3%A7ois_Pascal_Simon_G%C3%A9rard" title="Retroenlace a la plantilla de Ficha de Creador"><img alt="Retroenlace a la plantilla de Ficha de Creador" src="//upload.wikimedia.org/wikipedia/commons/thumb/7/73/Blue_pencil.svg/30px-Blue_pencil.svg.png" width="15" height="15" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/7/73/Blue_pencil.svg/23px-Blue_pencil.svg.png 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/7/73/Blue_pencil.svg/30px-Blue_pencil.svg.png 2x"></a></span> <span class="wpImageAnnotatorControl wpImageAnnotatorOff"><a href="//www.wikidata.org/wiki/Q163543" title="wikidata:Q163543"><img alt="wikidata:Q163543" src="//upload.wikimedia.org/wikipedia/commons/thumb/f/ff/Wikidata-logo.svg/40px-Wikidata-logo.svg.png" width="20" height="11" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/f/ff/Wikidata-logo.svg/30px-Wikidata-logo.svg.png 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/f/ff/Wikidata-logo.svg/40px-Wikidata-logo.svg.png 2x"></a></span></th>
</tr>
So, first, this is an {{Artwork}} template, not {{Information}} like we're used to. We should probably look into stripping the HTML or handling {{Artwork}} and other templates in CommonsMetadata (CMD).
Created attachment 13868 [details] Screen shot showing the problem
I'll try to handle this in CommonsMetadata. Lots of complex stuff can be in {{Information}} fields, see for example the source fields for these:

https://www.mediawiki.org/wiki/File:Stroop_Report_-_Warsaw_Ghetto_Uprising_06b.jpg
https://commons.wikimedia.org/wiki/File:BudapestMontage2..jpg

As a first approximation:
* if a field contains a table and some normal text, discard the table
* otherwise, look for microformats in the table ({{Artist}} uses a hcard for example)
* otherwise, dunno. Maybe just leave it empty?
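
To make the three rules above concrete, here is a rough sketch in Python. It is only an illustration, not the CommonsMetadata code (the extension itself is PHP). The names `FieldScanner` and `simplify_field` are made up for this example, and treating `class="fn"` as the hCard name is an assumption based on the HTML in the first comment. It also strips all tags from the text it keeps.

```python
from html.parser import HTMLParser

# Elements that never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class FieldScanner(HTMLParser):
    """Collects text outside tables and text inside hCard "fn" elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.table_depth = 0    # > 0 while inside a <table>
        self.fn_depth = 0       # > 0 while inside an element with class "fn"
        self.outside_text = []  # text that is not inside any table
        self.fn_text = []       # text inside a class="fn" element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if tag == "table":
            self.table_depth += 1
        if self.fn_depth:
            self.fn_depth += 1
        elif "fn" in (dict(attrs).get("class") or "").split():
            self.fn_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag == "table" and self.table_depth:
            self.table_depth -= 1
        if self.fn_depth:
            self.fn_depth -= 1

    def handle_data(self, data):
        if self.fn_depth:
            self.fn_text.append(data)
        if not self.table_depth:
            self.outside_text.append(data)


def simplify_field(html):
    """Apply the three rules to one metadata field (hypothetical helper)."""
    scanner = FieldScanner()
    scanner.feed(html)
    scanner.close()
    outside = " ".join("".join(scanner.outside_text).split())
    if outside:
        # Rule 1: normal text is present, so any table is discarded.
        return outside
    # Rule 2: fall back to the hCard name; rule 3: empty string if none.
    return " ".join("".join(scanner.fn_text).split())
```

For a field that is only a table wrapping the creator markup from the first comment, this would return just the name from the `fn` span and drop the dates and edit icons. Well-formed HTML is assumed; unbalanced tags would throw the depth counters off.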
Another data point, as I've seen a few of these and there may be several formats that need to be considered: https://es.wikipedia.org/wiki/Archivo:Frida_Kahlo_Diego_Rivera_1932.jpg
Created attachment 13874 [details] Another broken picture.
And another one, hope this is useful. https://af.wikipedia.org/wiki/L%C3%AAer:Mimas_Cassini.jpg