Last modified: 2012-05-15 14:51:59 UTC
This is a fairly bizarre error: it only appears when querying 16 or more pages through the API. If a page name contains a certain UTF-8 character ("–", the en dash, a.k.a. E28093 in UTF-8 hex), the API mangles the title and then states that said mangled title doesn't exist. Here is the reproduction:
*Mangled result, 16 files being queried: http://commons.wikimedia.org/w/api.php?action=query&titles=File:Krizius%204.jpg|File:Abraszewski%20Bayview.jpg|File:Abraszewski%20flowers%20small.jpg|File:Abraszewski%20gray%20mansion%20small.jpg|File:Saasveld04.jpg|File:Oyama-jinja%20004.jpg|File:Sila%20o%20Tonga%20-%20Coat%20of%20arms%20of%20the%20Kingdom%20of%20Tonga.svg|File:Ft-Banks-1946-1953-C.pdf|File:Ounguicularis.jpg|File:Royal%20Dublin%20Fusileers.jpg|File:Edsim%20Vascular.jpg|File:Flag%20Dubrovnik%E2%80%93Neretva%20County.gif|File:1824%20laver%20coral.jpg|File:1928%20new%20chambers.jpg|File:1933%20Thicknesse%20w480.jpg|File:2-10%20Armoured%20Regt%20(AWM%20043801).jpg&prop=imageinfo|revisions|templates&iiprop=sha1
**Notice that at the top the API returns the result: <page ns="6" title="File:Flag Dubrovnik–Neretva County.gif" missing="" imagerepository="" />
**Thus the dash character has been mangled into mojibake
*Now, to get a non-mangled result, remove any one of the other files from the query above (so that only 15 files are being queried)
**Removing the last of the files from the list (File:...(AWM%20043801).jpg): http://commons.wikimedia.org/w/api.php?action=query&titles=File:Krizius%204.jpg|File:Abraszewski%20Bayview.jpg|File:Abraszewski%20flowers%20small.jpg|File:Abraszewski%20gray%20mansion%20small.jpg|File:Saasveld04.jpg|File:Oyama-jinja%20004.jpg|File:Sila%20o%20Tonga%20-%20Coat%20of%20arms%20of%20the%20Kingdom%20of%20Tonga.svg|File:Ft-Banks-1946-1953-C.pdf|File:Ounguicularis.jpg|File:Royal%20Dublin%20Fusileers.jpg|File:Edsim%20Vascular.jpg|File:Flag%20Dubrovnik%E2%80%93Neretva%20County.gif|File:1824%20laver%20coral.jpg|File:1928%20new%20chambers.jpg|File:1933%20Thicknesse%20w480.jpg&prop=imageinfo|revisions|templates&iiprop=sha1
**Removing the first of the files from the list (File:Krizius%204.jpg): http://commons.wikimedia.org/w/api.php?action=query&titles=File:Abraszewski%20Bayview.jpg|File:Abraszewski%20flowers%20small.jpg|File:Abraszewski%20gray%20mansion%20small.jpg|File:Saasveld04.jpg|File:Oyama-jinja%20004.jpg|File:Sila%20o%20Tonga%20-%20Coat%20of%20arms%20of%20the%20Kingdom%20of%20Tonga.svg|File:Ft-Banks-1946-1953-C.pdf|File:Ounguicularis.jpg|File:Royal%20Dublin%20Fusileers.jpg|File:Edsim%20Vascular.jpg|File:Flag%20Dubrovnik%E2%80%93Neretva%20County.gif|File:1824%20laver%20coral.jpg|File:1928%20new%20chambers.jpg|File:1933%20Thicknesse%20w480.jpg|File:2-10%20Armoured%20Regt%20(AWM%20043801).jpg&prop=imageinfo|revisions|templates&iiprop=sha1
**For both of the above queries, the API correctly returns the result: <page pageid="25721149" ns="6" title="File:Flag Dubrovnik–Neretva County.gif" imagerepository="local">...</page>
I have literally never encountered this error for any other file, and my bot has queried a LOT of files, so I don't know how many different UTF-8 characters the API will mangle.
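For anyone trying to follow the encoding side of this, here is a minimal Python sketch of how the en dash in that title maps to E2 80 93 and to the %E2%80%93 seen in the URLs. The last step shows what the title would look like if its UTF-8 bytes were decoded as Windows-1252; that is only an assumed illustration of mojibake in general, since the report does not establish what the API actually does to the bytes.

```python
from urllib.parse import quote

# The title from the report; \u2013 is U+2013 EN DASH.
title = "File:Flag Dubrovnik\u2013Neretva County.gif"

# UTF-8 bytes of the en dash, written as upper-case hex.
utf8_hex = "\u2013".encode("utf-8").hex().upper()

# Percent-encoded form of the title, keeping the namespace colon literal.
encoded = quote(title, safe=":")

# Illustration only (assumed failure mode, not confirmed by the report):
# the same UTF-8 bytes wrongly decoded as Windows-1252.
mojibake = title.encode("utf-8").decode("cp1252")

print(utf8_hex)
print(encoded)
print(mojibake)
```

Re-encoding the garbled string as Windows-1252 and decoding it as UTF-8 recovers the original title, which is a quick way to check whether a given mangled result is this kind of mojibake.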
... and the same issue is occurring with File:José de Ribera-St Sebastian.jpg (http://en.wikipedia.org/wiki/File:Jos%C3%A9_de_Ribera-St_Sebastian.jpg). Is there maybe a new feature that is bugging up English Wikipedia?
Maybe it depends on the length of the request? See bug 36839.
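A rough way to compare request lengths for the 15-title and 16-title cases is sketched below. It only builds the URLs and makes no network request; `build_query_url` is a hypothetical helper, and whether length is really the trigger is just the guess above.

```python
from urllib.parse import quote

API = "http://commons.wikimedia.org/w/api.php"

# The 16 titles from the original report, decoded.
TITLES = [
    "File:Krizius 4.jpg",
    "File:Abraszewski Bayview.jpg",
    "File:Abraszewski flowers small.jpg",
    "File:Abraszewski gray mansion small.jpg",
    "File:Saasveld04.jpg",
    "File:Oyama-jinja 004.jpg",
    "File:Sila o Tonga - Coat of arms of the Kingdom of Tonga.svg",
    "File:Ft-Banks-1946-1953-C.pdf",
    "File:Ounguicularis.jpg",
    "File:Royal Dublin Fusileers.jpg",
    "File:Edsim Vascular.jpg",
    "File:Flag Dubrovnik\u2013Neretva County.gif",
    "File:1824 laver coral.jpg",
    "File:1928 new chambers.jpg",
    "File:1933 Thicknesse w480.jpg",
    "File:2-10 Armoured Regt (AWM 043801).jpg",
]


def build_query_url(titles):
    """Hypothetical helper: build the query URL in the same shape as the report."""
    # Keep ':' and parentheses literal, as in the reported URLs.
    joined = "|".join(quote(t, safe=":()") for t in titles)
    return (
        f"{API}?action=query&titles={joined}"
        "&prop=imageinfo|revisions|templates&iiprop=sha1"
    )


url_16 = build_query_url(TITLES)       # the failing case
url_15 = build_query_url(TITLES[:-1])  # last file removed

print(len(TITLES), "titles:", len(url_16), "characters")
print(len(TITLES) - 1, "titles:", len(url_15), "characters")
```

If length were the cause, swapping in a longer or shorter 16th title while keeping the count at 16 should change the outcome; if the count is the cause, it should not.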
I know the dupe should really be the other way around, but it looks like everyone's attention is on the other bug, so I'll mark this one as duped instead. *** This bug has been marked as a duplicate of bug 36839 ***