Last modified: 2014-08-01 18:08:20 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T70989, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 68989 - Large generator results in XML give error instead of truncating
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: API
Version: 1.22.2
Hardware: All All
Importance: Lowest trivial
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2014-08-01 06:41 UTC by Robert Morley
Modified: 2014-08-01 18:08 UTC (History)
CC List: 5 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Robert Morley 2014-08-01 06:41:32 UTC
Generator results that are in excess of $wgAPIMaxResultSize will cause an error in XML rather than truncating. This seems to be occurring when the Page elements themselves cause the result size to be in excess of the maximum.

For example, if you set $wgAPIMaxResultSize to some arbitrarily low value for demonstration, like 100,000, and run a standard generator=allpages query, then instead of results you get:

<error code="internal_api_error_MWException" info="Exception Caught: Internal error in ApiFormatXml::recXmlPrint: (pages, ...) has integer keys without _element value. Use ApiResult::setIndexedTagName()." xml:space="preserve">

Interestingly, this occurs regardless of what properties you request, and only occurs with gaplimits over a certain value, regardless of the actual result size (though the gaplimit required to cause the error increases as the max result size does). Both XML and JSON will happily return result sets well above the designated result size (which could be considered another bug), but JSON always seems to return results successfully, where XML fails beyond a certain gaplimit.

Since this becomes nearly impossible to replicate with more normal values of $wgAPIMaxResultSize, I'm marking it as trivial.
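
For anyone scripting against the API, a client can detect this kind of failure by looking for an error element in the XML response. The following is only a minimal Python sketch: the RESPONSE string is a hypothetical body built from the error shown above, self-closed and wrapped in an assumed <api> root so that it parses, and find_api_error is an illustrative helper name, not part of MediaWiki.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body: the error element from the report, self-closed
# and wrapped in an assumed <api> root so that it parses as a full document.
RESPONSE = (
    '<api><error code="internal_api_error_MWException" '
    'info="Exception Caught: Internal error in ApiFormatXml::recXmlPrint: '
    '(pages, ...) has integer keys without _element value. '
    'Use ApiResult::setIndexedTagName()." xml:space="preserve" /></api>'
)


def find_api_error(xml_text):
    """Return (code, info) if the response has an <error> child, else None."""
    root = ET.fromstring(xml_text)
    error = root.find("error")
    if error is None:
        return None
    return error.get("code"), error.get("info")


error = find_api_error(RESPONSE)
if error is not None:
    code, info = error
    print("API error:", code)
    print(info)
```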
Comment 1 Robert Morley 2014-08-01 06:43:10 UTC
Whoops, ignore that first paragraph. I'd meant to rewrite it before submitting. The details later are more reflective of the actual behaviour.
Comment 2 Brad Jorsch 2014-08-01 15:01:51 UTC
The problem isn't that it's hitting the result size limit, it's that something is exiting early and not setting the metadata that the XML formatter requires.

Without information on the specific query you're seeing the problem with, it's difficult to debug this further. I haven't been able to reproduce it locally.

(In reply to Robert Morley from comment #0)
> Both XML and JSON will happily return result sets
> well above the designated result size (which could be considered another
> bug)

$wgAPIMaxResultSize isn't trying to limit the size of the entire response, it's trying to limit the amount of data being returned so someone can't cause issues by doing something like requesting rvprop=content for numerous enormous pages.
Comment 3 Robert Morley 2014-08-01 15:16:28 UTC
As a simple test, with $wgAPIMaxResultSize set to 100,000, I was using:

?action=query&generator=allpages&prop=info&gaplimit=5000

It gives the truncation warning and then errors out. Reducing the gaplimit to 2500, the query worked and gave the expected truncation warning, but no error. I'll e-mail you the actual server, but I don't want to post it publicly just so everybody doesn't go hitting it. It's a development server that's not really meant for everybody and their dog to be launching massive queries. :)
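
The two test requests can be rebuilt with a short Python sketch like the one below. The endpoint is a placeholder, since the real server is deliberately not posted here, and the explicit format=xml parameter is an addition that was not part of the query string above; build_allpages_query is just an illustrative helper name.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Placeholder endpoint: the actual development server is not public.
API_ENDPOINT = "https://wiki.example.org/w/api.php"


def build_allpages_query(gaplimit, fmt="xml"):
    """Build the generator=allpages test URL for a given gaplimit."""
    params = {
        "action": "query",
        "generator": "allpages",
        "prop": "info",
        "gaplimit": gaplimit,
        "format": fmt,  # added for clarity; not in the original query string
    }
    return API_ENDPOINT + "?" + urlencode(params)


failing_url = build_allpages_query(5000)  # reported to warn and then error out
working_url = build_allpages_query(2500)  # reported to warn without an error
print(failing_url)
print(working_url)
```

Passing fmt="json" gives the JSON variant of the same request for comparison.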
Comment 4 Brad Jorsch 2014-08-01 16:27:01 UTC
Ok, I see what's going on here. The 'pages' array is supposed to be set up by ApiQuery, but your too-small value for $wgAPIMaxResultSize is preventing that. Then prop=info adds some entries to the 'pages' array, assuming it was already set up.

Unless you've set $wgMaxArticleSize to a very low value, your value for $wgAPIMaxResultSize is lower than is even supported. But it would probably be good for the API to actually check it.
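
The sequence described above can be illustrated with a toy model in Python. This is not MediaWiki's code and not the patch itself: ToyResult, run_query, the per-page cost of 100 and the check_setup flag are all made up for the sketch. It only shows how a size-limited setup step that fails silently, followed by a module that assumes the setup happened, leaves the XML formatter without the element name it needs.

```python
class ToyResult:
    """Toy stand-in for a size-limited API result (hypothetical, simplified)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.size = 0
        self.pages = {}           # integer page ID -> dict of properties
        self.element_name = None  # XML tag name for each entry in pages

    def set_up_pages(self, page_ids):
        """Create the pages array and its element name, if it fits."""
        cost = 100 * len(page_ids)  # made-up cost per page
        if self.size + cost > self.max_size:
            return False  # nothing added, element name stays unset
        self.size += cost
        self.pages = {pid: {} for pid in page_ids}
        self.element_name = "page"
        return True

    def add_info(self, page_ids):
        """Add per-page entries, assuming set_up_pages() already succeeded."""
        for pid in page_ids:
            self.pages.setdefault(pid, {})["info"] = True


def format_xml(result):
    """Render pages as XML; integer keys need an element name."""
    if result.pages and result.element_name is None:
        raise ValueError("pages has integer keys without an element name")
    name = result.element_name
    return "".join(f'<{name} pageid="{pid}" />' for pid in sorted(result.pages))


def run_query(page_ids, max_size, check_setup):
    result = ToyResult(max_size)
    ok = result.set_up_pages(page_ids)
    if check_setup and not ok:
        # One possible way to handle the failure: stop before later modules run.
        return "<warning>result truncated</warning>"
    result.add_info(page_ids)
    return format_xml(result)


print(run_query([1, 2], 1000, check_setup=False))
print(run_query(list(range(50)), 1000, check_setup=True))
```

With check_setup=False and 50 pages, run_query raises the ValueError, which mirrors the formatter error in the report. With the check enabled, the failure is caught before any later step touches the 'pages' array.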
Comment 5 Gerrit Notification Bot 2014-08-01 16:27:27 UTC
Change 151103 had a related patch set uploaded by Anomie:
Check for result size failure in ApiQuery

https://gerrit.wikimedia.org/r/151103
Comment 6 Robert Morley 2014-08-01 17:31:59 UTC
(In reply to Brad Jorsch from comment #4)
Thanks for looking into it. I figured the low value of $wgAPIMaxResultSize was coming into play, as the maximum gaplimit value increased as the max size did. I had noticed that the recommended lower limit was related to $wgMaxArticleSize, but I didn't realize that it would have any significant effect on a query; I assumed that it had more to do with being able to return at least one revision when requested.

This is what I get for trying to test my own generator module...I make everything fail! ;) At least now I know that it's nothing I have to worry about in normal usage. Thanks again!
Comment 7 Gerrit Notification Bot 2014-08-01 17:42:36 UTC
Change 151103 merged by jenkins-bot:
Check for result size failure in ApiQuery

https://gerrit.wikimedia.org/r/151103
Comment 8 Brad Jorsch 2014-08-01 18:08:20 UTC
This is now fixed in the current git master.
