Last modified: 2013-04-22 16:15:38 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T38400, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 36400 - Continue may skip or repeat entries for iwlinks or langlinks
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: API
Version: 1.20.x
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Brad Jorsch
Keywords: schema-change
Duplicates: 46136
Depends on:
Blocks:
Reported: 2012-05-02 02:53 UTC by Brad Jorsch
Modified: 2013-04-22 16:15 UTC (History)

Attachments
Patch to change the ORDER BY in the queries to match that expected by the continue handling (1.83 KB, patch)
2012-05-02 02:53 UTC, Brad Jorsch

Description Brad Jorsch 2012-05-02 02:53:06 UTC
Created attachment 10495
Patch to change the ORDER BY in the queries to match that expected by the continue handling

For iwlinks, the continue parameter implies ORDER BY iwl_from, iwl_prefix, iwl_title. However, this does not match the actual ordering in two of the three cases:

1. When iwprefix is not given, the query is ORDER BY iwl_from, iwl_prefix (or ORDER BY iwl_prefix when iwl_from is constant). iwl_title is not sorted, so if the database happens to return three titles in the order A, C, B, then with iwlimit=1 it returns A and sets continue to C. When continuing, it skips B because C > B.

2. When iwprefix is given but iwtitle is not, the query is ORDER BY iwl_title, iwl_from, which is backwards (iwl_prefix need not be included, as it is constant). The problem is easy to see: just create pages A and B where B has an interwiki link that sorts between two interwiki links in A. https://en.wikipedia.org/w/api.php?format=jsonfm&action=query&prop=iwlinks&titles=A|C&iwprefix=wikisource&iwlimit=1 illustrates this at the moment.

3. When both iwprefix and iwtitle are specified, the query is ORDER BY iwl_from. This is fine because iwl_prefix and iwl_title are constant in the query.
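
Case 1 can be reproduced with a small simulation. This is only a sketch, not the API's actual code: the in-memory ROWS list and the fetch_page and fetch_all helpers are hypothetical stand-ins, and the continue value is modelled as a plain (iwl_from, iwl_prefix, iwl_title) tuple.

```python
# Hypothetical stand-in for the iwlinks table: (iwl_from, iwl_prefix, iwl_title).
# The list order models what the database happens to return when iwl_title
# is not part of the ORDER BY: titles A, C, B.
ROWS = [
    (1, "de", "A"),
    (1, "de", "C"),
    (1, "de", "B"),
]


def fetch_page(rows, limit, cont=None):
    """Return (page, next_continue) for one request."""
    if cont is not None:
        # The continue handling assumes rows are sorted by the full tuple,
        # so it keeps only rows that compare >= the continue value.
        rows = [r for r in rows if r >= cont]
    page = rows[:limit]
    next_cont = rows[limit] if len(rows) > limit else None
    return page, next_cont


def fetch_all(rows, limit):
    """Follow continue values until the result set is exhausted."""
    seen, cont = [], None
    while True:
        page, cont = fetch_page(rows, limit, cont)
        seen.extend(page)
        if cont is None:
            return seen


titles_seen = [title for _, _, title in fetch_all(ROWS, limit=1)]
print(titles_seen)  # B never shows up
```

The first request returns A and sets continue to C; the second request filters on >= C, so B is dropped.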


The langlinks module has a similar structure, but here case 1 is not a problem because (ll_from, ll_lang) is a unique key. Case 2 *is* still an issue.


The simple fix would be to change the queries so the ordering matches that implied by the continue parameter in all cases (see patch). I don't know whether this will make MySQL use a filesort, however.
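
A minimal sketch of why matching the ordering fixes paging, assuming the continue value is the full (iwl_from, iwl_prefix, iwl_title) tuple. The sample rows and the fetch_page and fetch_all helpers are hypothetical and only model the behaviour; they are not the code in the patch.

```python
# Hypothetical sample rows: (iwl_from, iwl_prefix, iwl_title).
ROWS = [
    (1, "wikisource", "Beta"),
    (2, "wikisource", "Alpha"),
    (1, "wikisource", "Gamma"),
]


def fetch_page(rows, limit, cont=None):
    """One request, with rows sorted the same way the continue value compares."""
    ordered = sorted(rows)  # models ORDER BY iwl_from, iwl_prefix, iwl_title
    if cont is not None:
        ordered = [r for r in ordered if r >= cont]
    page = ordered[:limit]
    next_cont = ordered[limit] if len(ordered) > limit else None
    return page, next_cont


def fetch_all(rows, limit):
    """Follow continue values until the result set is exhausted."""
    seen, cont = [], None
    while True:
        page, cont = fetch_page(rows, limit, cont)
        seen.extend(page)
        if cont is None:
            return seen


result = fetch_all(ROWS, limit=1)
print(result)
```

Because the sort key and the continue comparison agree, every row is returned exactly once, grouped by page.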
Comment 1 Sumana Harihareswara 2012-05-07 01:44:07 UTC
Thanks for the patch, Brad.  It'll get reviewed faster if you use Developer access

https://www.mediawiki.org/wiki/Developer_access

to put it directly into the Git source code repository:

https://www.mediawiki.org/wiki/Git/Workflow
Comment 2 Brad Jorsch 2012-05-07 15:00:41 UTC
(In reply to comment #1)
> Thanks for the patch, Brad.  It'll get reviewed faster if you use Developer
> access

I wasn't sure if I should do that with a patch I wasn't sure was right (re the potential filesort issue). I'll do that in the future.
Comment 3 Max Semenik 2012-05-10 16:27:13 UTC
The first part of your patch makes the query unindexed:
* Was: iwl_prefix = const, ORDER BY iwl_title, iwl_from - covered by index iwl_prefix_title_from
* Your patch: iwl_prefix = const, ORDER BY iwl_from, iwl_title - no suitable index.
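
The difference can be illustrated with SQLite's EXPLAIN QUERY PLAN. This is a rough analogy only: SQLite's planner is not MySQL's, the table definition is a simplified assumption rather than the real schema, and the plan helper is hypothetical. SQLite reports "USE TEMP B-TREE FOR ORDER BY" where MySQL would report a filesort.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema: only the three columns discussed here.
conn.executescript("""
    CREATE TABLE iwlinks (
        iwl_from   INTEGER NOT NULL,
        iwl_prefix TEXT    NOT NULL,
        iwl_title  TEXT    NOT NULL
    );
    CREATE INDEX iwl_prefix_title_from
        ON iwlinks (iwl_prefix, iwl_title, iwl_from);
""")


def plan(order_by):
    """Return SQLite's query plan for the prefix query with the given ORDER BY."""
    sql = (
        "EXPLAIN QUERY PLAN SELECT iwl_from, iwl_title FROM iwlinks "
        "WHERE iwl_prefix = 'wikisource' ORDER BY " + order_by
    )
    # The human-readable plan text is the fourth column of each row.
    return " | ".join(row[3] for row in conn.execute(sql))


old_plan = plan("iwl_title, iwl_from")  # ordering before the patch
new_plan = plan("iwl_from, iwl_title")  # ordering in the patch
print(old_plan)
print(new_plan)
```

With only iwl_prefix_title_from available, the patched ordering needs an extra sort step.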
Comment 4 Max Semenik 2012-05-10 16:43:02 UTC
Note: although the number of items with the same iwl_prefix isn't very high (in the hundreds), due to the size of every element filesort is still possible. CCing Asher who has the final word in the field of DB performance.
Comment 5 Brad Jorsch 2012-05-10 20:44:07 UTC
(In reply to comment #3)
> The first part of your patch makes the query unindexed:
> * Was: iwl_prefix = const, ORDER BY iwl_title, iwl_from - covered by index
> iwl_prefix_title_from
> * Your patch: iwl_prefix = const, ORDER BY iwl_from, iwl_title - no suitable
> index.

I was afraid of something like that.
Comment 6 Asher Feldman 2012-05-10 23:51:00 UTC
The langlink portion looks desirable.
Comment 7 Brad Jorsch 2012-08-03 15:42:39 UTC
(In reply to comment #3)
> The first part of your patch makes the query unindexed:
> * Was: iwl_prefix = const, ORDER BY iwl_title, iwl_from - covered by index
> iwl_prefix_title_from
> * Your patch: iwl_prefix = const, ORDER BY iwl_from, iwl_title - no suitable
> index.

I forgot to update the bug until now, but I talked to Roan at the Berlin Hackathon two months ago and we concluded that the query in my patch is correct, and that we should have an iwl_prefix_from_title index to cover it (keeping iwl_prefix_title_from, since ApiQueryIWBacklinks uses it).
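
As a rough check of the two-index idea, here is a sketch using SQLite rather than MySQL, so the planner behaviour is only an analogy. The table definition is a simplified assumption, the column order of iwl_prefix_from_title is inferred from its name, and the plan helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema with both the existing and the proposed index.
conn.executescript("""
    CREATE TABLE iwlinks (
        iwl_from   INTEGER NOT NULL,
        iwl_prefix TEXT    NOT NULL,
        iwl_title  TEXT    NOT NULL
    );
    CREATE INDEX iwl_prefix_title_from
        ON iwlinks (iwl_prefix, iwl_title, iwl_from);
    CREATE INDEX iwl_prefix_from_title
        ON iwlinks (iwl_prefix, iwl_from, iwl_title);
""")


def plan(order_by):
    """Return SQLite's query plan for the prefix query with the given ORDER BY."""
    sql = (
        "EXPLAIN QUERY PLAN SELECT iwl_from, iwl_title FROM iwlinks "
        "WHERE iwl_prefix = 'wikisource' ORDER BY " + order_by
    )
    # The human-readable plan text is the fourth column of each row.
    return " | ".join(row[3] for row in conn.execute(sql))


backlinks_plan = plan("iwl_title, iwl_from")  # ordering that uses the existing index
iwlinks_plan = plan("iwl_from, iwl_title")    # ordering from the patch
print(backlinks_plan)
print(iwlinks_plan)
```

With both indexes in place, neither ordering should need a separate sort step in SQLite.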

The other possibility we considered was changing the processing of the continue parameter so ORDER BY iwl_title, iwl_from would be correct for that case. The problem there is inconsistency: if the module needs to return (prefix,title,from) triplets of ('de','foo',1), ('de','bar',1), and ('de','bar',2), and the API limit is 1, it would return page_id 1 with link de:bar, then page_id 2 with link de:bar, then page_id 1 again with link de:foo. All other API modules that return results by page like this would return page_id 1 with link de:bar, then page_id 1 with link de:foo, and only then page_id 2. Some clients may depend on this behavior.
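
The two orderings for those three triplets can be compared directly. This is a small illustrative sketch; the describe helper is hypothetical, and plain Python sorting stands in for the database's ORDER BY.

```python
# The three (prefix, title, from) triplets from the example above.
links = [("de", "foo", 1), ("de", "bar", 1), ("de", "bar", 2)]

# Alternative considered: ORDER BY iwl_title, iwl_from.
by_title_then_page = sorted(links, key=lambda l: (l[1], l[2]))
# Ordering used by the patch: ORDER BY iwl_from, iwl_title.
by_page_then_title = sorted(links, key=lambda l: (l[2], l[1]))


def describe(rows):
    """Render each row the way a client paging with limit 1 would see it."""
    return [f"page_id {page}: {prefix}:{title}" for prefix, title, page in rows]


print(describe(by_title_then_page))
print(describe(by_page_then_title))
```

The first ordering visits page_id 1, then 2, then 1 again; the second finishes page_id 1 before moving on to page_id 2.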
Comment 8 Brad Jorsch 2013-01-11 01:33:50 UTC
I finally got around to putting this in Gerrit:
Gerrit change #43389
Comment 9 Brad Jorsch 2013-03-14 19:53:59 UTC
*** Bug 46136 has been marked as a duplicate of this bug. ***
Comment 10 Brad Jorsch 2013-04-04 13:44:29 UTC
Change merged. It should be deployed to WMF wikis with 1.22wmf2, see https://www.mediawiki.org/wiki/MediaWiki_1.22/Roadmap for the schedule.
