Last modified: 2014-07-27 21:30:01 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T69131, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 67131 - Results order when using generator=search
Status: RESOLVED DUPLICATE of bug 14859
Product: MediaWiki
Classification: Unclassified
Component: API
Version: unspecified
Hardware: All All
Importance: Lowest enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Duplicates: 68515
Depends on:
Blocks:
Reported: 2014-06-26 13:28 UTC by Artur Bekasov
Modified: 2014-07-27 21:30 UTC (History)

Description Artur Bekasov 2014-06-26 13:28:38 UTC
I am trying to use search as a generator and get a summary of every article found, effectively replicating the functionality of the Special:Search page. This is what I am doing:

http://en.wikipedia.org/w/api.php?action=query&generator=search&gsrsearch=vector%20space&prop=extracts&exintro&exlimit=10&exsentences=1&explaintext

It works nicely, apart from one thing: it appears that the results are sorted by title. It's a shame, because Lucene does a decent job at ranking the results, as you can see if you just return a list of results:

http://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=vector%20space&srprop=

So the only workaround I've come up with is to make two requests: query with list=search to get the search results, and then query with prop=extracts to get summaries of the titles found previously. It seems to work, but you probably understand that it's not a very reliable/efficient/beautiful solution.
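
For illustration, the two-request workaround could be sketched in Python roughly as follows. The search and extract parameters are taken from the URLs above; `format=json`, the `titles` parameter, the assumed JSON response layout (`query.search` and `query.pages`), and all function names are assumptions added for this sketch, not part of the original report.

```python
from urllib.parse import urlencode

API_URL = "http://en.wikipedia.org/w/api.php"  # endpoint used in the report


def search_url(query):
    """URL for request 1: the ranked result list (list=search)."""
    params = {"action": "query", "list": "search", "srsearch": query,
              "srprop": "", "format": "json"}  # format=json is an assumption
    return API_URL + "?" + urlencode(params)


def extracts_url(titles):
    """URL for request 2: one-sentence intro extracts for the titles found."""
    params = {"action": "query", "prop": "extracts", "exintro": "",
              "exlimit": 10, "exsentences": 1, "explaintext": "",
              "titles": "|".join(titles), "format": "json"}
    return API_URL + "?" + urlencode(params)


def in_search_order(search_response, extracts_response):
    """Pair each ranked title with its extract, keeping the search order.

    Assumes the decoded JSON has the shapes query.search[*].title and
    query.pages[<pageid>] with "title" and "extract" keys.
    """
    by_title = {page["title"]: page.get("extract", "")
                for page in extracts_response["query"]["pages"].values()}
    return [(hit["title"], by_title.get(hit["title"], ""))
            for hit in search_response["query"]["search"]]
```

Fetching and decoding the two responses (for example with `urllib.request.urlopen` and `json.load`) is left out so that the sketch stays offline; the ranked order comes only from the first response, and the second response is treated as an unordered lookup table.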

In a comment on another bug (https://bugzilla.wikimedia.org/show_bug.cgi?id=14859#c1) it has been explained that respecting the original order of titles is not feasible/desirable. Generators are basically a different way of providing a list of titles, so I can see why you might not be keen on implementing that. However, I am still keen on opening this ticket, for two reasons:
  1. It seems like quite a basic use case.
  2. The problem makes generator=search useless for most people.

I am using the latest API version as installed on the English Wikipedia.
Comment 1 Brad Jorsch 2014-06-26 15:06:03 UTC
As you already noted, the reasons provided in bug 14859 apply here too. This is a duplicate of that bug.


(In reply to Artur Bekasov from comment #0)
>   2. The problem makes generator=search useless for most people.

{{citation needed}} on "most". Especially with the improvements Cirrus is bringing, I see opportunity for plenty of use cases where people would want to use generator=search to find the unordered set of pages matching some query.

*** This bug has been marked as a duplicate of bug 14859 ***
Comment 2 Jörn Hees 2014-07-27 20:59:58 UTC
*** Bug 68515 has been marked as a duplicate of this bug. ***
Comment 3 Jörn Hees 2014-07-27 21:30:01 UTC
I actually disagree that this is simply a duplicate of bug 14859.
In bug 14859 the OP already knows the order of the titles being queried. Here, search internally returns the order to the generator, but that information is destroyed and never given to the end user, so we need two queries instead of just one.

I don't see how not destroying that ordering information and returning it as an optional index list would have any significant performance impact.

If one is concerned about re-ordering the up to 500 "pages" results, one could even leave them in the order they are in at the moment [1], because with that optional index list from the generator input function one could re-sort the results on the client side, as mentioned in bug 14859. Additionally, generator=search would not need 2 individual queries.

@Brad: Would be nice if you could reconsider your decision on this one and maybe reopen it.


[1]: (they are actually a result dict, not a list, so it's probably not wise to rely on their order anyhow)
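
As a rough illustration of the client-side re-sort proposed in this comment, here is a minimal Python sketch. The `index_list` argument is purely hypothetical: it stands for the optional index list the comment asks for, which the API discussed here does not return. The function name and the assumption that each page record carries a `pageid` key are also illustrative.

```python
def resort_pages(pages, index_list):
    """Return the page records in the order given by index_list.

    pages: the "pages" mapping of a result, keyed by page ID.
    index_list: HYPOTHETICAL list of page IDs in search-rank order.
    """
    rank = {str(page_id): position
            for position, page_id in enumerate(index_list)}
    # sorted() is stable, so pages missing from the index list go last
    # and keep their original relative order.
    return sorted(pages.values(),
                  key=lambda page: rank.get(str(page["pageid"]), len(rank)))
```

Because the ordering is carried entirely by the separate index list, the sketch does not rely on the order of the "pages" dict itself, which matches the caveat in footnote [1].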
