Last modified: 2011-11-09 17:24:58 UTC
Compare
https://de.wikipedia.org/w/api.php?action=query&list=search&srsearch=wikiproject+spam
and
https://de.wikipedia.org/w/index.php?title=Spezial%3ASuche&search=wikiproject+spam&go=Go

The API search returns zero results (while promising three), whereas the GUI search returns two. There is also a spurious sroffset query continuation (not sure whether this is a separate bug).

I'm guessing this is a WMF-specific issue, as it works exactly as expected on en.wp:
https://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=wikiproject+spam
https://en.wikipedia.org/w/index.php?title=Special%3ASearch&search=wikiproject+spam&go=Go
Damned odd, but I can confirm the results. The API is giving:

  <?xml version="1.0"?>
  <api>
    <query>
      <searchinfo totalhits="3" suggestion="wiki project spam" />
      <search />
    </query>
    <query-continue>
      <search sroffset="10" />
    </query-continue>
  </api>

whereas the search UI page returns these two results when searching namespace 0:

* https://de.wikipedia.org/wiki/OpenStreetMap
* https://de.wikipedia.org/wiki/Kritik_an_Wikipedia

Maybe there is also a bogus result in there, and the UI and API search front-ends are handling it differently (the UI by skipping one item, the API by breaking on all of them)?
Probably needs some lower-level debugging to see the actual results from the Lucene server. The actual looping mechanisms in ApiQuerySearch and SpecialSearch look similar (a while loop around SearchResultSet::next()) and I _think_ they shouldn't tromp on each other.
curl http://search6:8123/search/dewiki/wikiproject+spam?version=2

  3
  0.03831976 0 Benutzerin_Diskussion%3AMrsMyer%2FArchiv%2F2008
  0.02220609 0 OpenStreetMap
  0.018330451 0 Kritik_an_Wikipedia

(The format above is: score namespace pagename.)

Thus it seems the first result is a false positive, since it is from the wrong namespace. The indexed page is at its latest version (2011-10-23T00:46:30Z) but has wrongly been put into the main namespace. I believe this is because the Lucene backend doesn't know about feminine namespace names, as they are not listed in the XML dump header files.
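To make the "score namespace pagename" reading concrete, here is a minimal sketch that parses a response shaped like the one above and flags hits that claim namespace 0 while carrying a namespace-like prefix. The layout (hit count on the first line, one space-separated hit per line, URL-encoded title) is inferred from this single sample only, and `KNOWN_PREFIXES`, `parse_hits` and `misfiled` are hypothetical names for illustration, not part of the Lucene backend or MediaWiki.

```python
from urllib.parse import unquote

# Sample body copied from the curl output in this thread.
SAMPLE = """3
0.03831976 0 Benutzerin_Diskussion%3AMrsMyer%2FArchiv%2F2008
0.02220609 0 OpenStreetMap
0.018330451 0 Kritik_an_Wikipedia"""

# Assumption: prefixes the indexer may not know about. Only the one seen
# in this thread is listed; a real check would use the wiki's alias list.
KNOWN_PREFIXES = ("Benutzerin Diskussion",)


def parse_hits(body):
    """Return (total, hits), where each hit is (score, namespace, title)."""
    lines = body.strip().splitlines()
    total = int(lines[0])
    hits = []
    for line in lines[1:]:
        score, namespace, title = line.split(" ", 2)
        hits.append((float(score), int(namespace), unquote(title)))
    return total, hits


def misfiled(hits):
    """Hits filed under namespace 0 whose title starts with a known prefix."""
    suspects = []
    for score, namespace, title in hits:
        prefix = title.split(":", 1)[0].replace("_", " ")
        if namespace == 0 and ":" in title and prefix in KNOWN_PREFIXES:
            suspects.append(title)
    return suspects


total, hits = parse_hits(SAMPLE)
suspects = misfiled(hits)
```

On the sample, only the user talk archive page is flagged; the other two hits are ordinary main-namespace titles.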
I think the immediate cause for the different search results is line 132 in
http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/api/ApiQuerySearch.php?view=markup

$result should be advanced before continuing, otherwise the API will just loop $limit times in vain.

Add

  $result = $matches->next();

before that "continue" and the API and Special:Search should again return the same results.
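The effect of a "continue" that does not advance the result can be modelled in a few lines. This is a simplified sketch in Python, not the ApiQuerySearch code: `ResultSet`, `collect` and the `skip` flag are hypothetical stand-ins, and the assumption is that the first hit (the misfiled user talk page) is the one that trips the skip condition.

```python
class ResultSet:
    """Hypothetical stand-in for an iterator with a next() method."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def next(self):
        # Return the next item, or None once the set is exhausted.
        if self._pos >= len(self._items):
            return None
        item = self._items[self._pos]
        self._pos += 1
        return item


def collect(matches, limit, advance_before_continue):
    """Return (titles shown, whether a continuation was emitted)."""
    shown = []
    count = 0
    result = matches.next()
    while result is not None:
        count += 1
        if count > limit:
            # Looks like "more results available", so signal a continuation.
            return shown, True
        if result["skip"]:
            if advance_before_continue:
                result = matches.next()  # the proposed one-line fix
            continue  # without the fix, the same result is seen again
        shown.append(result["title"])
        result = matches.next()
    return shown, False


HITS = [
    {"title": "Benutzerin_Diskussion:MrsMyer/Archiv/2008", "skip": True},
    {"title": "OpenStreetMap", "skip": False},
    {"title": "Kritik_an_Wikipedia", "skip": False},
]

buggy = collect(ResultSet(HITS), limit=10, advance_before_continue=False)
fixed = collect(ResultSet(HITS), limit=10, advance_before_continue=True)
```

In this model the unfixed loop spins on the first hit until the counter passes the limit, which yields both the empty result list and a continuation, consistent with the empty <search /> element and the spurious sroffset="10" reported earlier. With the extra next() call, the two remaining titles come back and no continuation is emitted.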
(In reply to comment #4)
> I think the immediate cause for the different search results is line 132 in
> http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/api/ApiQuerySearch.php?view=markup
> : $result should be advanced before continuing, otherwise the API will just
> loop $limit times in vain.
>
> Add
>
> $result = $matches->next();
>
> before that "continue" and the API and Special:Search should again return the
> same results.

Well spotted. Fixed in r102537, deployment underway.
(In reply to comment #5)
> Well spotted. Fixed in r102537, deployment underway.

Deployed now, and https://de.wikipedia.org/w/api.php?action=query&list=search&srsearch=wikiproject+spam works (other than the fact that totalhits still reports 3, because the user talk page with the feminine namespace is counted but not shown). Yay!