Last modified: 2011-09-15 12:45:20 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T32904, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 30904 - SphinxSearch 0.8+; $wgEnableSphinxPrefixSearch and the quality of suggested terms
Status: RESOLVED WONTFIX
Product: MediaWiki extensions
Classification: Unclassified
Component: SphinxSearch
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Assigned To: Svemir Brkic
Reported: 2011-09-14 19:47 UTC by MWJames
Modified: 2011-09-15 12:45 UTC (History)

Description MWJames 2011-09-14 19:47:59 UTC
When both $wgEnableMWSuggest = true; and $wgEnableSphinxPrefixSearch = true; are set, search suggestions are displayed as expected, but as soon as one uses a prefix like Template: or Mediawiki:, the suggestions stop once ":" is entered into the search string.

It seems that only the term that follows ":" is recognized. For example, for "Template:Type", SphinxSearch only searches with @page_title:^Template* but it should search for @page_title:^Template:Type* instead.
Comment 1 Svemir Brkic 2011-09-15 03:02:54 UTC
Works as expected for me in MW 1.18 - while I type "Templ..." it is only showing suggestions from the main namespace, such as "Temple". Once I type "Template:" it shows nothing because at this point it is considering that as a namespace filter without any specific query. As soon as I get to "Template:t" it suggests "Template:Test" for example.

Perhaps some other options are getting in the way? I am using Vector skin, but there are issues with "Vector" extension. If you are not including Vector extension, what version of MW are you on?
Comment 2 MWJames 2011-09-15 07:56:57 UTC
We deactivated the Extension:Vector in our testing environment. 

[1] For term Mediawiki the search works with @page_title: ^Mediawik* 

[2] For term Mediawiki:Caven the search is executed with @page_title: ^Caven* 

[3] But should the search not include @page_title: ^Mediawiki:Caven*?

The SphinxSearch command shows the following output for the search term Mediawiki:Cavendish:

[Thu Sep 15 16:49:52.941 2011] 0.002 sec [ext2/2/ext 0 (0,10)] [*] @page_title: ^Medi*
[Thu Sep 15 16:49:54.596 2011] 0.002 sec [ext2/2/ext 0 (0,10)] [*] @page_title: ^Mediawik*
[Thu Sep 15 16:50:06.540 2011] 0.002 sec [ext2/2/ext 0 (0,10)] [*] @page_title: ^Caven*
[Thu Sep 15 16:50:10.721 2011] 0.037 sec [ext2/2/ext 2 (0,10)] [*] @page_title: ^Cavendish*
Comment 3 MWJames 2011-09-15 08:03:54 UTC
Just to complete the test cycle.

When the term Mediawiki:Cavendish is searched directly in Special:Search, the sphinx control output shows:

[Thu Sep 15 16:58:42.250 2011] 0.002 sec [ext2/1/ext 26 (0,20)] [*] Cavendish
[Thu Sep 15 16:59:09.138 2011] 0.002 sec [ext2/2/ext 2 (0,10)] [*] @page_title: ^Cavendish*

This means that the namespace Mediawiki: is not present at all in the Sphinx search string.
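
For reading log entries like the ones above, here is a minimal Python sketch. It assumes the lines follow Sphinx's plain query-log layout, `[date] time sec [match-mode/filters-count/sort-mode total-matches (offset,limit)] [index] query`, without a group-by attribute. The names `LOG_LINE` and `parse_log_line` are made up for this illustration and are not part of Sphinx or the SphinxSearch extension.

```python
import re

# Assumed layout (Sphinx plain query log, no group-by attribute):
# [date] time sec [match-mode/filters-count/sort-mode total (offset,limit)] [index] query
LOG_LINE = re.compile(
    r"\[(?P<date>[^\]]+)\]\s+"
    r"(?P<seconds>\d+\.\d+) sec\s+"
    r"\[(?P<match_mode>\w+)/(?P<filters>\d+)/(?P<sort_mode>\w+)\s+"
    r"(?P<total>\d+)\s+\((?P<offset>\d+),(?P<limit>\d+)\)\]\s+"
    r"\[(?P<index>[^\]]*)\]\s+"
    r"(?P<query>.*)$"
)


def parse_log_line(line):
    """Parse one log entry; returns a dict, or None if the layout does not match."""
    # An entry wrapped by the terminal is first joined back into a single line.
    joined = " ".join(line.split())
    match = LOG_LINE.match(joined)
    if match is None:
        return None
    entry = match.groupdict()
    for key in ("filters", "total", "offset", "limit"):
        entry[key] = int(entry[key])
    entry["seconds"] = float(entry["seconds"])
    return entry


entry = parse_log_line(
    "[Thu Sep 15 16:50:10.721 2011] 0.037 sec [ext2/2/ext 2 (0,10)] [*] @page_title:\n"
    "^Cavendish*"
)
print(entry["filters"], entry["total"], entry["query"])
```

If the assumed layout holds, the number between the two slashes is the count of filters applied to the query, so a non-zero value there would be consistent with a restriction being sent as a filter rather than as part of the query text.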
Comment 4 Svemir Brkic 2011-09-15 11:19:19 UTC
This is because namespaces are not in the titles of MediaWiki articles. They are displayed that way, but if the namespace is properly set up, the page_title field only has the part after the colon. The Sphinx extension finds the namespace in the search string, takes it out, and applies the namespace as a filter instead:

 $cl->SetFilter( 'page_namespace', $this->namespaces );
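
To illustrate the split described above, here is a rough Python model. It is not the extension's PHP code: `NAMESPACES`, `MAIN_NAMESPACE`, and `build_prefix_query` are hypothetical names, the table holds only two of MediaWiki's default namespace ids, and the fallback to the main namespace is an assumption based on the behavior reported in Comment 1.

```python
# Hypothetical lookup table; a real wiki takes these ids from its own configuration.
NAMESPACES = {"template": 10, "mediawiki": 8}
MAIN_NAMESPACE = 0


def build_prefix_query(search_term):
    """Return (namespace_id, query); query is None when no title text was typed yet."""
    prefix, sep, rest = search_term.partition(":")
    ns_id = NAMESPACES.get(prefix.strip().lower()) if sep else None
    if ns_id is None:
        # No recognized namespace: the whole term is treated as the title prefix.
        ns_id, title = MAIN_NAMESPACE, search_term
    else:
        # Recognized namespace: only the part after the colon is searched.
        title = rest
    title = title.strip()
    query = "@page_title: ^%s*" % title if title else None
    return ns_id, query


for term in ("Mediawik", "Mediawiki:", "Mediawiki:Caven"):
    print(term, "->", build_prefix_query(term))
```

In this model the namespace id would be passed to the filter call shown above, while only the remainder of the term reaches the query string, which matches the `^Caven*` entries in the log.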

Now, is there some actual problem with the suggestions as they show up (or not) as you type? Are there articles you have in the database that do not show up? What are the article titles, namespaces, and what do you type?
Comment 5 MWJames 2011-09-15 11:51:09 UTC
We may have figured out that the problem is not a programming issue but is more likely related to how sphinx.conf is customized. It depends on the combination of settings such as:

* min_word_len     = 5
* min_stemming_len = 5
* min_infix_len    = 1
* enable_star      = 1

Sphinx only returns suggestions when those parameters are met. For example, if min_word_len is set to 5, then suggestions are only made for terms of at least 5 characters.
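
The effect of min_word_len can be pictured with a toy Python model. This is only a sketch under the assumption that words shorter than the minimum are simply left out of the index; it ignores stemming, min_infix_len, enable_star, and however Sphinx treats the query keyword itself. The function names and the sample titles are invented for the example.

```python
def indexed_words(titles, min_word_len):
    """Collect the title words that are long enough to be indexed."""
    words = set()
    for title in titles:
        for word in title.lower().split():
            # Assumption: words shorter than min_word_len never reach the index.
            if len(word) >= min_word_len:
                words.add(word)
    return words


def suggest(prefix, titles, min_word_len):
    """Return the indexed words that start with the typed prefix, sorted."""
    prefix = prefix.lower()
    return sorted(
        word for word in indexed_words(titles, min_word_len) if word.startswith(prefix)
    )


titles = ["Ren", "Rent", "Render", "Rendering"]  # invented sample titles
print(suggest("ren", titles, 5))
print(suggest("ren", titles, 1))
```

With the minimum set to 5, the short titles cannot be suggested no matter what is typed, because they were never indexed in the first place.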

Besides this, it is still not really clear why, for example:

* Render returns 24 suggestions, but
* Ren returns only 1

We would expect Ren to yield a larger pool of suggestions than Render.

[Thu Sep 15 20:33:16.643 2011] 0.002 sec [ext2/2/ext 24 (0,10)] [*] @page_title: ^Render\/*
[Thu Sep 15 20:38:56.313 2011] 0.002 sec [ext2/2/ext 1 (0,10)] [*] @page_title: ^Ren*

For the time being we might close this bug, noting that the parameters set in sphinx.conf affect the quality of suggested terms when $wgEnableSphinxPrefixSearch is true.
Comment 6 Svemir Brkic 2011-09-15 12:45:20 UTC
You are correct - these are a result of the Sphinx engine's stemming and other indexing parameters. If this behavior is not desired, do not use $wgEnableSphinxPrefixSearch. Since suggestions search only titles, there is probably little benefit in using Sphinx for those in most cases - unless you actually want the "grammar treatment".
