Last modified: 2014-02-07 22:52:52 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the display of bug reports and their history, links might be broken. See T34655, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 32655 - Improving search for templates
Status: RESOLVED WONTFIX
Product: Wikimedia
Classification: Unclassified
Component: lucene-search-2
Version: unspecified
Hardware: All All
Importance: Lowest enhancement with 1 vote
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2011-11-26 12:34 UTC by Andy Mabbett
Modified: 2014-02-07 22:52 UTC
CC List: 6 users

Description Andy Mabbett 2011-11-26 12:34:13 UTC
A search for, say, "{{Authority control}}" should be treated as a search for "Template:Authority control" (or at least return that template page ahead of other search results).

Similarly, a search for "{{Authority" should return all templates whose titles begin with "Template:Authority".
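
The requested behaviour amounts to rewriting the query before it reaches the search backend. A minimal sketch, assuming a hypothetical helper `rewrite_template_query` (not part of lucene-search-2 or MediaWiki) and assuming the wiki capitalises the first letter of page titles:

```python
from typing import Optional, Tuple

TEMPLATE_NS = "Template"  # assumed English namespace name


def rewrite_template_query(query: str) -> Optional[Tuple[str, str]]:
    """Map a "{{...}}" style query to a (mode, title) pair.

    Returns ("exact", title) for a closed "{{Name}}" query,
    ("prefix", title) for an unclosed "{{Name" query,
    and None when the query is not template-like.
    """
    q = query.strip()
    if not q.startswith("{{"):
        return None
    closed = q.endswith("}}")
    inner = q[2:-2] if closed else q[2:]
    # Drop any template parameters: {{Name|arg}} -> Name
    name = inner.split("|", 1)[0].strip()
    if not name:
        return None
    # Assumption: first letter of the title is capitalised by the wiki.
    name = name[0].upper() + name[1:]
    return ("exact" if closed else "prefix"), f"{TEMPLATE_NS}:{name}"
```

With this sketch, `rewrite_template_query("{{Authority control}}")` gives an exact lookup of "Template:Authority control", while `rewrite_template_query("{{Authority")` gives a prefix lookup on "Template:Authority". How the backend then ranks the result is left open.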
Comment 1 orenbochman 2012-01-19 11:56:40 UTC
Thanks for the suggestion. I am looking into smarter indexing of wiki source and will consider this feature in the design.

To give reliable results for such queries in the search engine:

1.  The indexer would need to carry out template expansion of MediaWiki source. This requires:
1.1 Access to all the templates.
1.2 Re-implementation of magic words, parser function math and logic operators.
1.3 Reindexing all dependent pages whenever a template changes.
2.  The parser would then have to further analyse the wiki source to tokenize templates.
3.  A good information architecture for storing template annotations would be needed, so that they are indexed in the correct order without creating gaps in the surrounding text. Options:
3.1 Use 0 position increment for wiki text, or
3.2 Store template information in a separate field, or
3.3 GATE-type multi_document source: a unified diff of wiki source and HTML output, or
3.4 Replace the document term position vector with a DAWG term tree.
3.5 Template annotation tokens would need to be amenable to prefix queries.
4.  A Lucene query to retrieve templates (exact|prefix|category).
5.  A sensible ranking mechanism for such results.
6.  A UI modification to allow exact source search.
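
To illustrate steps 2 and 3.1, here is a rough sketch in plain Python rather than Lucene. It reads "0 position increment" as stacking each template annotation token on the position of the preceding word, so the surrounding text keeps gap-free positions; that reading is an assumption. The names `Token`, `tokenize` and `templates_with_prefix` are hypothetical, and the regular expression handles only simple, non-nested template calls in unexpanded source.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    term: str
    position: int  # absolute position in the token stream


# Either a simple, non-nested {{Name|args}} call, or a plain word.
_TOKEN_RE = re.compile(r"\{\{([^{}|]+)(?:\|[^{}]*)?\}\}|(\w+)")


def tokenize(wikitext: str) -> List[Token]:
    """Emit word tokens plus template annotation tokens."""
    tokens: List[Token] = []
    position = -1
    for match in _TOKEN_RE.finditer(wikitext):
        template_name, word = match.groups()
        if word is not None:
            position += 1  # ordinary word: position increment of 1
            tokens.append(Token(word.lower(), position))
        else:
            # Annotation: increment of 0, so it shares a position with
            # the neighbouring word and leaves no gap in the text.
            name = template_name.strip().lower()
            tokens.append(Token("template:" + name, max(position, 0)))
    return tokens


def templates_with_prefix(tokens: List[Token], prefix: str) -> List[str]:
    """Prefix match over the annotation tokens only."""
    wanted = "template:" + prefix.strip().lower()
    return sorted({t.term for t in tokens if t.term.startswith(wanted)})
```

Because word tokens never contain a colon, the `template:` marker keeps annotation terms from colliding with ordinary text, which is what makes the prefix lookup safe in this sketch. Template expansion (step 1) is not attempted here.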
Comment 2 Robert Stojnic 2012-01-19 12:05:08 UTC
This seems like a massive effort that will need lots of maintenance. 

Why not store expanded wikitext somewhere in the database (or reference to some of the caching layers) and then query that instead of normal wikitext via OAI?
Comment 3 Andre Klapper 2013-03-26 11:25:22 UTC
[Merging "MediaWiki extensions/Lucene Search" into "Wikimedia/lucene-search2", see bug 46542. You can filter bugmail for: search-component-merge-20130326 ]
Comment 4 Dan Garry 2014-02-07 22:52:52 UTC
Someone who types "{{Authority control}}" as a search query is likely to be a very experienced user. They'll know what the template namespace is. I don't think it really makes sense to add specific, custom functionality for this when we already have an advanced search option to search template namespace.
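
For reference, restricting a search to the template namespace can also be expressed through the MediaWiki web API. The sketch below only builds the request URL; the helper name `build_template_search_url` and the choice of English Wikipedia as the endpoint are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Assumed endpoint; any MediaWiki api.php URL could be substituted.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"
TEMPLATE_NAMESPACE_ID = 10  # numeric id of the Template namespace


def build_template_search_url(terms: str) -> str:
    """Build a list=search request limited to the Template namespace."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": terms,
        "srnamespace": TEMPLATE_NAMESPACE_ID,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)
```

Calling `build_template_search_url("Authority control")` yields a URL that searches for those words in template pages only, which is the API counterpart of ticking the template namespace in the advanced search form.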

Also, Lucene is reaching the end of its life and I'm in the process of clearing out old bugs like this. Any future feature requests for search should be filed in MediaWiki Extensions -> CirrusSearch.

Changing to RESOLVED WONTFIX.
