Last modified: 2014-06-12 21:18:38 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the bug reports and their history, links might be broken. See T63669, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 61669 - relevant search result excerpts: prefer first sentence of article
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: CirrusSearch
Version: unspecified
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Nik Everett
URL: https://en.wikipedia.org/wiki/Cathari...
Experimental_Highlighter
Depends on:
Blocks:
Reported: 2014-02-20 16:57 UTC by Sumana Harihareswara
Modified: 2014-06-12 21:18 UTC
CC List: 4 users

Description Sumana Harihareswara 2014-02-20 16:57:07 UTC
I'm on English Wikipedia and I've turned on "new search" in the beta features.

I used the search box to search for "mackinnon":

https://en.wikipedia.org/w/index.php?search=mackinnon&title=Special%3ASearch&fulltext=1

The excerpt that shows up on the search results page now, with CirrusSearch, comes from citations 5 and 6 in the article on Catharine MacKinnon:

"Highly Cited Author - Catharine A. MacKinnon ^ Catharine MacKinnon 2005 Fellow of Stanford's Center for"

This feels like a less optimal result. I think we should prefer the first sentence in an article, maybe especially for articles that are biographies. For instance, the body of the article on Catharine MacKinnon starts:

"Catharine Alice MacKinnon (born October 7, 1946) is an American feminist, scholar, lawyer, teacher and activist."

Under the old search, that's what shows up on the search results page, albeit with some extraneous spaces around the commas. This is more relevant to a user who is trying to find a particular person's biography. In general, I suspect that the first sentence of a wiki page is more likely to help a searcher gauge relevance than is an excerpt from footnotes.
Comment 1 Chad H. 2014-02-24 21:26:36 UTC
This is a good idea. The old search boosted the lead section of an article. We should do this too in Cirrus. It'll lead to better results and better snippets.
Comment 2 Nik Everett 2014-03-07 20:03:20 UTC
Taken.
Comment 3 Nik Everett 2014-04-11 01:16:59 UTC
Swapping out upstream/Elasticsearch_Open_Bug with Experimental_Highlighter because it supports boosting early parts of the article when picking the snippet.
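
The idea discussed in this report, boosting the opening of the article when picking a snippet, can be illustrated with a minimal Python sketch. This is not the Experimental_Highlighter code and not the change merged for this bug; the function name pick_snippet, the opening_boost parameter, and the simple term-count scoring are all assumptions made for illustration.

```python
def pick_snippet(passages, query, opening_boost=3.0, opening_count=1):
    """Return the passage with the highest score for the query.

    Hypothetical scoring: count query-term occurrences in each passage and
    multiply the score of the first `opening_count` passages by
    `opening_boost`, so a hit in the opening can outrank later passages
    that contain more hits.
    """
    terms = [term.lower() for term in query.split()]
    best_passage, best_score = None, 0.0
    for index, passage in enumerate(passages):
        lowered = passage.lower()
        hits = sum(lowered.count(term) for term in terms)
        boost = opening_boost if index < opening_count else 1.0
        score = hits * boost
        # Strict comparison keeps the earliest passage on ties.
        if score > best_score:
            best_passage, best_score = passage, score
    return best_passage


# The two excerpts quoted in the bug description.
lead = ("Catharine Alice MacKinnon (born October 7, 1946) is an American "
        "feminist, scholar, lawyer, teacher and activist.")
citations = ("Highly Cited Author - Catharine A. MacKinnon ^ Catharine "
             "MacKinnon 2005 Fellow of Stanford's Center for")
passages = [lead, citations]

boosted = pick_snippet(passages, "mackinnon")
unboosted = pick_snippet(passages, "mackinnon", opening_boost=1.0)
```

With the boost, the opening sentence (one hit, weighted by 3.0) outranks the citation text (two hits); with opening_boost=1.0 the citation text wins, which matches the behaviour reported above.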
Comment 4 Gerrit Notification Bot 2014-06-12 21:17:50 UTC
Change 137521 merged by jenkins-bot:
Boost results that contain hits in the opening

https://gerrit.wikimedia.org/r/137521
