Last modified: 2014-03-24 17:28:42 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T50573, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 48573 - Search result on Commons does not contain search term
Status: RESOLVED WONTFIX
Product: Wikimedia
Classification: Unclassified
Component: lucene-search-2
Version: wmf-deployment
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
cirrus-fixed
Reported: 2013-05-17 10:32 UTC by Scott Martin (http://enwp.org/user:scott)
Modified: 2014-03-24 17:28 UTC

Description Scott Martin (http://enwp.org/user:scott) 2013-05-17 10:32:13 UTC
(CAUTION: EXPLICIT SEXUAL CONTENT IN LINKS BELOW)

Searching the Wikimedia Commons for the word "purpose":

http://commons.wikimedia.org/w/index.php?search=purpose&button=&title=Special%3ASearch

Returns this file as the first result:

http://commons.wikimedia.org/wiki/File:Black_genitalia.jpg

Investigation into the matter (http://commons.wikimedia.org/wiki/Commons:Village_pump#Bizarre_search_result) has revealed that the word "purpose" is not in the file's description, EXIF, or any backlinks.

A suggestion has been made that it may somehow be related to the word "pose", which several other files in the same series include in their name, such as this (also explicit):

http://commons.wikimedia.org/wiki/File:Pose.jpg

That makes me wonder whether this is related to bug 2511, from way back in 2005, in that stemming could be involved, but that's pure guesswork.
Comment 1 Andre Klapper 2013-05-17 11:03:36 UTC
Confirming.
Comment 2 Russavia 2013-05-18 14:00:47 UTC
The issue can be found at:

http://commons.wikimedia.org/wiki/Commons:Deletion_requests/Image:Black_genitalia.jpg

The deletion request (DR) contains the word "purpose", so the search is evidently taking the text of pages that link to the image (backlinks) into account.

I must say it is quite funny and sad that a DR which was trying to delete the file is what is responsible for shooting it to the top of the search results :)
Comment 3 Russavia 2013-05-18 14:14:13 UTC
So that this does not occur in future, we should be making it so that deletion requests on Commons do not count for backlinks, etc.

The same issue occurs from Commons:Quality_images_candidates/Archives_December_2010#File:TortoiseshellCat.JPG where http://commons.wikimedia.org/wiki/File:PieCrust_masked.jpg is returned as the top result for a search for "result"

Ideally, project pages should not be included for such search results.

http://commons.wikimedia.org/wiki/Commons:Requests_for_comment/improving_search#A_little_bit_of_intelligence is looking to be a better and better solution to search
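
The behaviour described in comments 2 and 3 can be illustrated with a toy index. This is only a sketch under loud assumptions: it is not how lucene-search-2 is implemented, and the page titles, page texts, `build_index`, and the `skip_backlinks_from` parameter are all invented for illustration. It shows how crediting a file with the words of pages that link to it makes the file match a term absent from its own description, and how skipping backlinks from deletion request pages removes that match.

```python
import re
from collections import defaultdict

# Matches the target of a wiki link such as [[File:X.jpg]] or [[:File:X.jpg|label]].
LINK_RE = re.compile(r"\[\[:?([^\]|]+)")

# Hypothetical pages, invented for this sketch.
PAGES = {
    "File:Example.jpg": "A photograph of a pie crust.",
    "Commons:Deletion requests/File:Example.jpg":
        "[[:File:Example.jpg]] serves no educational purpose.",
}


def tokenize(text):
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def build_index(pages, skip_backlinks_from=()):
    """Map each word to the set of page titles it matches.

    A page matches the words in its own text, plus the words of every page
    that links to it, unless the linking page's title starts with one of the
    prefixes in skip_backlinks_from (an assumed knob, not a real setting).
    """
    prefixes = tuple(skip_backlinks_from)
    index = defaultdict(set)
    for title, text in pages.items():
        # Drop link targets so that titles themselves are not indexed as words.
        words = tokenize(LINK_RE.sub(" ", text))
        for word in words:
            index[word].add(title)
        if prefixes and title.startswith(prefixes):
            continue  # do not credit link targets with this page's words
        for target in LINK_RE.findall(text):
            target = target.strip()
            if target in pages:
                for word in words:
                    index[word].add(target)
    return index


with_backlinks = build_index(PAGES)
without_dr = build_index(PAGES, skip_backlinks_from=["Commons:Deletion requests/"])
print("File:Example.jpg" in with_backlinks["purpose"])
print("File:Example.jpg" in without_dr["purpose"])
```

In this toy model the deletion request page itself still matches "purpose" in both cases; only the credit passed on to the linked file changes.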
Comment 4 Peter James 2013-05-19 15:35:39 UTC
(In reply to comment #3)
> So that this does not occur in future, we should be making it so that
> deletion requests on Commons do not count for backlinks, etc.
>
> The same issue occurs from
> Commons:Quality_images_candidates/Archives_December_2010#File:TortoiseshellCat.JPG
> where http://commons.wikimedia.org/wiki/File:PieCrust_masked.jpg is returned
> as the top result for a search for "result"
>
> Ideally, project pages should not be included for such search results.
>
> http://commons.wikimedia.org/wiki/Commons:Requests_for_comment/improving_search#A_little_bit_of_intelligence
> is looking to be a better and better solution to search

The "purpose" result was from Commons:Deletion requests/File:Geschändetehostie.jpg. I've changed the links to go to the deletion request pages instead of the files; I don't know how long it takes for search results to update. The "result" link is at User_talk:Jonathunder/1#Masking_of_a_pie.
Comment 5 Dan Garry 2014-02-08 02:39:50 UTC
This problem does not appear to exist in CirrusSearch, as the file in question was not returned in the first 2,000 results for the search query "purpose" on Commons.

As we're in the process of migrating from Lucene to CirrusSearch, I'm marking this as RESOLVED WONTFIX.
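
A check like the one described in this comment (is a given file among the first 2,000 results?) could be scripted roughly as follows. This is a sketch, not the procedure that was actually used: `appears_in_first_n` and the caller-supplied `fetch_page(offset, limit)` function are hypothetical names. In practice `fetch_page` might wrap the MediaWiki search API (`list=search` with `srsearch`, `sroffset` and `srlimit`), but that wiring is left out here so the example runs without network access.

```python
def appears_in_first_n(title, fetch_page, n=2000, page_size=50):
    """Return True if title is among the first n search results.

    fetch_page(offset, limit) is a hypothetical, caller-supplied function
    that returns a list of result titles for that slice of the ranking.
    """
    offset = 0
    while offset < n:
        limit = min(page_size, n - offset)
        titles = fetch_page(offset, limit)
        if title in titles:
            return True
        if len(titles) < limit:
            return False  # the result list ended before reaching n
        offset += limit
    return False


# Stand-in for a real search backend: a fixed, made-up ranking.
fake_ranking = ["File:%d.jpg" % i for i in range(3000)]


def fake_fetch(offset, limit):
    return fake_ranking[offset:offset + limit]


print(appears_in_first_n("File:1999.jpg", fake_fetch))
print(appears_in_first_n("File:2000.jpg", fake_fetch))
```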
Comment 6 Peter James 2014-03-23 02:46:41 UTC
I changed the "purpose" link several months ago (comment 4 and https://commons.wikimedia.org/w/index.php?title=Commons:Deletion_requests/File:Gesch%C3%A4ndetehostie.jpg&diff=prev&oldid=96520537). The pie is still the top result for "result".
Comment 7 Scott Martin (http://enwp.org/user:scott) 2014-03-23 04:02:21 UTC
https://commons.wikimedia.org/w/index.php?search=result&title=Special%3ASearch&go=Go&uselang=en

Link for "result" search. I would say that this bug should remain open on that basis.
Comment 8 Andre Klapper 2014-03-23 13:28:55 UTC
This is a valid bug, but it will not be fixed: we do not want to spend time on it because the search infrastructure will be replaced soon.
That's why this report was closed as "WONTFIX" (will not fix). See comment 5.
Comment 9 Scott Martin (http://enwp.org/user:scott) 2014-03-23 20:11:37 UTC
Yes, I saw what Dan said (and I know very well what WONTFIX means). His comment seemed to imply to me that Commons was already using CirrusSearch and the bug still existed in it. If the bug exists in some other element of the infrastructure that will soon be replaced, then that's fine.
Comment 10 Dan Garry 2014-03-24 16:05:25 UTC
(In reply to Scott Martin from comment #9)
> Yes, I saw what Dan said (and I know very well what WONTFIX means). His
> comment seemed to imply to me that Commons was already using CirrusSearch
> and the bug still existed in it. If the bug exists in some other element of
> the infrastructure that will soon be replaced, then that's fine.

My apologies. I wasn't very clear in my original comment.

Commons is still using LuceneSearch as its default search. I can reproduce the "result" bug in comment 7 using the provided URL.

However, Commons has CirrusSearch enabled as a Beta Feature. This means that users can, on a user-by-user basis, switch over to using CirrusSearch instead by ticking the "New search" box in their Beta preferences. I unticked the box to reproduce the bug above. If I tick the box again, and use CirrusSearch instead, I am unable to reproduce this bug in CirrusSearch.

I'm WONTFIXing this because we're (gradually) rolling out CirrusSearch as the default on all wikis. Since this bug is fixed in CirrusSearch, our time is better spent improving CirrusSearch than fixing LuceneSearch bugs.

Some wikis, such as all Wikivoyages, are already using CirrusSearch as the default search engine [1]. While we iron out bugs, we've got most wikis set to have CirrusSearch as a Beta Feature, so if there are any game-breaking bugs, people can turn off CirrusSearch really easily. A few unfortunate wikis still use LuceneSearch and don't have the Beta Feature enabled, though this is entirely for technical reasons: the server cluster that handles search cannot yet support CirrusSearch being turned on everywhere.

Hopefully this helps explain the situation. Let me know if you have any questions.

[1]: It's still possible to use Lucene on these wikis by appending &srbackend=LuceneSearch to the end of a search URL, though I have no idea why the end user would actually wish to do that.
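
As a small illustration of footnote [1], the parameter can be appended to an existing search URL programmatically. This is a minimal sketch: the helper name `with_backend` and the example Wikivoyage URL are assumptions made for illustration, and only the `srbackend=LuceneSearch` parameter itself comes from the comment above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_backend(search_url, backend="LuceneSearch"):
    """Return search_url with srbackend set to the given backend.

    Any existing srbackend parameter is replaced rather than duplicated.
    """
    parts = urlsplit(search_url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "srbackend"
    ]
    query.append(("srbackend", backend))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Example URL, assumed for illustration only.
url = "https://en.wikivoyage.org/w/index.php?search=result&title=Special%3ASearch"
print(with_backend(url))
```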
Comment 11 Scott Martin (http://enwp.org/user:scott) 2014-03-24 17:28:42 UTC
That's great. Thanks Dan.
