Last modified: 2014-01-02 13:32:42 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the display of bug reports and their history, links might be broken. See T61205, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 59205 - Weblinks not found by CirrusSearch on Wikimedia Commons
Status: RESOLVED DUPLICATE of bug 52905
Product: MediaWiki extensions
Classification: Unclassified
Component: CirrusSearch
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
https://commons.wikimedia.org/w/index...
Reported: 2014-01-02 12:13 UTC by Raimond Spekking
Modified: 2014-01-02 13:32 UTC (History)

Comment 1 Nik Everett 2014-01-02 13:32:42 UTC
For those following along at home:
1.  Make sure you disable the "New Search" BetaFeature or else both searches use Cirrus.
2.  I had more luck reproducing the behavior by searching "Everything":
CirrusSearch: https://commons.wikimedia.org/w/index.php?title=Special:Search&search=http%3A%2F%2Fwww.niag-online.de%2Fdownloads%2F2012-11-16_niag-kleve_sb58_nov2012.pdf&fulltext=Search&profile=all&redirs=1&srbackend=CirrusSearch
LuceneSearch: https://commons.wikimedia.org/w/index.php?title=Special:Search&search=http%3A%2F%2Fwww.niag-online.de%2Fdownloads%2F2012-11-16_niag-kleve_sb58_nov2012.pdf&fulltext=Search&profile=all&redirs=1
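
For anyone who wants to rebuild the two search links for another URL, here is a rough sketch. The query parameters are copied from the links above; the helper name search_url is only for illustration, and the encoding of "Special:Search" may differ cosmetically from the links as pasted (the colon gets percent-encoded).

```python
from urllib.parse import urlencode

BASE = "https://commons.wikimedia.org/w/index.php"
TARGET = "http://www.niag-online.de/downloads/2012-11-16_niag-kleve_sb58_nov2012.pdf"


def search_url(term, backend=None):
    """Build a Special:Search link (hypothetical helper, not part of MediaWiki)."""
    params = {
        "title": "Special:Search",
        "search": term,
        "fulltext": "Search",
        "profile": "all",   # the "Everything" profile
        "redirs": "1",
    }
    if backend is not None:
        # Force a specific search backend, as in the CirrusSearch link above.
        params["srbackend"] = backend
    return BASE + "?" + urlencode(params)


cirrus_url = search_url(TARGET, backend="CirrusSearch")
default_url = search_url(TARGET)
print(cirrus_url)
print(default_url)
```

The only difference between the two links is the srbackend parameter, so the same search term can be compared across both backends.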


LuceneSearch finds the URL because it appears inside the wikitext.  CirrusSearch doesn't because the URL doesn't appear in the page _text_.  The URL appears only as the href attribute of an anchor tag:
<a class="external text" href="http://www.niag-online.de/downloads/niag-kleve_lile_sb58.pdf" rel="nofollow">

Because CirrusSearch renders the wikitext to HTML and then removes all the tags, it only sees the text of the link.
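
To illustrate the effect (this is not CirrusSearch's actual code, just a minimal sketch using Python's standard html.parser), stripping the tags from rendered HTML keeps the link text but drops the href. The surrounding paragraph and the link text "timetable" are made up for the example.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only character data, discarding tags and their attributes."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(html):
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


HREF = "http://www.niag-online.de/downloads/niag-kleve_lile_sb58.pdf"
# Hypothetical rendered output; only the anchor attributes come from the bug report.
rendered = (
    f'<p>See <a class="external text" href="{HREF}" rel="nofollow">timetable</a>.</p>'
)

indexed = strip_tags(rendered)
print(indexed)          # the text that would be indexed
print(HREF in indexed)  # False: the URL is gone after tag stripping
```

A search for the URL against text produced this way cannot match, which is consistent with the behavior described above.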

I'm going to mark this bug as a duplicate of the older bug we've filed for the problem, but raise the priority of the other bug.

*** This bug has been marked as a duplicate of bug 52905 ***
