Last modified: 2014-05-02 11:48:56 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T66624, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 64624 - CirrusSearch: Very large excerpt returned in search results
Status: VERIFIED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: CirrusSearch
Version: unspecified
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Nik Everett
Depends on:
Blocks:
Reported: 2014-04-29 20:56 UTC by Quiddity
Modified: 2014-05-02 11:48 UTC (History)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
screenshot of massive excerpt (273.59 KB, image/png)
2014-04-29 20:56 UTC, Quiddity

Description Quiddity 2014-04-29 20:56:47 UTC
Created attachment 15243
screenshot of massive excerpt

I searched for "Flow browser" and got a few very large excerpts. See the attached screenshot of a 15,901-character excerpt.
This link includes the screenshotted result, as of today: https://www.mediawiki.org/w/index.php?title=Special:Search&limit=10&offset=27&profile=default&search=flow+browser
A few of the other results were also larger than the expected ~100 characters.
Comment 1 Nik Everett 2014-04-29 21:08:09 UTC
Thanks for filing this.  I'm almost 100% sure this is caused by a new component I just cut mediawiki.org over to.  I'll get a solution for this soon.
Comment 2 Nik Everett 2014-04-30 15:24:47 UTC
            // Now we have to pick a start and end offset, but there are
            // actually four cases that can happen given the above for start
            // offset and four for end. I'll show the start offset here
            // because the end is just the mirror image:
            //
            // Case 1:
            // --------+------[-------+-------]--+----------------------
            //        min          expand       max
            // Case 2:
            // ----[---+-------+--------+-------------------------------
            //        min   expand     max
            // Case 3:
            // --------+---[----+-------+---]---------------------------
            //        min    expand    max
            // Case 4:
            // --------+-------+---------+------------------------------
            //      expand    min       max
            //
            // Case 1 is "normal", there are no obstructions and we pick
            // the boundary by looking from expand to [, the max scan, and if
            // that doesn't find anything looking from expand to the ], and if
            // that doesn't find anything defaulting to expand.
            //
            // Case 2 is almost normal. We look from expand to min but if that
            // doesn't find anything we deem min a valid boundary. Min is
            // generally the beginning of the source or the end of the last
            // segment and therefore a valid boundary. The case where expand
            // is right on top of min is pretty much a variant of this case.
            //
            // Case 3 is like case 2 but in reverse. We look to [ as in case
            // 1 and if we don't find anything we look to max. If that doesn't
            // find anything then we declare max a valid boundary. Max is
            // generally the beginning of the first hit, so very likely a valid
            // boundary.
            //
            // Case 4 is different. We'd like to expand past min which isn't
            // allowed so instead we deem min the boundary and try to shift the
            // whole segment forward some to make up for it.
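The four start-offset cases in the comment above can be sketched roughly as follows. This is a minimal illustration in Python, not the code from the actual fix: the function name `pick_start`, the `max_scan` parameter, the returned shift value, and the whitespace-based `is_boundary` helper are all assumptions made for the example. The comment also does not say what happens when the scan window overlaps both min and max; the sketch arbitrarily treats that as case 2.

```python
def is_boundary(text, i):
    # Hypothetical boundary test: an offset counts as a boundary if it is the
    # start of the text or directly follows whitespace. This is an assumption
    # for illustration only.
    return i == 0 or text[i - 1].isspace()


def pick_start(text, min_off, max_off, expand, max_scan, boundary=is_boundary):
    """Pick a start offset following the four cases described above.

    Assumes 0 <= min_off <= max_off <= len(text).
    Returns (start, shift), where shift is how far the segment should be
    moved forward to make up for an expansion that min cut short (case 4).
    """
    # Case 4: expand falls before min, so min is the boundary and the
    # segment is shifted forward by the amount that was cut off.
    if expand < min_off:
        return min_off, min_off - expand

    lo = expand - max_scan  # the "[" in the diagrams
    hi = expand + max_scan  # the "]" in the diagrams

    # Scan backwards from expand towards "[" but never past min.
    for i in range(expand, max(lo, min_off) - 1, -1):
        if boundary(text, i):
            return i, 0

    # Case 2: the backward scan ran into min, which is deemed a valid boundary.
    if lo <= min_off:
        return min_off, 0

    # Scan forwards from expand towards "]" but never past max.
    for i in range(expand + 1, min(hi, max_off) + 1):
        if boundary(text, i):
            return i, 0

    # Case 3: the forward scan ran into max, which is deemed a valid boundary.
    if hi >= max_off:
        return max_off, 0

    # Case 1: nothing found in either direction, default to expand.
    return expand, 0
```

As the comment notes, the end offset would be the mirror image of this, scanning in the opposite directions with the roles of min and max swapped.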
Comment 3 Nik Everett 2014-04-30 18:16:21 UTC
https://gerrit.wikimedia.org/r/#/c/130615/
Comment 4 Nik Everett 2014-05-02 11:48:56 UTC
Deployed the fix yesterday.  Looks to be fixed now.
