Last modified: 2014-10-31 14:36:30 UTC
To keep load down on the search cluster, accelerated regex searches are only allowed to recheck a limited number of documents (10,000 right now). When that limit is reached, all subsequent documents are considered not to match, and Cirrus doesn't signal the user at all that this happened. This means that results are less reliable. OTOH this should only happen if your regex can't be accelerated down to a small subset of the wiki, which _should_ be reasonably rare. It'd happen if the regex actually does match more than the recheck limit, or if it is specific but the trigram that we're able to extract from it still matches too many documents.

Examples:
insource:/ {{/ will match a ton of pages and underreport the number
insource:/ {{..ca/ will match fewer pages, but the only trigram that can be extracted from it (" {{") is still on too many pages

The plan is to allow the recheck code to signal back to Cirrus that it gave up, so that Cirrus can let the user know that the results may not be consistent and tell them how to fix their regex. Unfortunately that first level of signalling requires Elasticsearch 1.4, which isn't quite released yet.
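
A rough sketch of the behaviour described here, in Python. It is not the actual Cirrus or Elasticsearch code. The names `recheck`, `RecheckResult` and `gave_up` are hypothetical, and the trigram filter is reduced to a plain substring test. It only illustrates how a recheck limit can silently drop real matches, and how a give-up flag would let the caller warn the user.

```python
import re
from typing import Iterable, List, NamedTuple


class RecheckResult(NamedTuple):
    matches: List[str]  # documents that passed the full regex recheck
    gave_up: bool       # True if candidates remained when the limit was hit


def recheck(candidates: Iterable[str], pattern: str, limit: int) -> RecheckResult:
    """Run the full regex over at most `limit` candidates (hypothetical helper)."""
    regex = re.compile(pattern)
    matches: List[str] = []
    for checked, doc in enumerate(candidates):
        if checked >= limit:
            # Limit reached with documents left over: stop and report it
            # instead of silently treating the rest as non-matching.
            return RecheckResult(matches, True)
        if regex.search(doc):
            matches.append(doc)
    return RecheckResult(matches, False)


# Toy corpus. The only literal trigram in / {{..ca/ is " {{", which is
# present on almost every page, so it barely narrows the candidate set.
docs = ["a {{foo}}", "b {{bar}}", "c {{xxca}}", "no templates here"]
candidates = [d for d in docs if " {{" in d]

truncated = recheck(candidates, r" \{\{..ca", limit=2)
complete = recheck(candidates, r" \{\{..ca", limit=10)
```

With `limit=2` the one real match is never rechecked, so `truncated.matches` is empty while `truncated.gave_up` is `True`. With a limit above the candidate count, `complete` finds the match and `gave_up` stays `False`.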