Last modified: 2014-11-20 22:07:49 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T74894, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 72894 - RegEx-style Cirrus searches are ignored on en.wikipedia
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: CirrusSearch
Version: unspecified
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Nik Everett
Reported: 2014-11-03 00:54 UTC by SpontaneousGrumbler
Modified: 2014-11-20 22:07 UTC (History)
CC List: 3 users

Description SpontaneousGrumbler 2014-11-03 00:54:41 UTC
A search for "wyoming"insource:/wyoming/ should find only lower-case examples, but over 45,000 articles, mostly capitalized, are returned. This worked 2 days ago.
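For illustration, the expected behavior can be sketched in Python. This is not CirrusSearch code: the sample texts and helper names are made up, and it only assumes that plain search terms match case-insensitively while a regex such as the one in insource:/wyoming/ is case-sensitive.

```python
import re

# Hypothetical sample texts, not real search results.
docs = [
    "Wyoming is a state in the western United States.",
    "The file is stored at /data/wyoming/counties.csv.",
    "WYOMING (ship) was launched in 1909.",
]

def fulltext_match(term, text):
    # Assumption: a plain search term is matched case-insensitively.
    return re.search(re.escape(term), text, re.IGNORECASE) is not None

def regex_match(pattern, text):
    # Assumption: a regex search is case-sensitive unless told otherwise.
    return re.search(pattern, text) is not None

fulltext_hits = [d for d in docs if fulltext_match("wyoming", d)]
regex_hits = [d for d in docs if regex_match(r"wyoming", d)]
```

With the regex part ignored, every text containing the word in any capitalization comes back, which matches the symptom reported here.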
Comment 1 Chad H. 2014-11-03 03:29:36 UTC
Yes, they're disabled for the time being as they were causing downtime.
Comment 2 SpontaneousGrumbler 2014-11-03 05:50:00 UTC
And how were the users of WP notified that the feature was turned off? How long is "the time being"?
Comment 4 Nik Everett 2014-11-03 13:38:06 UTC
Hi Grumbler!

I sent an email to wikitech-ambassadors@lists.wikimedia.org on October 31st at 11:30 US Eastern time.  It's how I've done the bulk of the Cirrus communication at this point and I thought it was the right way to announce it.

I should have updated a recent conversation on en:wp's Village Pump and did when someone pinged me.  I should have remembered to do it earlier but I was trying to fix the problem.

As far as how long the time being is - I'm working on it now.  I have a hacked-together solution that demonstrably works but that is only the first step.  The remaining steps are mostly these:
1.  Work with upstream (Lucene) to get a version of the hack that'd be acceptable for them to merge.
2.  Backport those changes to Lucene to the plugins we use for regex search and highlighting.
3.  Release new versions of those.
4.  Deploy them to our search cluster.
5.  Reenable the feature.


You can sort of fudge on 1 and just backport something more hacky but that is somewhat more dangerous from a stability standpoint.  But that is life and we'll do it if upstream drags their feet.  So far they've been reasonably responsive though.

Estimating a time from that is tricky.  A week from today?
Comment 5 Nik Everett 2014-11-04 13:07:32 UTC
Quick update:
Step 1 is moving along pretty quickly.  I'm communicating closely with a Lucene committer on getting a patch merged for this.  It feels like we'll get something merged today, which means it's worth waiting for that before moving on to step 2.

Also there is a step 2.5:  Update Cirrus to catch the new error message from step #2 and produce a useful message for the user.
Comment 6 SpontaneousGrumbler 2014-11-10 15:22:55 UTC
CirrusSearch is set to be rolled out to en.wikipedia on November 19. Please tell me that when that date was set they knew about this outage and were confident that this will be fixed in time for the Grand Opening.
Comment 7 Chad H. 2014-11-10 15:24:44 UTC
(In reply to SpontaneousGrumbler from comment #6)
> CirrusSearch is set to be rolled out to en.wikipedia on November 19. Please
> tell me that when that date was set they knew about this outage and were
> confident that this will be fixed in time for the Grand Opening.

I would hope that Nik and I knew about the outage when we picked that date.
Comment 8 Nik Everett 2014-11-10 15:43:20 UTC
(In reply to SpontaneousGrumbler from comment #6)
> CirrusSearch is set to be rolled out to en.wikipedia on November 19. Please
> tell me that when that date was set they knew about this outage and were
> confident that this will be fixed in time for the Grand Opening.

Yup.

Here is the status:
The fix for the cause of the outage is live in beta.  When I tried it on Saturday I found an error where sometimes the right error message isn't shown when the regex is too complex to use.  I've *just* finished the fix for that.

The plan right now is to get that to beta today and validate it.

We'll deploy the fix to production on Wednesday.  Today or Tuesday would have been better but Tuesday is a US holiday and we'll have fewer people on hand in the unlikely event that something goes wrong.

That puts us reenabling regex search on Thursday.  It's longer than we'd thought/hoped.
Comment 9 SpontaneousGrumbler 2014-11-10 17:45:09 UTC
Thanks for the update.
Comment 10 Nik Everett 2014-11-13 14:26:26 UTC
Plugins deployed.  We'll be pushing code to reenable the searches in our general window which starts in an hour and a half.
Comment 11 SpontaneousGrumbler 2014-11-13 16:21:22 UTC
Thanks. It appears to be working now.
Comment 12 Nik Everett 2014-11-13 16:26:59 UTC
Hey, glad it's working for you.  I keep getting:
An error has occurred while searching: Too many regular expression searches currently running. Please try again later.
which is a pain.  I think something is up with the counter because I totally don't see that many regex searches.
Comment 13 SpontaneousGrumbler 2014-11-13 21:56:37 UTC
Yes, well, it worked for one search. Ever since then, I keep getting the same error message, "Too many regular expression searches". Yes, something is wrong; failing to decrement the counter seems a likely cause.
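The hypothesis in this comment can be sketched as follows. This is a minimal Python illustration of a concurrency counter that leaks a slot when it is not decremented; it is not PoolCounter's actual implementation and not the confirmed cause of this bug, and all class and function names are hypothetical.

```python
import threading
from contextlib import contextmanager

class TooManySearches(Exception):
    pass

class ConcurrencyCounter:
    """Hypothetical limiter: at most `limit` searches may run at once."""

    def __init__(self, limit):
        self.limit = limit
        self.active = 0
        self._lock = threading.Lock()

    def acquire(self):
        with self._lock:
            if self.active >= self.limit:
                raise TooManySearches(
                    "Too many regular expression searches currently running.")
            self.active += 1

    def release(self):
        with self._lock:
            self.active -= 1

    @contextmanager
    def slot(self):
        # Release the slot even if the search raises.
        self.acquire()
        try:
            yield
        finally:
            self.release()

def leaky_search(counter, fail=False):
    counter.acquire()
    if fail:
        # The slot is never released on this path, so it leaks.
        raise RuntimeError("search failed")
    counter.release()

def safe_search(counter, fail=False):
    with counter.slot():
        if fail:
            raise RuntimeError("search failed")
```

With a limit of 1, a single failed leaky_search leaves the counter stuck at 1 and every later search is rejected, while safe_search returns the counter to 0 whether or not the search fails.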
Comment 14 Nik Everett 2014-11-13 22:15:00 UTC
Yeah.  Something.  I'll be able to spend some time with it in the morning.  I've had a suspicion for a while now that that counter is lying.  It's actually the same counter that we use for all kinds of stuff and it's pretty hard *not* to decrement it.

I'm at least glad it doesn't just hate me.
Comment 15 Nik Everett 2014-11-14 20:01:01 UTC
OK!  I found the problem.  Our pool counter works differently than I thought it did.  I've prepared a patch to deploy and we'll sync it out during the Monday morning deploy.

https://gerrit.wikimedia.org/r/#/q/I7586162cfb32ddbe460a25c956c845f2f4a49b0f,n,z
Comment 16 SpontaneousGrumbler 2014-11-15 00:16:38 UTC
Good news!
Comment 17 SpontaneousGrumbler 2014-11-20 21:19:54 UTC
I would say it has been much better for the last few days. If you want to mark this bug as "fixed", I won't disagree.
Comment 18 Chad H. 2014-11-20 22:07:49 UTC
Between Nik's fixes and the fact that we segmented enwiki's PoolCounter traffic to its own key I think we're in a way better spot than before.

Resolving FIXED. Please reopen if this becomes a problem again.
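
As a rough illustration of why segmenting enwiki's traffic onto its own key helps, here is a minimal Python sketch of a per-key limiter. It is not PoolCounter itself; the class, the key names, and the limit of 2 are assumptions chosen for the example, and the sketch is not thread-safe.

```python
from collections import defaultdict

class KeyedLimiter:
    """Hypothetical limiter that counts active work separately per key."""

    def __init__(self, limit_per_key):
        self.limit_per_key = limit_per_key
        self.active = defaultdict(int)

    def try_acquire(self, key):
        # Refuse new work once this key has reached its limit.
        if self.active[key] >= self.limit_per_key:
            return False
        self.active[key] += 1
        return True

    def release(self, key):
        if self.active[key] > 0:
            self.active[key] -= 1

# All wikis share one key: the third request is refused.
shared = KeyedLimiter(limit_per_key=2)
results_shared = [shared.try_acquire("regex") for _ in range(3)]

# enwiki has its own key: its load no longer crowds out other wikis.
segmented = KeyedLimiter(limit_per_key=2)
results_segmented = [
    segmented.try_acquire("regex:enwiki"),
    segmented.try_acquire("regex:enwiki"),
    segmented.try_acquire("regex:other"),
]
```

In the shared case a busy wiki can use up every slot; with separate keys each wiki is limited independently.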
