Last modified: 2014-02-05 17:56:12 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from bug reports and their history, links might be broken. See T62854, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 60854 - CirrusSearch: Can't reindex commons....
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: CirrusSearch
Version: unspecified
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Nik Everett
Reported: 2014-02-05 00:55 UTC by Nik Everett
Modified: 2014-02-05 17:56 UTC

Description Nik Everett 2014-02-05 00:55:17 UTC
When I reindex commons, it crashes:
mwscript extensions/CirrusSearch/maintenance/updateSearchIndexConfig.php --wiki $wiki --reindexAndRemoveOk --indexIdentifier now --reindexProcesses 4 | tee ~/cirrus_log/$wiki.reindex.log
file index...
        Setting index identifier...commonswiki_file_1391550673
        Creating index...ok
        Validating analyzers...ok
        Validating mappings...
                Validating mapping for page type...different...corrected
        Validating aliases...
                Validating file alias...is taken...
                Reindexing...
                        [0] Starting child process reindex
                        [1] Starting child process reindex
                        [2] Starting child process reindex
                        [3] Starting child process reindex
                        [0] About to reindex 5044794 documents
                        [1] About to reindex 5045574 documents
                        [3] About to reindex 5045006 documents
                        [2] About to reindex 5043744 documents
....
                        [0] Reindexed 130000/5044794 documents at 695/second
                        [2] Reindexed 130000/5043744 documents at 689/second
                        [1] Reindexed 140000/5045574 documents at 719/second
                        [0] Reindexed 140000/5044794 documents at 695/second

Warning: Search backend error during reindex.  Error message is:  Error in one or more bulk request actions:

create: /commonswiki_file_1391550673/page/391846 caused DocumentAlreadyExistsException[[commonswiki_file_1391550673][8] [page][391846]: document already exists]
create: /commonswiki_file_1391550673/page/392553 caused DocumentAlreadyExistsException[[commonswiki_file_1391550673][8] [page][392553]: document already exists]
create: /commonswiki_file_1391550673/page/392692 caused DocumentAlreadyExistsException[[commonswiki_file_1391550673][8] [page][392692]: document already exists]
create: /commonswiki_file_1391550673/page/391163 caused DocumentAlreadyExistsException[[commonswiki_file_1391550673][8] [page][391163]: document already exists]
create: /commonswiki_file_1391550673/page/391288 caused DocumentAlreadyExistsException[[commonswiki_file_1391550673][8] [page][391288]: document already exists]
create: /commonswiki_file_1391550673/page/392135 caused DocumentAlreadyExistsException[[commonswiki_file_1391550673][8 in /usr/local/apache/common-local/php-1.23wmf12/includes/debug/Debug.php on line 301

From there on out the [3] process is dead and doesn't log anything else.

We should figure out why rather than just catch the exception, log it, and move on.  If we can figure out what causes this we can prevent it in the future.
Comment 1 Nik Everett 2014-02-05 00:56:12 UTC
Assigning to myself to look at in the morning.
Comment 2 Nik Everett 2014-02-05 14:12:10 UTC
I think this is caused by a document being updated while we're scrolling, so that we hit it twice. I'm not sure about that yet, but I'm working up a commit that'll just insert the second copy on top of the old one.
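
A minimal sketch of that idea, not the actual CirrusSearch change: it models the target index as an in-memory dictionary and contrasts a "create" write (which fails if the id is already present, like the bulk create actions in the log above) with an "index" write (which overwrites). The names `ToyIndex`, `reindex`, and `DocumentAlreadyExists`, and the document contents, are hypothetical and only for illustration.

```python
class DocumentAlreadyExists(Exception):
    """Stand-in for the backend's DocumentAlreadyExistsException."""


class ToyIndex:
    """Hypothetical in-memory model of the target index."""

    def __init__(self):
        self.docs = {}

    def create(self, doc_id, source):
        # "create" semantics: refuse to write if the id is already present.
        if doc_id in self.docs:
            raise DocumentAlreadyExists(doc_id)
        self.docs[doc_id] = source

    def index(self, doc_id, source):
        # "index" semantics: write the document, replacing any older copy.
        self.docs[doc_id] = source


def reindex(scroll_results, target, overwrite):
    """Copy every (id, source) pair into target; return the document count."""
    write = target.index if overwrite else target.create
    for doc_id, source in scroll_results:
        write(doc_id, source)
    return len(target.docs)


# Assumed scenario: id 391846 is returned twice by the scroll because the
# document was updated while the scroll was in progress.
scroll = [
    (391846, {"text": "old"}),
    (392553, {"text": "other"}),
    (391846, {"text": "new"}),
]

target = ToyIndex()
count = reindex(scroll, target, overwrite=True)
print(count, target.docs[391846]["text"])
```

With `overwrite=False` the same scroll raises `DocumentAlreadyExists` on the third pair, which corresponds to the failure in the description. With `overwrite=True` the later copy simply replaces the earlier one.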
Comment 3 Gerrit Notification Bot 2014-02-05 14:17:36 UTC
Change 111450 had a related patch set uploaded by Manybubbles:
Reindex is ok seeing same id twice

https://gerrit.wikimedia.org/r/111450
Comment 4 Gerrit Notification Bot 2014-02-05 17:54:38 UTC
Change 111450 merged by jenkins-bot:
Reindex is ok seeing same id twice

https://gerrit.wikimedia.org/r/111450
