Last modified: 2014-02-05 17:56:12 UTC
When I reindex commons, it crashes:

mwscript extensions/CirrusSearch/maintenance/updateSearchIndexConfig.php --wiki $wiki --reindexAndRemoveOk --indexIdentifier now --reindexProcesses 4 | tee ~/cirrus_log/$wiki.reindex.log

file index...
Setting index identifier...commonswiki_file_1391550673
Creating index...ok
Validating analyzers...ok
Validating mappings...
Validating mapping for page type...different...corrected
Validating aliases...
Validating file alias...is taken...
Reindexing...
[0] Starting child process reindex
[1] Starting child process reindex
[2] Starting child process reindex
[3] Starting child process reindex
[0] About to reindex 5044794 documents
[1] About to reindex 5045574 documents
[3] About to reindex 5045006 documents
[2] About to reindex 5043744 documents
....
[0] Reindexed 130000/5044794 documents at 695/second
[2] Reindexed 130000/5043744 documents at 689/second
[1] Reindexed 140000/5045574 documents at 719/second
[0] Reindexed 140000/5044794 documents at 695/second

Warning: Search backend error during reindex. Error message is: Error in one or more bulk request actions:

create: /commonswiki_file_1391550673/page/391846 caused DocumentAlreadyExistsException[[commonswiki_file_1391550673][8] [page][391846]: document already exists]
create: /commonswiki_file_1391550673/page/392553 caused DocumentAlreadyExistsException[[commonswiki_file_1391550673][8] [page][392553]: document already exists]
create: /commonswiki_file_1391550673/page/392692 caused DocumentAlreadyExistsException[[commonswiki_file_1391550673][8] [page][392692]: document already exists]
create: /commonswiki_file_1391550673/page/391163 caused DocumentAlreadyExistsException[[commonswiki_file_1391550673][8] [page][391163]: document already exists]
create: /commonswiki_file_1391550673/page/391288 caused DocumentAlreadyExistsException[[commonswiki_file_1391550673][8] [page][391288]: document already exists]
create: /commonswiki_file_1391550673/page/392135 caused DocumentAlreadyExistsException[[commonswiki_file_1391550673][8 in /usr/local/apache/common-local/php-1.23wmf12/includes/debug/Debug.php on line 301

From there on out the [3] process is dead and doesn't log anything else. We should figure out why rather than just catch the exception, log it, and move on. If we can figure out what causes this we can prevent it in the future.
Assigning to myself to look at in the morning.
I think this happens when a document is updated while the reindex is scrolling through the old index, so the scroll returns the same id twice and the second "create" fails. I'm not sure about that yet, but I'm working up a commit that'll just write the second copy on top of the first one.
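To illustrate the suspected failure mode, here is a minimal Python sketch. It is not CirrusSearch or Elastica code: ToyIndex, reindex, and the sample scroll data are hypothetical stand-ins. It only models the difference between a bulk "create" action, which fails when the id already exists, and an "index" action, which overwrites, assuming the scroll can return the same id twice.

```python
class DocumentAlreadyExists(Exception):
    """Stand-in for the DocumentAlreadyExistsException in the log above."""


class ToyIndex:
    """Hypothetical in-memory model of a target index (not a real client)."""

    def __init__(self):
        self.docs = {}

    def create(self, doc_id, source):
        # "create" semantics: refuse to write if the id is already present.
        if doc_id in self.docs:
            raise DocumentAlreadyExists(doc_id)
        self.docs[doc_id] = source

    def index(self, doc_id, source):
        # "index" semantics: insert, or overwrite the existing copy.
        self.docs[doc_id] = source


def reindex(scroll_results, target, op):
    """Copy every (id, source) pair from a scroll into target using op."""
    write = getattr(target, op)
    for doc_id, source in scroll_results:
        write(doc_id, source)


# Assumed scenario: page 391846 is updated mid-scroll and shows up twice.
scroll = [
    (391846, {"rev": 1}),
    (392553, {"rev": 1}),
    (391846, {"rev": 2}),
]

strict = ToyIndex()
try:
    reindex(scroll, strict, "create")
    create_failed = False
except DocumentAlreadyExists:
    create_failed = True

tolerant = ToyIndex()
reindex(scroll, tolerant, "index")
```

With "create" the copy stops at the duplicate id, which matches the dead child process above; with "index" the later copy simply replaces the earlier one.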
Change 111450 had a related patch set uploaded by Manybubbles: Reindex is ok seeing same id twice https://gerrit.wikimedia.org/r/111450
Change 111450 merged by jenkins-bot: Reindex is ok seeing same id twice https://gerrit.wikimedia.org/r/111450