Last modified: 2014-06-26 21:24:35 UTC
We're having trouble reindexing meta because we're hitting a page with an external link that contains invalid utf-8: [2014-06-26 18:43:29,960][DEBUG][action.bulk ] [elastic1018] [metawiki_general_1403807864][5] failed to execute bulk item (index) index {[metawiki_general_1403807864][page][661035], source[{"namespace":2,"namespace_text":"User","title":"COIBot/Local/selftrans.narod.ru","timestamp":"2011-10-11T04:19:40Z","category":["Pages where template include size is exceeded","Noindexed pages","COIBot Local Reports"],"external_link":["//wikipediatools.appspot.com/linksearch.jsp?set=top20&link=selftrans.narod.ru","//wikipediatools.appspot.com/linksearch.jsp?set=top40&link=selftrans.narod.ru","//wikipediatools.appspot.com/linksearch.jsp?set=major&link=selftrans.narod.ru","http://www.google.com/search?num=10&hl=en&rls=en&q=selftrans.narod.ru","//www.google.com/search?num=100?h1=en&rls=en&q=selftrans.narod.ru+site:en.wikipedia.org","//www.google.com/search?num=100&hl=en&rls=en&q=selftrans.narod.ru+site:fr.wikipedia.org","//www.google.com/search?num=100&hl=en&rls=en&q=selftrans.narod.ru+site:de.wikipedia.org","//www.google.com/search?num=100&hl=en&rls=en&q=selftrans.narod.ru+site:meta.wikimedia.org","http://siteexplorer.search.yahoo.com/advsearch?p=selftrans.narod.ru&bwm=i&bwmf=d&bwms=p","//toolserver.org/~erwin85/xwiki.php?report=User:COIBot/LinkReports/selftrans.narod.ru&forcelive=1","//toolserver.org/~erwin85/xwiki.php?report=User:COIBot/Local/selftrans.narod.ru&forcelive=1","//tools.wmflabs.org/searchsbl/?url=selftrans.narod.ru","http://whois.domaintools.com/selftrans.narod.ru","http://www.aboutus.org/selftrans.narod.ru","http://www.malwaredomainlist.com/mdl.php?search=selftrans.narod.ru&colsearch=Domain&quantity=50","http://www.alexa.com/data/details/main?url=selftrans.narod.ru","http://213.180.199.13","//wikipediatools.appspot.com/linksearch.jsp?set=top20&link=213.180.199.13","//wikipediatools.appspot.com/linksearch.jsp?set=top40&link=213.180.199.13","//wikipediat
ools.appspot.com/linksearch.jsp?set=major&link=213.180.199.13","http://www.google.com/search?num=10&hl=en&rls=en&q=213.180.199.13","//www.google.com/search?num=100?h1=en&rls=en&q=213.180.199.13+site:en.wikipedia.org","//www.google.com/search?num=100&hl=en&rls=en&q=213.180.199.13+site:fr.wikipedia.org","//www.google.com/search?num=100&hl=en&rls=en&q=213.180.199.13+site:de.wikipedia.org","//www.google.com/search?num=100&hl=en&rls=en&q=213.180.199.13+site:meta.wikimedia.org","http://siteexplorer.search.yahoo.com/advsearch?p=213.180.199.13&bwm=i&bwmf=d&bwms=p","//tools.wmflabs.org/searchsbl/?url=213.180.199.13","http://whois.domaintools.com/213.180.199.13","http://www.aboutus.org/213.180.199.13","http://www.malwaredomainlist.com/mdl.php?search=213.180.199.13&colsearch=Domain&quantity=50","http://www.alexa.com/data/details/main?url=213.180.199.13","http://uk.wikipedia.org/wiki/Mediawiki:Spam-whitelist","http://www.google.com/search?q=%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%83%C3%83%C2%82%C3%82%C2%82%C3%83%C2%83%C3%82%C2%82%C3%83%C2%82%C3%82%C2%83%C3%83%C2%83%C3%82%C2%83%C3 ... 
java.lang.IllegalArgumentException: Document contains at least one immense term in field="external_link" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[68 74 74 70 3a 2f 2f 77 77 77 2e 67 6f 6f 67 6c 65 2e 63 6f 6d 2f 73 65 61 72 63 68 3f 71]...'
I'm not sure if this is a new feature of 1.2.1 or what.
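The term prefix quoted in the exception is a dump of hex-encoded bytes, so decoding it shows which link is the problem. A minimal Python sketch (the helper name `decode_term_prefix` is made up for illustration and is not part of CirrusSearch or Elasticsearch):

```python
def decode_term_prefix(prefix: str) -> str:
    """Decode a '[68 74 74 70 ...]' hex byte dump into readable text."""
    # Drop surrounding whitespace and brackets, then split into hex pairs.
    hex_pairs = prefix.strip().strip("[]").split()
    raw = bytes(int(pair, 16) for pair in hex_pairs)
    # errors="replace" keeps going if the prefix was cut mid-character.
    return raw.decode("utf-8", errors="replace")


# Prefix copied from the exception message above.
PREFIX = (
    "[68 74 74 70 3a 2f 2f 77 77 77 2e 67 6f 6f 67 6c 65 "
    "2e 63 6f 6d 2f 73 65 61 72 63 68 3f 71]"
)

print(decode_term_prefix(PREFIX))  # http://www.google.com/search?q
```

This points at the `http://www.google.com/search?q=...` entry at the end of the external_link list, the one with the long run of percent-encoded bytes.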
index: /metawiki_general_1403807864/page/661035 caused IllegalArgumentException[Document contains at least one immense term in field="external_link" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[68 74 74 70 3a 2f 2f 77 77 77 2e 67 6f 6f 67 6c 65 2e 63 6f 6d 2f 73 65 61 72 63 68 3f 71]...']
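One way to keep a document like this from failing the whole bulk item is to drop oversized links before sending it. The sketch below is only an illustration, assuming each external link is indexed as a single term; it is not necessarily what the eventual patch does, and `split_indexable` is a hypothetical helper. The only value taken from the log is the 32766-byte limit.

```python
MAX_TERM_BYTES = 32766  # limit quoted in the IllegalArgumentException


def split_indexable(links, max_bytes=MAX_TERM_BYTES):
    """Split links into (kept, skipped) by UTF-8 encoded length."""
    kept, skipped = [], []
    for link in links:
        # The limit applies to the encoded byte length, not the character count.
        if len(link.encode("utf-8")) > max_bytes:
            skipped.append(link)
        else:
            kept.append(link)
    return kept, skipped


links = [
    "http://www.aboutus.org/selftrans.narod.ru",
    # Synthetic stand-in for the oversized search URL in the failing page.
    "http://www.google.com/search?q=" + "%C3%83" * 6000,
]

kept, skipped = split_indexable(links)
print(len(kept), len(skipped))  # 1 1
```

Skipping the link loses it from search, but the rest of the page still gets indexed.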
Looks like there are more such issues:
cirrus_log/arwikisource.reindex.log:Warning: Search backend error during reindex. Error message is: No enabled connection [Called from CirrusSearch\UpdateOneSearchIndexConfig::reindexInternal in /usr/local/apache/common-local/php-1.24wmf10/extensions/CirrusSearch/maintenance/updateOneSearchIndexConfig.php at line 794] in /usr/local/apache/common-local/php-1.24wmf10/includes/debug/Debug.php on line 303
cirrus_log/commonswiki.reindex.log:Warning: Search backend error during sending 10 documents to the file index after 49. Regex syntax error: failed to execute script [Called from CirrusSearch\ElasticsearchIntermediary::failure in /usr/local/apache/common-local/php-1.24wmf10/extensions/CirrusSearch/includes/ElasticsearchIntermediary.php at line 98] in /usr/local/apache/common-local/php-1.24wmf10/includes/debug/Debug.php on line 303
cirrus_log/commonswiki.reindex.log:Warning: Search backend error during sending 8 documents to the file index after 89. Regex syntax error: failed to execute script [Called from CirrusSearch\ElasticsearchIntermediary::failure in /usr/local/apache/common-local/php-1.24wmf10/extensions/CirrusSearch/includes/ElasticsearchIntermediary.php at line 98] in /usr/local/apache/common-local/php-1.24wmf10/includes/debug/Debug.php on line 303
cirrus_log/ltwiktionary.reindex.log:Warning: Search backend error during sending 1 documents to the general index after 75. Regex syntax error: failed to execute script [Called from CirrusSearch\ElasticsearchIntermediary::failure in /usr/local/apache/common-local/php-1.24wmf10/extensions/CirrusSearch/includes/ElasticsearchIntermediary.php at line 98] in /usr/local/apache/common-local/php-1.24wmf10/includes/debug/Debug.php on line 303
cirrus_log/metawiki.reindex.log:Warning: Search backend error during reindex. Error message is: Error in one or more bulk request actions:
It isn't clear what the error is, though, because of the broken syntax checker that we just fixed.
OK! Those error messages, the ones about regex syntax errors, will stop masking their real errors tonight. They are caused by update errors. That is simple enough to fix, and I'll put the fix in the same patch that fixes meta's problem.
arwikisource is different; I'm not sure what is up with it. It errors out (every time) with:
Warning: Search backend error during reindex. Error message is: No enabled connection [Called from CirrusSearch\UpdateOneSearchIndexConfig::reindexInternal in /usr/local/apache/common-local/php-1.24wmf10/extensions/CirrusSearch/maintenance/updateOneSearchIndexConfig.php at line 794] in /usr/local/apache/common-local/php-1.24wmf10/includes/debug/Debug.php on line 303
That means it got multiple HTTP failures.
Change 142404 had a related patch set uploaded by Manybubbles: Fix rare-ish errors https://gerrit.wikimedia.org/r/142404
Change 142404 merged by jenkins-bot: Fix rare-ish errors https://gerrit.wikimedia.org/r/142404
Change 142412 had a related patch set uploaded by Manybubbles: Fix rare-ish errors https://gerrit.wikimedia.org/r/142412
Change 142413 had a related patch set uploaded by Manybubbles: Fix rare-ish errors https://gerrit.wikimedia.org/r/142413
Change 142413 merged by jenkins-bot: Fix rare-ish errors https://gerrit.wikimedia.org/r/142413
Change 142412 merged by jenkins-bot: Fix rare-ish errors https://gerrit.wikimedia.org/r/142412