Last modified: 2014-11-07 06:43:47 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T73971, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 71971 - site_tests test_image_usage_in_redirect timeout error after maximum retries
site_tests test_image_usage_in_redirect timeout error after maximum retries
Status: NEW
Product: Pywikibot
Classification: Unclassified
tests (Other open bugs)
core-(2.0)
All All
: High normal
: ---
Assigned To: Pywikipedia bugs
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2014-10-12 03:28 UTC by John Mark Vandenberg
Modified: 2014-11-07 06:43 UTC (History)
4 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description John Mark Vandenberg 2014-10-12 03:28:47 UTC
The recent travis builds are regularly failing on site_tests.testImageusage with a timeout error only when run against en.wp.  Travis runs this test for each checking against 2.6, 2.7.8 and 3.3.5 at the moment.  It doesnt happen every time, and it hasnt happend on 2.6 yet for some reason.

Note this bug is _not_ bug 60214, which is about content & caching assumptions in the test case, which very affect mostly re-runs of the test suite on a warm cache, and only very rarely occur on a cold cache.


Python 2.7.8 log (https://travis-ci.org/wikimedia/pywikibot-core/jobs/37633107)

ERROR: testImageusage (tests.site_tests.TestSiteObject)
Test the site.imageusage() method
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/wikimedia/pywikibot-core/tests/site_tests.py", line 616, in testImageusage
    for using in mysite.imageusage(imagepage, filterredir=True, total=5):
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 971, in __iter__
    self.data = self.request.submit()
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 764, in submit
    self._data = super(CachedRequest, self).submit()
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 535, in submit
    self.wait()
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 648, in wait
    raise TimeoutError("Maximum retries attempted without success.")
TimeoutError: Maximum retries attempted without success.

----

Python 3.3.5 has a two part log (https://travis-ci.org/wikimedia/pywikibot-core/jobs/37633111)

Traceback (most recent call last):
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 517, in submit
    headers=headers, body=body)
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/tools.py", line 367, in wrapper
    return method(*__args, **__kw)
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/comms/http.py", line 258, in request
    raise request.data
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/comms/threadedhttp.py", line 219, in request
    max_redirects, connection_type
  File "/home/travis/virtualenv/python3.3.5/lib/python3.3/site-packages/httplib2-0.9_pywikibot1-py3.3.egg/httplib2/__init__.py", line 1292, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "/home/travis/virtualenv/python3.3.5/lib/python3.3/site-packages/httplib2-0.9_pywikibot1-py3.3.egg/httplib2/__init__.py", line 1042, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/home/travis/virtualenv/python3.3.5/lib/python3.3/site-packages/httplib2-0.9_pywikibot1-py3.3.egg/httplib2/__init__.py", line 995, in _conn_request
    response = conn.getresponse()
  File "/opt/python/3.3.5/lib/python3.3/http/client.py", line 1147, in getresponse
    response.begin()
  File "/opt/python/3.3.5/lib/python3.3/http/client.py", line 358, in begin
    version, status, reason = self._read_status()
  File "/opt/python/3.3.5/lib/python3.3/http/client.py", line 320, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/opt/python/3.3.5/lib/python3.3/socket.py", line 297, in readinto
    return self._sock.recv_into(b)
  File "/opt/python/3.3.5/lib/python3.3/ssl.py", line 493, in recv_into
    return self.read(nbytes, buffer)
  File "/opt/python/3.3.5/lib/python3.3/ssl.py", line 367, in read
    v = self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/travis/build/wikimedia/pywikibot-core/tests/site_tests.py", line 616, in testImageusage
    for using in mysite.imageusage(imagepage, filterredir=True, total=5):
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 971, in __iter__
    self.data = self.request.submit()
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 764, in submit
    self._data = super(CachedRequest, self).submit()
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 535, in submit
    self.wait()
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 648, in wait
    raise TimeoutError("Maximum retries attempted without success.")
pywikibot.data.api.TimeoutError: Maximum retries attempted without success.

The lowest travis job number i've found with this problem is 
https://travis-ci.org/wikimedia/pywikibot-core/jobs/37481646
Comment 1 John Mark Vandenberg 2014-10-13 05:48:16 UTC
This may have been caused by changing the travis retry number to 2 : https://gerrit.wikimedia.org/r/#/c/160000/ , which was merged on 13 Sept 3:30 AM

We could try increasing max_retries to 5 to see if that fixes the problem.

Another solution is to disable this test ;-)  Pending patch:
https://gerrit.wikimedia.org/r/160192 from bug 68671

But it is likely this test is doing something odd/extreme and the solution needs to be found.

Another possible cause is the logic to select a page for this test was changed 11 Sept. https://gerrit.wikimedia.org/r/#/c/159633/

Now that 13 Sept looks like the date this problem stated occurring _regularly_, I've found an earlier instance of the problem: 18 Sept 18 2014 https://gerrit.wikimedia.org/r/#/c/161201/
https://travis-ci.org/wikimedia/pywikibot-core/jobs/35616554

Next one I can find is 25 Sept https://gerrit.wikimedia.org/r/#/c/162863/
https://travis-ci.org/wikimedia/pywikibot-core/jobs/36296502

https://travis-ci.org/wikimedia/pywikibot-core/jobs/36903977
https://travis-ci.org/wikimedia/pywikibot-core/jobs/37111303
https://travis-ci.org/wikimedia/pywikibot-core/jobs/37270052

I was able to reproduce this timeout once on my workstation by running the test several times with en.wp set as my default site on a cold cache, sometimes it only times out once, but occasionally it times out three times.

rm -rf tests/apicache/; python -m unittest tests.site_tests.TestSiteObject.testImageusage

I can find pre-13 Sept instances of the same error in this test, but as the retry limit was very high the API code keeps trying to bypass the problem

https://travis-ci.org/wikimedia/pywikibot-core/jobs/35049020#L375

----

testImageusage on a cold cache runs 12 queries; on a warm cache it only makes one live request, caching the other 11.

(In the process I found a regression https://gerrit.wikimedia.org/r/#/c/166367/)

When run against en.wp as default site, site_tests.testImageusage uses [[en:File:Wiktionary-logo-en.svg]] which is used on [[en:!!]]

The query which takes the longest is:

https://en.wikipedia.org/w/api.php?inprop=protection&giulimit=5&maxlag=5&generator=imageusage&format=json&giufilterredir=redirects&prop=info%7Cimageinfo%7Ccategoryinfo&meta=userinfo&indexpageids=&action=query&giutitle=File%3AWiktionary-logo-en.svg&iiprop=timestamp%7Cuser%7Ccomment%7Curl%7Csize%7Csha1%7Cmetadata&uiprop=blockinfo%7Chasmsg

Running wget on that URL, repeatedly, doesnt reproduce the problem.
Comment 2 John Mark Vandenberg 2014-10-13 07:29:44 UTC
Note that the slow query is finding images on redirects only, which sounds like it should be a slow query. e.g.

https://en.wikipedia.org/w/index.php?title=Ambulance%20chaser&redirect=no
Comment 3 Gerrit Notification Bot 2014-10-13 07:38:47 UTC
Change 166375 had a related patch set uploaded by John Vandenberg:
Split testImageusage into four tests

https://gerrit.wikimedia.org/r/166375
Comment 4 John Mark Vandenberg 2014-10-13 23:17:06 UTC
testImageusage

https://travis-ci.org/wikimedia/pywikibot-core/jobs/37837770
https://travis-ci.org/wikimedia/pywikibot-core/jobs/37871882

Here is it failing _once_ only in py2.6

https://travis-ci.org/wikimedia/pywikibot-core/jobs/37876770

now also seeing the same problem with testUsercontribs

https://travis-ci.org/wikimedia/pywikibot-core/jobs/37871878
https://travis-ci.org/wikimedia/pywikibot-core/jobs/37871882

And possible unrelated testUsers has a new assert

FAIL: testUsers (tests.site_tests.SiteUserTestCase)
Test the site.users() method
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/wikimedia/pywikibot-core/tests/site_tests.py", line 1017, in testUsers
    us = list(mysite.users(mysite.user()))
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/site.py", line 3588, in users
    usprop="blockinfo|groups|editcount|registration|emailable"
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 1199, in __init__
    QueryGenerator.__init__(self, list=listaction, **kwargs)
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 819, in __init__
    self.update_limit()  # sets self.prefix
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 914, in update_limit
    for param in self._modules[mod].get("parameters", []):
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 866, in _modules
    assert "paraminfo" in data
AssertionError

https://travis-ci.org/wikimedia/pywikibot-core/jobs/37871882

Please review r/166375 so that subsequent travis builds help narrow down the exact problem.

Could it be HHVM related?
Comment 5 Gerrit Notification Bot 2014-10-13 23:25:16 UTC
Change 166375 merged by jenkins-bot:
Split testImageusage into four tests

https://gerrit.wikimedia.org/r/166375
Comment 6 John Mark Vandenberg 2014-10-14 02:11:01 UTC
As expected it is test_image_usage_in_redirects which is causing the problem

https://travis-ci.org/wikimedia/pywikibot-core/jobs/37887126
https://travis-ci.org/wikimedia/pywikibot-core/jobs/37887726

I think the next step is:

set max-retries to 0 or 1 for all test runs, except keep travis at 2 to avoid red builds

create TestCase.expectedFailureIf , and add @expectedFailureIf(<travis>) to test_image_usage_in_redirect.

This will mean that developer testing will fail if it occurs on their workstation, but it wont cause red builds on Travis since we already know about this problem and are tracking it down.
Comment 7 Gerrit Notification Bot 2014-10-14 08:14:59 UTC
Change 166532 had a related patch set uploaded by John Vandenberg:
@expectedFailureIf(TRAVIS=true) on imageusage test

https://gerrit.wikimedia.org/r/166532
Comment 8 John Mark Vandenberg 2014-10-14 22:43:18 UTC
This one is very strange.  First is the error in test_image_usage_in_redirects , then one of the results in test_image_usage_no_redirect_filter is wrong!

https://travis-ci.org/wikimedia/pywikibot-core/jobs/37948029
Comment 9 Gerrit Notification Bot 2014-10-15 13:29:58 UTC
Change 166532 merged by XZise:
@expectedFailureIf(TRAVIS=true) on imageusage test

https://gerrit.wikimedia.org/r/166532
Comment 10 Mpaa 2014-10-15 20:36:38 UTC
I got this locally if it can help, on wikipedia:

rm -rf tests/apicache/; python -m unittest tests.site_tests.TestImageUsage.test_image_usage_in_redirects

site_tests.TestImageUsage found [[en:File:Wiktionary-logo-en.svg]] on [[en:!!]]
ERROR: Traceback (most recent call last):
  File "pywikibot/data/api.py", line 567, in submit
    headers=headers, body=body)
  File "pywikibot/tools.py", line 529, in wrapper
    return obj(*__args, **__kw)
  File "pywikibot/comms/http.py", line 258, in request
    raise request.data
SSLError: The read operation timed out

WARNING: Waiting 5 seconds before retrying.

...
Comment 11 Mpaa 2014-10-15 20:42:40 UTC
Next run was OK:

site_tests.TestImageUsage found [[en:File:Wiktionary-logo-en.svg]] on [[en:!!]]
[[en:Cock sucker]]
[[en:Ambulance chaser]]
[[en:Cyberpet]]
[[en:Colorfast]]
 35.620s .
----------------------------------------------------------------------
Ran 1 test in 35.621s

OK
Comment 13 John Mark Vandenberg 2014-10-21 05:43:34 UTC
Changes to the tests have been merged.  Underlying problem hasnt been fixed.
Comment 14 John Mark Vandenberg 2014-10-21 06:31:15 UTC
(In reply to John Mark Vandenberg from comment #8)
> This one is very strange.  First is the error in
> test_image_usage_in_redirects , then one of the results in
> test_image_usage_no_redirect_filter is wrong!
> 
> https://travis-ci.org/wikimedia/pywikibot-core/jobs/37948029

Another error in test_image_usage_no_redirect_filter

https://travis-ci.org/wikimedia/pywikibot-core/jobs/38570786

(It is probably not the same bug as the test_image_usage_in_redirect bug, and a new bug should be created once we have more information about the cause.)
Comment 15 Mpaa 2014-10-21 19:35:39 UTC
(In reply to John Mark Vandenberg from comment #13)
> Changes to the tests have been merged.  Underlying problem hasnt been fixed.

I do not think there is an underlying problem in test_image_usage_in_redirects.
The API is already slow by itself when called with those parameters.

One solution would be to change the logic that we currently use for setting the query limits.

I think this can be avoided just raising the number of retrieved tems from 5 to 500 instead of @expectedFailureIf(TRAVIS=true) on imageusage test.
Comment 16 John Mark Vandenberg 2014-11-07 06:43:47 UTC
I've forked the test_image_usage_no_redirect_filter problem to bug 73120.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links