Last modified: 2014-02-02 19:33:59 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T57127, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 55127 - core hangs on killed http(s) connection
core hangs on killed http(s) connection
Status: RESOLVED FIXED
Product: Pywikibot
Classification: Unclassified
General (Other open bugs)
core-(2.0)
All All
: High major
: ---
Assigned To: Pywikipedia bugs
:
Depends on:
Blocks: 58392
  Show dependency treegraph
 
Reported: 2013-10-05 04:29 UTC by Kunal Mehta (Legoktm)
Modified: 2014-02-02 19:33 UTC (History)
1 user (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Kunal Mehta (Legoktm) 2013-10-05 04:29:05 UTC
Originally from: http://sourceforge.net/p/pywikipediabot/bugs/1657/
Reported by: valhallasw
Created on: 2013-08-22 15:14:10.237000
Subject: core hangs on killed http(s) connection
Original description:
Steps to reproduce:

>>> p = pywikibot.Page(pywikibot.getSite('nl', 'wikipedia'), 'Wikipedia')
>>> p.get()

# now kill the connection, e.g. using tcpview in windows

>>> p = pywikibot.Page(pywikibot.getSite('nl', 'wikipedia'), 'Wikipedia')
>>> p.get()

# hangs



Expected result:
pywikibot reconnects and executes the .get()

Actual result:
complete hang (deadlock?)
Comment 1 Kunal Mehta (Legoktm) 2013-10-05 04:29:08 UTC
Stack trace during hang:

*** STACKTRACE - START ***


# ThreadID: 140208457426688
File: "/usr/lib/python2.7/threading.py", line 524, in __bootstrap
  self.__bootstrap_inner()
File: "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
  self.run()
File: "connectionkilltest.py", line 16, in run
  for filename, lineno, name, line in traceback.extract_stack(stack):

# ThreadID: 140208472311552
File: "/usr/lib/python2.7/threading.py", line 524, in __bootstrap
  self.__bootstrap_inner()
File: "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
  self.run()
File: "pywikibot/comms/threadedhttp.py", line 343, in run
  item.data = self.http.request(*item.args, **item.kwargs)
File: "pywikibot/comms/threadedhttp.py", line 219, in request
  max_redirects, connection_type
File: "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 1444, in request
  (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
File: "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 1196, in _request
  (response, content) = self._conn_request(conn, request_uri, method, body, headers)
File: "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 1166, in _conn_request
  response = conn.getresponse()
File: "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
  response.begin()
File: "/usr/lib/python2.7/httplib.py", line 407, in begin
  version, status, reason = self._read_status()
File: "/usr/lib/python2.7/httplib.py", line 365, in _read_status
  line = self.fp.readline()
File: "/usr/lib/python2.7/socket.py", line 430, in readline
  data = recv(1)
File: "/usr/lib/python2.7/ssl.py", line 241, in recv
  return self.read(buflen)
File: "/usr/lib/python2.7/ssl.py", line 160, in read
  return self._sslobj.read(len)

# ThreadID: 140208510514944
File: "pwb.py", line 103, in <module>
  run_python_file(fn, args)
File: "pwb.py", line 49, in run_python_file
  exec compile(source, filename, "exec") in main_mod.__dict__
File: "connectionkilltest.py", line 36, in <module>
  p.put_async('test edit')
File: "pywikibot/page.py", line 887, in put_async
  async=True, callback=callback, **kwargs)
File: "pywikibot/page.py", line 872, in put
  async=async, callback=callback, **kwargs)
File: "pywikibot/page.py", line 782, in save
  if not force and not self.botMayEdit():
File: "pywikibot/page.py", line 710, in botMayEdit
  username = self.site.user()
File: "pywikibot/site.py", line 206, in user
  if self.logged_in(True):
File: "pywikibot/site.py", line 842, in logged_in
  if sysop and 'sysop' not in self.userinfo['groups']:
File: "pywikibot/site.py", line 910, in getuserinfo
  uidata = uirequest.submit()
File: "pywikibot/data/api.py", line 284, in submit
  body=paramstring)
File: "pywikibot/comms/http.py", line 118, in request
  request.lock.acquire()
File: "/usr/lib/python2.7/threading.py", line 323, in acquire
  self.__cond.wait()
File: "/usr/lib/python2.7/threading.py", line 243, in wait
  waiter.acquire()

*** STACKTRACE - END ***


So: probably not a deadlock, just the ssl library hanging.
Comment 2 Kunal Mehta (Legoktm) 2013-10-05 04:29:10 UTC
- **Priority**: 5 --> 2
Comment 3 Kunal Mehta (Legoktm) 2013-10-05 04:29:11 UTC
Works without any issues in windows - could be either related to linux (v. windows) or killing the connection locally or upstream (the original was run from a VM). Reducing priority.
Comment 4 Kunal Mehta (Legoktm) 2013-10-05 04:29:13 UTC
- **Group**: confirmed --> rewrite
Comment 5 xqt 2013-10-21 14:25:03 UTC
I do not know whether this is similar this bug reported above and am unable to investigate the reason of that bug. I only found the last output where the bot checks a wiki page every 15 seconds:


defendant: Textkontrolle accuser: Neozoon time: 2013 Okt 20 21:26
>> 05:20:37:

defendant: Textkontrolle accuser: Neozoon time: 2013 Okt 20 21:26
>> 05:20:54:

...
a lot of lines deleted
...
>> 05:42:19:
>> 05:42:33:
...
>> 05:53:41:
>> 05:53:55:
>> 05:54:10:
>> 05:54:24:
>> 05:54:38:
>> 05:54:53:
>> 05:55:07:
>> 05:55:21:

now it stops, CTL-C does not work, I have to kill the process via task manager.
I checked my router and found it had an autoreset of the internet connection 5 minutes before 6 a.m. this morning and I guess this was also yesterday. On the other hand the same bot worked well the three days before.
Comment 6 xqt 2013-10-21 14:28:09 UTC
Additional comment:
pwb.py version

Pywikibot: pywikibot/__init__.py (r-1 (unknown), ???????, 2013/09/21, 15:53:03,
OUTDATED)
Release version: 2.0b1
Python: 2.7.3 (default, Apr 10 2012, 23:24:47) [MSC v.1500 64 bit (AMD64)]
unicode test: ok
Comment 7 xqt 2013-10-25 16:23:27 UTC
Yes my bug depends on a killed internet connection while my router was off for 20 seconds!
Comment 8 xqt 2013-11-20 06:31:51 UTC
Here are the last log entries before the bot hanged:

2013-11-20 05:50:37             api.py,  685 in           __iter__: DEBUG    PropertyGenerator received [u'7165976']; limit=-1
2013-11-20 05:50:51 vm-auto-erl-neu.py,   79 in             output: INFO     >> 05:50:51: 
2013-11-20 05:50:51    threadedhttp.py,   97 in     pop_connection: DEBUG    Retrieved connection from 'https:de.wikipedia.org' pool.
2013-11-20 05:50:51    threadedhttp.py,  215 in            request: DEBUG    ('https://de.wikipedia.org/w/api.php', 'POST', 'maxlag=5000&format=json&bkprop=user%7Cby%7Ctimestamp%7Cexpiry%7Creason&list=blocks&bkusers=&meta=userinfo&action=query&bkend=20131120005959&uiprop=blockinfo%7Chasmsg', {'cookie': 'centralauth_Token=8c1df3ec61f7a92677f4c0b561c30585; centralauth_User=Xqbot; forceHTTPS=1; forceHTTPS=true; dewikiUserName=Xqbot; dewikiSession=3144518b40881659814792e3c0fc34cc; dewikiUserID=627628', 'connection': 'keep-alive', 'Content-Type': 'application/x-www-form-urlencoded', 'user-agent': 'pwb/r-1 (unknown) Pywikipediabot/2.0'}, 5, None)
2013-11-20 05:50:51             api.py,  308 in             submit: DEBUG    API response received:
{"query":{"blocks":[],"userinfo":{"id":627628,"name":"Xqbot"}}}
2013-11-20 05:50:51             api.py,  611 in       update_limit: DEBUG    PropertyGenerator: Set query_limit to 5000.
2013-11-20 05:50:51    threadedhttp.py,   97 in     pop_connection: DEBUG    Retrieved connection from 'https:de.wikipedia.org' pool.
2013-11-20 05:50:51    threadedhttp.py,  215 in            request: DEBUG    ('https://de.wikipedia.org/w/api.php', 'POST', 'maxlag=5000&format=json&rvprop=ids%7Cflags%7Ctimestamp%7Cuser%7Ccomment%7Ccontent&prop=info%7Crevisions&titles=Wikipedia%3AVandalismusmeldung&meta=userinfo&indexpageids=&action=query&uiprop=blockinfo%7Chasmsg', {'cookie': 'centralauth_Token=8c1df3ec61f7a92677f4c0b561c30585; centralauth_User=Xqbot; forceHTTPS=1; forceHTTPS=true; dewikiUserName=Xqbot; dewikiSession=3144518b40881659814792e3c0fc34cc; dewikiUserID=627628', 'connection': 'keep-alive', 'Content-Type': 'application/x-www-form-urlencoded', 'user-agent': 'pwb/r-1 (unknown) Pywikipediabot/2.0'}, 5, None)
Comment 9 Gerrit Notification Bot 2014-01-19 12:49:02 UTC
Change 108325 had a related patch set uploaded by Ladsgroup:
[BUGFIX] fixing bug 55127

https://gerrit.wikimedia.org/r/108325
Comment 10 Merlijn van Deen (test) 2014-02-01 21:41:24 UTC
Minimal test case for the SSL issue:

import httplib2

h = httplib2.Http()
resp, content = h.request("https://en.wikipedia.org/", "GET")
raw_input("kill the connection...")
resp, content = h.request("https://en.wikipedia.org/", "GET")
print "done"


So there are three issues:
 - A connection can hang indefinitely if the connection is killed. There should be some timeout for receiving some data here.
 - The framework does not respond to ctrl-c when a thread doesn't return. This can be solved by changing pywikibot/comms/http.py to be non-blocking (e.g. in a loop with a 100ms time.sleep)
 - Threads do not receive KeyboardInterrupts (which is OK!) but this means a non-returning call will hang forever (i.e. will not be killed when python exits)

Amirs solution might be useful for the last issue, but I'm not 100% sure about that. I think I'd rather hook the atexit handler to kill all processing threads. Maybe we can also start threads as daemons, which should kill them automatically when the main thread exits...
Comment 11 xqt 2014-02-02 09:55:20 UTC
I tried this on _windows_ but I didn't got it hanging:

Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    resp, content = h.request("https://en.wikipedia.org/", "GET")
  File "C:\Pywikipedia\ssh\pywikibot\core\externals\httplib2\python2\httplib2\__init__.py", line 1591, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "C:\Pywikipedia\ssh\pywikibot\core\externals\httplib2\python2\httplib2\__init__.py", line 1333, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "C:\Pywikipedia\ssh\pywikibot\core\externals\httplib2\python2\httplib2\__init__.py", line 1289, in _conn_request
    response = conn.getresponse()
  File "C:\Python27\lib\httplib.py", line 1030, in getresponse
    response.begin()
  File "C:\Python27\lib\httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "C:\Python27\lib\httplib.py", line 365, in _read_status
    line = self.fp.readline()
  File "C:\Python27\lib\socket.py", line 430, in readline
    data = recv(1)
  File "C:\Python27\lib\ssl.py", line 241, in recv
    return self.read(buflen)
  File "C:\Python27\lib\ssl.py", line 160, in read
    return self._sslobj.read(len)
error: [Errno 10060] Ein Verbindungsversuch ist fehlgeschlagen, da die Gegenstelle nach einer bestimmten Zeitspanne nicht richtig reagiert hat, oder die hergestellte Verbindung war fehlerhaft, da der verbundene Host nicht reagiert hat
Comment 12 Gerrit Notification Bot 2014-02-02 14:02:37 UTC
Change 110853 had a related patch set uploaded by Merlijn van Deen:
(bug 55127) Implement network timeouts

https://gerrit.wikimedia.org/r/110853
Comment 13 Merlijn van Deen (test) 2014-02-02 14:04:34 UTC
This solves the basic issue, but does not fix the other two issues (the main thread will not respond to ctrl-c until the timeout is reached; also the thread will not be stopped until the timeout is raised)
Comment 14 Gerrit Notification Bot 2014-02-02 17:52:16 UTC
Change 110866 had a related patch set uploaded by Merlijn van Deen:
(bug 55127) Allow user to cut off pending network requests at shutdown

https://gerrit.wikimedia.org/r/110866
Comment 15 Gerrit Notification Bot 2014-02-02 17:52:47 UTC
Change 110867 had a related patch set uploaded by Merlijn van Deen:
(bug 55127) Allow user to abort during http request

https://gerrit.wikimedia.org/r/110867
Comment 16 Merlijn van Deen (test) 2014-02-02 17:53:12 UTC
And with the latest two patches, this should also be fixed.
Comment 17 Gerrit Notification Bot 2014-02-02 18:58:33 UTC
Change 110853 merged by jenkins-bot:
(bug 55127) Implement network timeouts

https://gerrit.wikimedia.org/r/110853
Comment 18 Gerrit Notification Bot 2014-02-02 18:58:36 UTC
Change 110866 merged by jenkins-bot:
(bug 55127) Allow user to cut off pending network requests at shutdown

https://gerrit.wikimedia.org/r/110866
Comment 19 Gerrit Notification Bot 2014-02-02 18:58:38 UTC
Change 110867 merged by jenkins-bot:
(bug 55127) Allow user to abort during http request

https://gerrit.wikimedia.org/r/110867

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links