Last modified: 2014-11-09 02:47:32 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T68102, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 66102 - use one library for all http requests
Status: NEW
Product: Pywikibot
Classification: Unclassified
Component: network
Version: core-(2.0)
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: John Mark Vandenberg
Depends on:
Blocks:
Reported: 2014-06-04 00:00 UTC by John Mark Vandenberg
Modified: 2014-11-09 02:47 UTC (History)


Description John Mark Vandenberg 2014-06-04 00:00:40 UTC
pywiki mostly depends on httplib2.

There are a few cases of urllib.urlopen (and others) being used in the pywikibot library code, and a number of scripts which use other http request routines.

Multiple routines result in multiple configurations and multiple sets of possible errors.

Has there been any investigation whether requests or urllib3 would suit our needs better (e.g. offloading some problems onto another project)?
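
A minimal sketch of what a single entry point could look like, assuming Python 3's standard library rather than httplib2, requests, or urllib3. The names fetch, HttpError, USER_AGENT and TIMEOUT are hypothetical and not part of pywikibot; the point is only that configuration and error handling live in one place.

```python
import urllib.request


class HttpError(Exception):
    """Single error type raised for any failed request (hypothetical name)."""


# Hypothetical shared settings: one place to configure every request.
USER_AGENT = "example-bot/0.1"
TIMEOUT = 30


def fetch(url, opener=urllib.request.urlopen):
    """Fetch url and return the response body as bytes.

    `opener` is injectable so the transport can be swapped, or faked in
    tests, without touching any caller.
    """
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        with opener(request, timeout=TIMEOUT) as response:
            return response.read()
    except OSError as exc:
        # urllib.error.URLError is a subclass of OSError in Python 3,
        # so network and HTTP failures all surface as HttpError.
        raise HttpError("request for %s failed: %s" % (url, exc)) from exc
```

With a wrapper like this, callers would only ever handle HttpError, and changing the underlying library would mean changing the default opener.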
Comment 1 John Mark Vandenberg 2014-07-01 02:33:28 UTC
There may be issues with using httplib2 for large downloads, such as those possible in upload.py.

https://github.com/jcgregorio/httplib2/issues/224

A fork has been created for that, and for distributed caching.

https://github.com/madlag/streaming_httplib2
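
For illustration, assuming the concern with large downloads is that the whole response body ends up in memory at once, the usual alternative is to copy the body in fixed-size chunks. The sketch below does not use the httplib2 or streaming_httplib2 API; it works on any file-like response object, and stream_to_file and CHUNK_SIZE are hypothetical names.

```python
import io

CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune as needed


def stream_to_file(response, out_file, chunk_size=CHUNK_SIZE):
    """Copy a file-like response to out_file in fixed-size chunks.

    Returns the number of bytes written. At most `chunk_size` bytes of
    the body are held in memory at any time.
    """
    total = 0
    while True:
        chunk = response.read(chunk_size)
        if not chunk:
            break
        out_file.write(chunk)
        total += len(chunk)
    return total


# Usage with an in-memory stand-in for a response body.
source = io.BytesIO(b"x" * 200000)
target = io.BytesIO()
written = stream_to_file(source, target)
```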
Comment 2 John Mark Vandenberg 2014-08-06 21:07:56 UTC
site.py and weblib.py use 'import urllib', but only for urlencode.

urllib:
pywikibot/page.py:1841:        f = urllib.urlopen(self.fileUrl())
pywikibot/version.py:199:    buf = urllib.urlopen(url).readlines()

scripts/upload.py
scripts/flickrripper.py
scripts/checkimages.py
scripts/weblinkchecker.py
scripts/imagerecat.py
scripts/maintenance/wikimedia_sites.py
scripts/data_ingestion.py

urllib2:
scripts/reflinks.py

httplib (not httplib2):
pywikibot/version.py:123:    conn = httplib.HTTPSConnection('github.com')

scripts/weblinkchecker.py
scripts/reflinks.py
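
A list like the one above could be regenerated with a small import scan. This is a sketch, not the method used to produce the list: find_http_imports and HTTP_MODULES are hypothetical names, it reports imports only (not individual call sites such as urllib.urlopen), and it only works on source that parses under the interpreter running it.

```python
import ast

# Module names taken from the list above; httplib2 is deliberately excluded.
HTTP_MODULES = {"urllib", "urllib2", "httplib"}


def find_http_imports(source):
    """Return sorted (line, module) pairs for imports of HTTP_MODULES."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            # Match on the top-level package, so "urllib.parse" counts as urllib.
            top_level = name.split(".")[0]
            if top_level in HTTP_MODULES:
                found.add((node.lineno, top_level))
    return sorted(found)


sample = "import urllib\nimport httplib2\nfrom urllib2 import urlopen\n"
hits = find_http_imports(sample)
```

Files that import urllib only for urlencode would still be reported, so the output needs the same manual review as the list above.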
Comment 3 Gerrit Notification Bot 2014-08-07 00:33:08 UTC
Change 152200 had a related patch set uploaded by John Vandenberg:
HTTP requests with user-agent without version

https://gerrit.wikimedia.org/r/152200
Comment 4 Gerrit Notification Bot 2014-08-10 12:19:42 UTC
Change 153300 had a related patch set uploaded by John Vandenberg:
Replace httplib and urllib with httplib2

https://gerrit.wikimedia.org/r/153300
Comment 5 Gerrit Notification Bot 2014-09-03 21:11:07 UTC
Change 152200 merged by jenkins-bot:
User-agent graceful degradation

https://gerrit.wikimedia.org/r/152200
Comment 6 Gerrit Notification Bot 2014-09-04 20:34:09 UTC
Change 153300 merged by jenkins-bot:
Replace httplib and urllib with httplib2

https://gerrit.wikimedia.org/r/153300
Comment 7 John Mark Vandenberg 2014-10-06 11:21:13 UTC
version.py now uses httplib2.

In addition to the list above, generate_family_file.py also uses urllib2.
Comment 8 John Mark Vandenberg 2014-11-09 02:47:32 UTC
https://github.com/ross/python-asynchttp might be a good solution, but it doesn't appear to be very active.
