Last modified: 2014-11-14 18:40:52 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. Logging in is not possible and, apart from bug reports and their history, links might be broken. See T57192, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 55192 - Repetitive API userinfo queries
Status: NEW
Product: Pywikibot
Classification: Unclassified
Component: General
Version: core-(2.0)
Hardware: All All
Importance: High normal
Assigned To: Pywikipedia bugs
Depends on:
Blocks: pwb30
Reported: 2013-10-05 04:40 UTC by Kunal Mehta (Legoktm)
Modified: 2014-11-14 18:40 UTC (History)

Description Kunal Mehta (Legoktm) 2013-10-05 04:40:50 UTC
Originally from: http://sourceforge.net/p/pywikipediabot/bugs/1470/
Reported by: xqt
Created on: 2012-06-21 13:44:27
Subject: Rewrite Performance (multiple API request)
Original description:
There are multiple userinfo queries, which slow down performance:

c:\Pywikipedia\rw>pwb.py basic.py user:xqt/Test -simulate -v
Pywikipediabot r10326 2012-06-08 12:08:53Z
Python 2.7.3 (default, Apr 10 2012, 23:24:47) [MSC v.1500 64 bit (AMD64)]
Retrieving 1 pages from wikipedia:de.
Starting 1 threads...
API action query: userinfo
Found 1 wikipedia:de processes running, including this one.


>>> Benutzer:Xqt/Test <<<
- Test
+ Test Test
Comment: Bot: Ändere ...
Do you want to accept these changes? ([y]es, [N]o) y
API action query: userinfo
API action query: userinfo
Cosmetic changes for wikipedia-de enabled.
API action query: siteinfo|userinfo
API action query: userinfo
API action edit:
SIMULATION: edit action blocked.
Page [[Benutzer:Xqt/Test]] saved without any changes.
Page [[Benutzer:Xqt/Test]] saved
Dropped throttle(s).
Waiting for threads to finish...
All threads finished.
Dropped throttle(s).

c:\Pywikipedia\rw>
Comment 1 Kunal Mehta (Legoktm) 2013-10-05 04:40:52 UTC
These are multiple API requests, and I guess a lot of them could be cached by a site instance or on disk. This and other parts of the code decrease the performance of pwb 2.0 by 30% (or increase the processing time by 50%), measured with touch.py -start:! -pt:0
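
The "cached by a site instance" idea could look roughly like the sketch below. It is only an illustration: CachedSite, fetch_userinfo and invalidate_userinfo are hypothetical names, not pywikibot's actual API, and the fetch function here is a fake that just counts calls.

```python
class CachedSite:
    """Hypothetical stand-in for a site object; not pywikibot's APISite."""

    def __init__(self, fetch_userinfo):
        # fetch_userinfo: callable that performs the real API request (assumed).
        self._fetch_userinfo = fetch_userinfo
        self._userinfo = None

    @property
    def userinfo(self):
        # Query the API only on first access or after an invalidation.
        if self._userinfo is None:
            self._userinfo = self._fetch_userinfo()
        return self._userinfo

    def invalidate_userinfo(self):
        # Call after login/logout, when the cached data may be stale.
        self._userinfo = None


calls = []


def fake_fetch():
    # Fake request: records the call and returns made-up userinfo.
    calls.append("userinfo")
    return {"name": "Xqt", "rights": ["edit"]}


site = CachedSite(fake_fetch)
for _ in range(5):
    site.userinfo  # five accesses, but only the first triggers a request
request_count = len(calls)
```

The explicit invalidation hook matters because userinfo changes on login and logout; a cache without it would keep serving the old state.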
Comment 2 Kunal Mehta (Legoktm) 2013-10-05 04:40:53 UTC
- **assigned_to**: russblau --> nobody
- **summary**: Multiple user info request --> Rewrite Performance (multiple API request)
Comment 3 John Mark Vandenberg 2014-11-14 13:24:34 UTC
I'm not sure how the code looked before about April 2014, so my comments are unrelated to how the code looked when this bug was raised in 2012.

Since at least 2014, userinfo is added to every query, and the response is used to determine whether the server has a different username than pywikibot expects.
This occurs in usual usage for two reasons:

1. The bot starts logged out but, with the cookies sent, the server may reply with a username, in which case the server considers the bot logged in. Pywikibot then changes the login status of the APISite accordingly.

2. The server invalidates the bot's session, or maybe even its credentials, e.g. when we had a forced password reset.
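
To make the two cases concrete, here is a minimal sketch of the check on a piggybacked userinfo block. It assumes the response is a dict shaped like {"query": {"userinfo": {...}}} with an "anon" key marking an anonymous session; server_confirms_login is a hypothetical helper, not the function pywikibot uses, and the sample responses are made up.

```python
def server_confirms_login(expected_user, response):
    """Return True if the server reports the session as expected_user.

    Assumes response["query"]["userinfo"] exists when userinfo was requested
    and that an "anon" key marks an anonymous (logged-out) session.
    """
    info = response.get("query", {}).get("userinfo", {})
    if "anon" in info:
        # Case 2: the session was invalidated, so the server sees an anonymous user.
        return False
    # Case 1: cookies were accepted and the server names the account.
    return info.get("name") == expected_user


# Made-up sample responses for illustration only.
cookie_login = {"query": {"userinfo": {"id": 42, "name": "Xqt"}}}
expired_session = {"query": {"userinfo": {"id": 0, "name": "192.0.2.1", "anon": ""}}}

status_after_cookies = server_confirms_login("Xqt", cookie_login)
status_after_expiry = server_confirms_login("Xqt", expired_session)
```

A caller would compare the result with its own login flag and update that flag when the two disagree.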

So there are many API requests and responses with a small chunk of extra data.  This could be removed/reduced, with a lot of pain, and little gain.

There are also many times where the code base sends the exact same userinfo+siteinfo request several times, because the login code is a mess.  However, these are cached locally on disk - which is still a performance problem as this requires disk IO for a tiny chunk of data that the code has already parsed and discarded.
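
One way to avoid the repeated disk reads would be a small in-memory layer in front of the disk cache. The sketch below is an assumption about how that might be structured, not the existing caching code: TwoTierCache is a hypothetical class, a plain dict stands in for the on-disk cache, and the fetch callable stands in for the real API request.

```python
import json


class TwoTierCache:
    """Hypothetical memory-then-disk cache keyed by request parameters."""

    def __init__(self, disk):
        self._memory = {}
        self._disk = disk  # dict-like stand-in for the on-disk cache
        self.disk_reads = 0

    @staticmethod
    def _key(params):
        # Stable key: identical parameters always map to the same string.
        return json.dumps(params, sort_keys=True)

    def get(self, params, fetch):
        key = self._key(params)
        if key in self._memory:
            return self._memory[key]  # no disk IO, no request
        if key in self._disk:
            self.disk_reads += 1
            value = self._disk[key]
        else:
            value = fetch(params)  # real request only on a full miss
            self._disk[key] = value
        self._memory[key] = value
        return value


requests_sent = []


def fake_request(params):
    # Fake API call: records the request and returns a made-up response.
    requests_sent.append(params)
    return {"userinfo": {"name": "Xqt"}}


cache = TwoTierCache(disk={})
params = {"action": "query", "meta": "siteinfo|userinfo"}
for _ in range(3):
    result = cache.get(params, fake_request)
```

As with any userinfo cache, entries would need to be dropped whenever the login state changes.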

I fixed a few of these reload scenarios back in July/August, but it is not fun fiddling with the login/relogin sequence.

IMO we should wait until we've released a stable version of 2.0, and then redesign the user/login system, removing the two-user system that is heavily embedded in the current codebase.  That will probably require a breaking change for sysop-bots, but bot-bots should be unaffected.
