Last modified: 2013-10-25 14:32:15 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T57100, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 55100 - Use API module 'parse' for retrieving interwiki links
Status: RESOLVED FIXED
Product: Pywikibot
Classification: Unclassified
Component: interwiki.py
Version: unspecified
Hardware: All All
Importance: Low enhancement
Target Milestone: ---
Assigned To: Pywikipedia bugs
Reported: 2013-10-05 04:23 UTC by Kunal Mehta (Legoktm)
Modified: 2013-10-25 14:32 UTC (History)

Description Kunal Mehta (Legoktm) 2013-10-05 04:23:26 UTC
Originally from: http://sourceforge.net/p/pywikipediabot/feature-requests/151/
Reported by: melancholie
Created on: 2008-06-13 14:47:11
Subject: Use API module 'parse' for retrieving interwiki links
Original description:
Currently pages are retrieved in a batch by using Special:Export.
Although this is fast (as only one request is made), there is a huge data overhead with this method!

Why not use the API with its 'parse' module? It can fetch only the interwiki links, reducing traffic (overhead) a lot!

See:
http://de.wikipedia.org/w/api.php?action=parse&format=xml&page=Test&prop=langlinks

Outputs could be downloaded in parallel to simulate a batch (faster).
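
The request above and the parallel-download idea could be sketched in Python roughly as follows. This is an illustration only, not Pywikibot's code: the function names, the caller-supplied `fetch` callable, and the assumed response shape (one `<ll lang="...">title</ll>` element per interwiki link) are assumptions, not taken from the report.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

API_URL = "http://de.wikipedia.org/w/api.php"  # endpoint used in the report


def build_langlinks_url(title, api_url=API_URL):
    """Build an action=parse request that asks only for language links."""
    params = [("action", "parse"), ("format", "xml"),
              ("page", title), ("prop", "langlinks")]
    return api_url + "?" + urlencode(params)


def parse_langlinks(xml_text):
    """Extract (language code, title) pairs; assumes <ll lang="..."> elements."""
    root = ET.fromstring(xml_text)
    return [(ll.get("lang"), ll.text or "") for ll in root.iter("ll")]


def fetch_all(titles, fetch, max_workers=4):
    """Request several pages in parallel; `fetch` maps a URL to response text."""
    titles = list(titles)

    def one(title):
        return parse_langlinks(fetch(build_langlinks_url(title)))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps the results in the same order as `titles`
        return dict(zip(titles, pool.map(one, titles)))
```

Here `fetch` is left to the caller (for example, a thin wrapper around `urllib.request.urlopen`), so throttling and error handling stay outside the sketch.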

----
At least make this method optional (config.py) so that data traffic can be reduced, if wanted. The API is just more efficient.
Comment 1 Kunal Mehta (Legoktm) 2013-10-05 04:23:28 UTC
- **priority**: 5 --> 7
Comment 2 Kunal Mehta (Legoktm) 2013-10-05 04:23:30 UTC
Logged In: YES 
user_id=2089773
Originator: YES

Note: Maybe combine it with 'generator'.
Comment 3 Kunal Mehta (Legoktm) 2013-10-05 04:23:31 UTC
- **summary**: Use API module parse for retrieving interwiki links --> Use API module 'parse' for retrieving interwiki links
Comment 4 Kunal Mehta (Legoktm) 2013-10-05 04:23:33 UTC
Logged In: YES 
user_id=2089773
Originator: YES

Important note for getting pages' interwikis in a batch:
http://de.wikipedia.org/w/api.php?action=parse&text={{:Test}}{{:Bot}}{{:Haus}}&prop=langlinks

Either the bot could then figure out which interwikis belong together, or maybe a marker could be placed in between:
http://de.wikipedia.org/w/api.php?action=parse&text={{:Test}}{{MediaWiki:Iwmarker}}{{:Bot}}{{MediaWiki:Iwmarker}}{{:Haus}}&prop=langlinks

[[MediaWiki:Iwmarker]] (or 'Llmarker'?) would have to be set up by the MediaWiki developers with [[en:/de:Abuse-save-mark]] as content (but this is potentially misusable).
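
A minimal Python sketch of the marker idea follows. It assumes, without confirmation from the report, that the API returns language links in document order and repeats the marker link each time it is transcluded. The function names and the marker's representation as a `(lang, title)` tuple are hypothetical.

```python
def build_batch_text(titles, marker="MediaWiki:Iwmarker"):
    """Transclude every page, separated by the (proposed) marker message."""
    separator = "{{%s}}" % marker
    return separator.join("{{:%s}}" % title for title in titles)


def split_by_marker(titles, langlinks, marker_link):
    """Assign an ordered list of (lang, title) links back to their pages.

    Assumes the links arrive in document order and that the marker link
    shows up once between each pair of transcluded pages.
    """
    groups = [[]]
    for link in langlinks:
        if link == marker_link:
            groups.append([])  # marker reached: the next page starts here
        else:
            groups[-1].append(link)
    if len(groups) != len(titles):
        raise ValueError("marker count does not match the number of pages")
    return dict(zip(titles, groups))
```

If the server merged duplicate links or reordered them, the marker count would not match and `split_by_marker` would raise instead of silently mixing up pages.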
Comment 5 Kunal Mehta (Legoktm) 2013-10-05 04:23:35 UTC
Logged In: YES 
user_id=2089773
Originator: YES

To avoid being misusable or confusing bots, the yet-to-be-set-up MediaWiki message could contain [[foreigncode:{{CURRENTTIMESTAMP}}]] (cache issue?)

(sorry for spamming with this request ;-)
Comment 6 Kunal Mehta (Legoktm) 2013-10-05 04:23:37 UTC
Logged In: YES 
user_id=1806226
Originator: NO

Backwards compatibility with non-Wikimedia wikis?
Comment 7 Kunal Mehta (Legoktm) 2013-10-05 04:23:38 UTC
Logged In: YES 
user_id=2089773
Originator: YES

Backwards compatibility?

That's no reason for not making software more efficient, where possible ;-)
That's also why I wrote something about "optional".
Because current MediaWiki wikis offer a much more efficient way of retrieving (only) certain contents (langlinks, categories), there should be a way to use that advantage! It will reduce load (the bot owner's and the server's)...
Comment 8 Kunal Mehta (Legoktm) 2013-10-05 04:23:40 UTC
Logged In: YES 
user_id=2089773
Originator: YES

See http://meta.wikimedia.org/wiki/Interwiki_bot_access_protocol concerning disambiguations and redirects:

http://de.wikipedia.org/w/api.php?action=parse&format=xml&text={{:Main_Page}}{{:Bot}}&prop=langlinks|templates
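
As a rough Python sketch, the combined request could be built and its template list inspected as shown below. The helper names, the assumed `<tl>` element shape in the response, and the idea of passing in a set of disambiguation template names are assumptions for illustration. Neither the report nor the linked protocol page is being quoted here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

API_URL = "http://de.wikipedia.org/w/api.php"  # endpoint used in the report


def build_batch_url(titles, props=("langlinks", "templates"), api_url=API_URL):
    """Ask for several properties of the transcluded pages in one request."""
    text = "".join("{{:%s}}" % title for title in titles)
    params = [("action", "parse"), ("format", "xml"),
              ("text", text), ("prop", "|".join(props))]
    return api_url + "?" + urlencode(params)  # urlencode escapes braces and "|"


def template_names(xml_text):
    """List used templates; assumes one <tl>Template:Name</tl> per template."""
    return [tl.text or "" for tl in ET.fromstring(xml_text).iter("tl")]


def uses_any_template(xml_text, wanted):
    """True if any template in `wanted` (e.g. disambiguation markers) is used."""
    return any(name in wanted for name in template_names(xml_text))
```

Which template names mark a disambiguation page differs per wiki, so the caller has to supply `wanted`.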
Comment 9 Kunal Mehta (Legoktm) 2013-10-05 04:23:42 UTC
We are working on a rewrite. The rewrite uses the API as much as possible.
Comment 10 Kunal Mehta (Legoktm) 2013-10-05 04:23:44 UTC
Parse mode is deactivated because it overloaded the Squid caches. Nothing to do now.
Comment 11 Kunal Mehta (Legoktm) 2013-10-05 04:23:45 UTC
- **priority**: 7 --> 1
Comment 12 Amir Ladsgroup 2013-10-25 14:32:15 UTC
Fixed in http://www.mediawiki.org/wiki/Special:Code/pywikipedia/11229
