Last modified: 2014-11-20 21:01:16 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T74729, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 72729 - Add Wikibase API module that is usable from client wikis and available as a generator & prop module
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: WikidataClient
Version: unspecified
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Wikidata bugs
URL: https://www.mediawiki.org/wiki/Reques...
Whiteboard: u=dev c=backend p=0
Keywords:
Depends on:
Blocks: 73616
Reported: 2014-10-30 01:12 UTC by Ryan Kaldari
Modified: 2014-11-20 21:01 UTC (History)


Description Ryan Kaldari 2014-10-30 01:12:57 UTC
Extend the MediaWiki API Query module to support basic Wikidata data retrieval locally. This would allow Wikidata data to be included as part of other API queries and even be used with generators (https://www.mediawiki.org/wiki/API:Query#Generators). The minimum requirement would be to retrieve Wikidata descriptions using page titles or ids. (This would facilitate their use in search suggestions.) Other possible capabilities would include retrieving the Wikidata labels, aliases, claims, and inter-language links.
Comment 1 Kunal Mehta (Legoktm) 2014-11-03 18:34:56 UTC
So...basically implement https://www.mediawiki.org/wiki/Requests_for_comment/Wikidata_API ?
Comment 2 Ryan Kaldari 2014-11-03 18:40:59 UTC
Legoktm: Basically, yes.
Comment 3 Daniel Kinzler 2014-11-06 16:07:21 UTC
Yuri's RFC is for use on the Repo, though. The idea there is to use Wikibase stuff as generators. Ryan's request, if I understand it correctly, is to implement a property module that can be used to provide extra properties for pages listed by a generator on a client wiki.
Comment 4 Daniel Kinzler 2014-11-06 16:17:03 UTC
If I understand correctly, the intended use case is this: you have a list of local page titles (e.g. from a prefix search), and want to list them; in the listing, you want to show some extra info from Wikidata, like the description. The suggestion is to allow API queries to include this extra information using an API prop module.

This could be done, but I wonder whether it's worth the effort. You can get the same info easily from Wikidata directly, with a single API call. For example, to get the Wikidata labels and descriptions, in English, associated with the pages Birch, Beech, and Beetle on enwiki, you can use the following query:

http://www.wikidata.org/w/api.php?action=wbgetentities&format=json&sites=enwiki&titles=Birch%7CBeech%7CBeetle&props=labels%7Cdescriptions&languages=en%7Cen-ca%7Cen-gb

Isn't this sufficient?
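
For reference, the wbgetentities query above can be assembled programmatically. This is only a minimal sketch: the helper name build_wbgetentities_url is made up for illustration, and the parameters are exactly the ones that appear in the URL in this comment.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL in this comment.
WIKIDATA_API = "http://www.wikidata.org/w/api.php"


def build_wbgetentities_url(site, titles, props, languages):
    """Build a wbgetentities URL for pages identified by site and title.

    Hypothetical helper; multi-value parameters are joined with "|",
    which urlencode escapes as %7C.
    """
    params = {
        "action": "wbgetentities",
        "format": "json",
        "sites": site,
        "titles": "|".join(titles),
        "props": "|".join(props),
        "languages": "|".join(languages),
    }
    return WIKIDATA_API + "?" + urlencode(params)


url = build_wbgetentities_url(
    "enwiki",
    ["Birch", "Beech", "Beetle"],
    ["labels", "descriptions"],
    ["en", "en-ca", "en-gb"],
)
print(url)
```

The printed URL should match the one quoted above; note that this still means a separate request to wikidata.org on top of the local query.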
Comment 5 Ryan Kaldari 2014-11-06 19:20:07 UTC
Yes, that's basically what folks are currently doing, but it isn't ideal. Ideally, we would like to be able to get regular page props and Wikidata data from a single API call. Also, we would like to avoid the extra DNS lookup of an external HTTP request in high-traffic contexts (like search suggestions) if possible.
Comment 6 Dan Garry 2014-11-07 02:08:11 UTC
I second what Kaldari has said. Sure, it's sufficient, but it shouldn't be necessary. :-)
Comment 7 Daniel Kinzler 2014-11-15 16:17:51 UTC
Considering that with my approach, you would be hitting wbgetentities with a couple of hundred queries from the mobile search interface, I suppose you are right: that isn't going to work. wbgetentities needs to load the full entity structure from the blob store, and that's slow...

We already have the data you want in the wb_terms table. I suppose adding a client side module that works much like the ApiQueryPageProps would be easy enough, and should make this a lot faster.

I can't promise that it will be performant enough, though; I hear the API servers are pretty loaded. An alternative solution would be to add this information directly to Elastic, so it can be returned directly by the search module.

By the way, what do you use to generate the original list of local page titles? action=opensearch? action=wbsearchentities?
Comment 8 Daniel Kinzler 2014-11-16 15:52:20 UTC
I have implemented a pageterms module, see I9b6b52f6b75e4d6a
Comment 9 Bernd Sitzmann 2014-11-17 19:40:43 UTC
Daniel, the apps currently use both prefixsearch and search generators. I can't speak for mobile web, but I guess it's similar. When the user clicks search we perform a title search first, then allow the user to switch to full text search from there. We currently have to collect the wikibase_items and then send off another request to wikidata.org to get the descriptions. Like Kaldari mentioned above, we would like to avoid that. Below are some examples we have currently implemented. 

(1) Title search:
https://en.m.wikipedia.org/w/api.php?action=query&format=json&generator=prefixsearch&gpssearch=foo&gpsnamespace=0&gpslimit=12&prop=pageprops%7Cpageimages&ppprop=wikibase_item&piprop=thumbnail&pithumbsize=96&pilimit=12&list=prefixsearch&pssearch=formula&pslimit=12

(2) Full text search:
https://en.m.wikipedia.org/w/api.php?action=query&format=json&prop=pageprops%7Cpageimages&ppprop=wikibase_item&generator=search&gsrsearch=foo&gsrnamespace=0&gsrwhat=text&gsrinfo=&gsrprop=redirecttitle&gsroffset=0&gsrlimit=12&list=search&srsearch=foo&srnamespace=0&srwhat=text&srinfo=suggestion&srprop=&sroffset=0&srlimit=12&piprop=thumbnail&pithumbsize=96&pilimit=12
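
The two-step flow described in this comment (collect the wikibase_item page props from the local query, then ask wikidata.org for the descriptions) can be sketched roughly as follows. This is an illustration under assumptions, not the apps' actual code: the helper names, the sample response and the item IDs are made up, and the response shape assumed is the default format=json layout, where pages are keyed by page id under query.pages.

```python
from urllib.parse import urlencode


def extract_wikibase_items(query_response):
    """Map page title -> Wikidata item id from an action=query response.

    Assumes prop=pageprops with ppprop=wikibase_item was requested;
    pages without a linked item are skipped.
    """
    pages = query_response.get("query", {}).get("pages", {})
    items = {}
    for page in pages.values():
        item_id = page.get("pageprops", {}).get("wikibase_item")
        if item_id:
            items[page["title"]] = item_id
    return items


def build_descriptions_url(item_ids, language="en"):
    """Build the follow-up wbgetentities request for the collected ids."""
    params = {
        "action": "wbgetentities",
        "format": "json",
        "ids": "|".join(item_ids),
        "props": "descriptions",
        "languages": language,
    }
    return "https://www.wikidata.org/w/api.php?" + urlencode(params)


# Hypothetical response to the title search; titles and ids are placeholders.
sample_response = {
    "query": {
        "pages": {
            "101": {"pageid": 101, "title": "Foo",
                    "pageprops": {"wikibase_item": "Q1"}},
            "102": {"pageid": 102, "title": "Foobar"},
            "103": {"pageid": 103, "title": "Foo Fighters",
                    "pageprops": {"wikibase_item": "Q2"}},
        }
    }
}

items = extract_wikibase_items(sample_response)
followup_url = build_descriptions_url(sorted(items.values()))
print(items)
print(followup_url)
```

The second request is the extra round trip to wikidata.org that a local prop module would make unnecessary.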
Comment 10 Nemo 2014-11-20 21:01:16 UTC
As Kaldari noted, "PageTerms" is not self-explanatory. My first thought was it would contain per-page legal terms (e.g. license).
