Last modified: 2014-07-18 14:45:27 UTC

Wikimedia migrated from Bugzilla to Phabricator. See T56235, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 54235 - Claim.fromJSON() fails on a deleted property
Status: RESOLVED FIXED
Product: Pywikibot
Classification: Unclassified
Component: Wikidata
Version: core-(2.0)
Hardware: All All
Importance: Low normal
Target Milestone: ---
Assigned To: Ricordisamoa
Reported: 2013-09-17 19:12 UTC by Maarten Dammers
Modified: 2014-07-18 14:45 UTC (History)

Description Maarten Dammers 2013-09-17 19:12:51 UTC
Today my bot encountered https://www.wikidata.org/wiki/Q3211336 . This contains the deleted property P288.

This crashes the bot:

Traceback (most recent call last):
  File "C:\pywikibot\core\article_to_category.py", line 97, in <module>
    main()
  File "C:\pywikibot\core\article_to_category.py", line 94, in main
    bot.run()
  File "C:\pywikibot\core\article_to_category.py", line 52, in run
    if not u'P910' in itemB.get().get('claims').keys():
  File "C:\pywikibot\core\pywikibot\page.py", line 2571, in get
    c = Claim.fromJSON(self.repo, claim)
  File "C:\pywikibot\core\pywikibot\page.py", line 2776, in fromJSON
    if claim.getType() == 'wikibase-item':
  File "C:\pywikibot\core\pywikibot\page.py", line 2715, in getType
    self.type = self.repo.getPropertyType(self)
  File "C:\pywikibot\core\pywikibot\site.py", line 3533, in getPropertyType
    dtype = data['entities'][prop.getID().lower()]['datatype']
KeyError: u'p288'


If you look at https://git.wikimedia.org/blob/pywikibot%2Fcore.git/a1d63c8ba608b87d3a3aff6acc8dbbc29085eb8f/pywikibot%2Fpage.py from line 2776 onward, you'll see that it tries to look up the type of the property. That lookup fails because the property has been deleted.

I don't think the lookup is needed, see https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q3211336&format=json :

"P288": [

    {
        "id": "q3211336$00f2cba0-4c06-ee8d-9598-d8c034b663ed",
        "mainsnak": {
            "snaktype": "value",
            "property": "P288",
            "datavalue": {
                "value": {
                    "entity-type": "item",
                    "numeric-id": 2304194
                },
                "type": "wikibase-entityid"
            }
        },
        "type": "statement",
        "rank": "normal"
    }

]

The datavalue in this JSON should be used to construct the claim instead of doing the extra lookup.
Comment 1 Kunal Mehta (Legoktm) 2013-09-17 21:45:58 UTC
Unfortunately that won't work in some cases like commonsMedia or url. If you look at 
https://www.wikidata.org/w/api.php?action=wbgetentities&ids=q76&format=jsonfm:

                            "property": "P18",
                            "datavalue": {
                                "value": "President Barack Obama.jpg",
                                "type": "string"
                            }

We could probably implement a fallback for the types where that will work, like entity, coordinate and time, and emit a warning like "P288 does not exist".
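
A rough sketch of what such a fallback might look like. This is not the pywikibot code: the function name, the mapping table and the exact warning text are made up for illustration, and the mapping only covers datavalue types assumed to correspond to a single datatype. "string" is left out on purpose, since it is shared by commonsMedia and url.

```python
import warnings

# Hypothetical mapping from a datavalue "type" to a property datatype.
# Only value types assumed to be unambiguous are listed; "string" is
# omitted because several datatypes (e.g. commonsMedia, url) share it.
UNAMBIGUOUS_DATATYPES = {
    "wikibase-entityid": "wikibase-item",
    "globecoordinate": "globe-coordinate",
    "time": "time",
}


def guess_datatype(snak):
    """Guess the datatype of a snak whose property could not be looked up.

    Always warns that the property does not exist. Returns the guessed
    datatype, or None if the datavalue type is ambiguous or missing.
    """
    prop = snak.get("property", "?")
    warnings.warn("%s does not exist" % prop)
    value_type = snak.get("datavalue", {}).get("type")
    return UNAMBIGUOUS_DATATYPES.get(value_type)
```

For the P288 mainsnak in the description this would give 'wikibase-item'; for the P18 example above it would give None, so the caller still has to decide what to do with the claim.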

This error is probably rare to run into since the datatype is cached forever, but most scripts need to re-cache everything because the ids went uppercase.
Comment 2 John Mark Vandenberg 2014-05-30 07:27:59 UTC
It looks like the problem mentioned in comment 1 is no longer a problem, as the JSON now includes the mainsnak datatype, whereas I assume, based on comment 1, that it didn't previously.

e.g. q76 JSON now includes "datatype"

"P18": [
    {
        "id": "q76$AA25BE21-FFCA-4C0A-A7B4-C041CBE549F7",
        "mainsnak": {
            "snaktype": "value",
            "property": "P18",
            "datatype": "commonsMedia",
            "datavalue": {
                "value": "President Barack Obama.jpg",
                "type": "string"
            }
        },

I ran into a closely related bug while loading items from an XML dump without access to the wiki the dump came from, for example with a firewall in the way.

If I understand this correctly, any time the JSON parser finds a 'datatype' in a snak, it can use that instead of doing a lookup. The JSON *may* not include this, in which case a lookup is required, and if a lookup isn't possible then the code should raise an appropriate exception.
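
To make the proposed order concrete, here is a minimal sketch, assuming a plain dict for the snak. The names resolve_datatype, UnknownDatatypeError and the lookup callable are hypothetical and not part of pywikibot; the lookup stands in for whatever asks the repository for the property type.

```python
class UnknownDatatypeError(Exception):
    """Hypothetical exception: the datatype of a snak cannot be determined."""


def resolve_datatype(snak, lookup=None):
    """Return the datatype for a snak.

    Uses the 'datatype' key of the snak when present. Otherwise calls
    lookup(property_id), if one was given. Raises UnknownDatatypeError
    when neither source yields a datatype.
    """
    if "datatype" in snak:
        # No round trip to the wiki needed.
        return snak["datatype"]
    prop = snak.get("property", "?")
    if lookup is None:
        raise UnknownDatatypeError(
            "no datatype in JSON for %s and no lookup available" % prop)
    try:
        return lookup(prop)
    except (KeyError, IOError) as err:
        # KeyError: property missing from the response (e.g. deleted).
        # IOError: wiki not reachable (e.g. working from a dump).
        raise UnknownDatatypeError(
            "lookup failed for %s: %r" % (prop, err)) from err
```

With the P18 snak above, resolve_datatype(snak) would return 'commonsMedia' without any lookup. For a snak without 'datatype' and a lookup that raises KeyError, as in the original traceback, the caller gets UnknownDatatypeError instead of a bare KeyError.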
Comment 3 Kunal Mehta (Legoktm) 2014-05-30 18:10:42 UTC
(In reply to John Mark Vandenberg from comment #2)
> It looks like the problem mentioned in comment 1 is no longer a problem, as
> the JSON now includes the mainsnak datatype, whereas I assume it didn't
> previously based on comment 1.
> 

Awesome!

 
> If I understand this correctly, any time the JSON parser finds a 'datatype'
> in a snak, it can use that instead of doing a lookup.  The JSON *may* not
> include this, in which case a lookup is required, and if a lookup isn't
> possible then the code should raise an appropriate exception.

Yup, sounds good.
Comment 4 Gerrit Notification Bot 2014-06-05 19:19:28 UTC
Change 137737 had a related patch set uploaded by Ricordisamoa:
initialize PropertyPage directly with datatype

https://gerrit.wikimedia.org/r/137737
Comment 5 Gerrit Notification Bot 2014-07-18 14:16:15 UTC
Change 137737 merged by jenkins-bot:
initialize Property directly with datatype

https://gerrit.wikimedia.org/r/137737
