Parse Wikidata Entries
Introduction
Parsing Wikidata entries is not trivial. Every piece of information associated with an entry is stored as a claim, and each claim is identified by a property ID. So you either build your own lookup for the properties you need and correlate them with the JSON export of an entry, or you find a library that does that for you.
For example, the inception date is property "P571", which looks like this in the JSON (references removed):
"P571": [ { "mainsnak": { "snaktype": "value", "property": "P571", "hash": "51b760062c35d828aa817b95777ea0830b3c21ba", "datavalue": { "value": { "time": "+1922-00-00T00:00:00Z", "timezone": 0, "before": 0, "after": 0, "precision": 9, "calendarmodel": "http://www.wikidata.org/entity/Q1985727" }, "type": "time" }, "datatype": "time" }, "type": "statement", "id": "Q613766$5A06D69F-843D-4761-9228-537E0F56DB53", "rank": "normal", "references": [ "<snip>" ] } ]
But the only thing I need is the year, which has to be derived from the timestamp and the precision (9 means year). Working this out by hand for every property seems like a lot of work.
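To give an idea of what the do-it-yourself route involves, here is a minimal sketch that works on a trimmed copy of the claim above. The names `PROPERTY_LABELS`, `extract_year` and `years_by_label` are made up for this illustration and are not part of any library. It assumes that only precision 9 (year) or finer is of interest, and it reads the year with a regular expression because month and day can be "00", which `datetime` cannot parse.

```python
import re

# Hypothetical hand-maintained lookup: property ID -> human-readable label.
PROPERTY_LABELS = {"P571": "inception"}

# Trimmed copy of the claim shown above (hash, id, rank and references dropped).
claims = {
    "P571": [
        {
            "mainsnak": {
                "snaktype": "value",
                "property": "P571",
                "datavalue": {
                    "value": {"time": "+1922-00-00T00:00:00Z", "precision": 9},
                    "type": "time",
                },
            }
        }
    ]
}

YEAR_PRECISION = 9  # Wikidata precision codes: 9 = year, 10 = month, 11 = day


def extract_year(time_value):
    """Return the year as an int, or None if the value is coarser than a year."""
    if time_value["precision"] < YEAR_PRECISION:
        return None
    # Match the signed year prefix directly instead of parsing a full date.
    match = re.match(r"([+-]\d+)-", time_value["time"])
    return int(match.group(1)) if match else None


def years_by_label(claims):
    """Map each known time-valued property label to the list of years found."""
    result = {}
    for prop_id, statements in claims.items():
        label = PROPERTY_LABELS.get(prop_id)
        if label is None:
            continue  # property is not in our lookup
        years = []
        for statement in statements:
            snak = statement["mainsnak"]
            # "novalue" and "somevalue" snaks carry no datavalue.
            if snak["snaktype"] != "value" or snak["datavalue"]["type"] != "time":
                continue
            years.append(extract_year(snak["datavalue"]["value"]))
        result[label] = years
    return result


print(years_by_label(claims))  # {'inception': [1922]}
```

Even this small sketch only covers time values. Every other datatype (items, quantities, strings, coordinates) would need its own handling, which is the main argument for using a library.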
Wikipedia tools (for Humans)
Link: https://github.com/siznax/wptools
Example usage:
    import wptools

    page = wptools.page(wikibase="Q613766", silent=True)
    page.get_wikidata()

    # inception year
    print(page.data["claims"]["P571"])
    # -> ['+1922-00-00T00:00:00Z']

    # show label of P571
    print(page.data["labels"]["P571"])
    # -> inception
qwikidata
Link: https://github.com/kensho-technologies/qwikidata
Example usage:
    from qwikidata.linked_data_interface import get_entity_dict_from_api
    from qwikidata.entity import WikidataItem

    entity = WikidataItem(get_entity_dict_from_api("Q613766"))

    # first P571 claim
    claim = entity.get_truthy_claim_group("P571")[0]

    # datavalue
    datavalue = claim.mainsnak.datavalue

    # value
    print(datavalue.value)
    # -> {'time': '+1922-00-00T00:00:00Z', 'timezone': 0, 'before': 0, 'after': 0,
    #     'precision': 9, 'calendarmodel': 'http://www.wikidata.org/entity/Q1985727'}

    # parsed value
    print(datavalue.get_parsed_datetime_dict())
    # -> {'year': 1922, 'month': 0, 'day': 0, 'hour': 0, 'minute': 0, 'second': 0}
Conclusion
Which one fits depends on your needs. "wptools" is a bit more abstracted and requires less knowledge of the internals of the Wikidata JSON, but some information is omitted (in the example above, the precision of the date). "qwikidata", on the other hand, stays pretty close to the Wikidata JSON, so everything is there.