Wikidata as authority linking hub: The examples of RePEc and GND

Posted: January 16th, 2018 | Filed under: EDaWaX, found on the net

From time to time, I post more technically oriented articles on this blog. This one is about the opportunities to use Wikidata as an authority linking hub, e.g. for correctly identifying the authors of scientific publications or data. For those developing research infrastructures, offering a suitable solution for correctly identifying an author is often a complex task. Most often it is up to the researchers to indicate their names correctly and to provide a personal identifier (PI), such as an ORCID iD or a RePEc Short-ID.

On ZBW Labs, my colleague Joachim Neubert has published a very interesting blog post about connecting such personal identifiers. In particular, he discusses the possibilities of connecting researchers’ personal identifiers from the RePEc (Research Papers in Economics) Author Service (RAS) to those of the GND (Integrated Authority File).

The vehicle for connecting these PIs is Wikidata. Wikidata is a large database which connects all of the roughly 300 Wikipedia projects. Besides interlinking all Wikipedia pages in different languages about a specific item (e.g., a person), it also connects to more than 1000 different sources of authority information. These characteristics make it possible to connect PIs, as Joachim illustrates using the example of EconBiz, ZBW’s portal for publications in economics. EconBiz includes data from different sources. In some of these sources, authors are disambiguated by identifiers of the German Integrated Authority File (GND), in total more than 470,000. Data stemming from “Research Papers in Economics” (RePEc) contains another identifier: RePEc authors can register themselves in the RePEc Author Service (RAS) and claim their papers. This data is used for various rankings of authors and, indirectly, of institutions in economics, which gives authors a big incentive to keep both their article claims and their personal data up to date; about 50,000 have registered with RAS. While GND is well known in Germany’s research infrastructure community and links to many other authorities, RAS had no links to any other researcher identifier system. Thus, until recently, the two sets of author identifiers were disconnected, which made it impossible to display all publications of an author on a single portal page.
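To make the idea of a linking hub a little more concrete, here is a minimal Python sketch of how one might ask Wikidata for items that carry both identifiers. It is only an illustration under stated assumptions and not the method described in Joachim's post: the property IDs P227 (GND ID) and P2428 (RePEc Short-ID) are given to the best of my knowledge and should be checked on wikidata.org, the function names are hypothetical, and the sample response uses made-up identifier values.

```python
import json
import urllib.parse
import urllib.request

# Assumed Wikidata property IDs; verify them on wikidata.org before use.
GND_ID = "P227"            # GND ID
REPEC_SHORT_ID = "P2428"   # RePEc Short-ID
ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(limit=10):
    """Return a SPARQL query for items that carry both identifiers."""
    return (
        "SELECT ?item ?gnd ?repec WHERE { "
        f"?item wdt:{GND_ID} ?gnd ; wdt:{REPEC_SHORT_ID} ?repec . "
        f"}} LIMIT {int(limit)}"
    )


def repec_to_gnd(results):
    """Map RePEc Short-IDs to GND IDs from a SPARQL JSON result document."""
    mapping = {}
    for row in results["results"]["bindings"]:
        mapping[row["repec"]["value"]] = row["gnd"]["value"]
    return mapping


def fetch(query):
    """Send the query to the public endpoint (requires network access)."""
    url = ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    request = urllib.request.Request(
        url, headers={"User-Agent": "pi-linking-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hand-made response in the SPARQL JSON results layout;
    # the identifier values are placeholders, not real records.
    sample = {
        "results": {
            "bindings": [
                {
                    "item": {"value": "http://www.wikidata.org/entity/Q0"},
                    "gnd": {"value": "000000000"},
                    "repec": {"value": "pxx1"},
                }
            ]
        }
    }
    print(build_query(5))
    print(repec_to_gnd(sample))
```

The sketch keeps the network call in its own function, so the query building and the parsing can be tried offline with the sample data. Replacing `sample` with `fetch(build_query(5))` would run the query against the live service.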

In his blog entry, Joachim presents an approach to overcoming these limitations.

Joachim works in ZBW’s department ‘Innovative Information Systems and Publishing Technologies (IIPT)’. In this position, he has published the STW Thesaurus for Economics and the 20th Century Press Archives as Linked Open Data and has developed linked-data-based web services for economics.

Graphic: “Linking Open Data cloud diagram 2017”, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/ License: CC-BY-SA 3.0


