Mapping Research Core Dataset (KDSF Basic Data) with the CERIF Data Model version 1.6 (Basic Elements, Result Elements, and Infrastructure Elements)

Please click here for a detailed introduction to the KDSF and CERIF data models and their mapping [in German].

Higher education institutions and non-university research institutions in many science systems use research information systems (RIS) to process and report information about their staff, projects, publications or patents (so-called research information). National and international exchange formats and standards have been developed to support information processing in RIS and to improve the compatibility and interoperability of different systems.

The following tables map the basic data model of the "Research Core Dataset" (RCD; "Kerndatensatz Forschung", KDSF, in German) to the CERIF standard (Common European Research Information Format). The RCD standard aims to harmonize the reporting of research information in the German science system. CERIF is a voluntary standard model for the storage, management and exchange of research information; euroCRIS recommends implementing it in the development and use of Current Research Information Systems (CRIS) [1]. The mapping specifies which elements of the RCD basic data model are also part of CERIF. It is intended to illustrate the structure and content of the RCD data model and to highlight differences and similarities between the two approaches.

Apart from preliminary results produced by the research project for the specification of the RCD [2],[3], the data models of the RCD and CERIF have not previously been mapped to each other. Both CERIF and the RCD have, however, each been mapped to the data model of the open source research information system VIVO [4],[5]. The present mapping is based on a manual comparison of equivalence relationships between the core entities and attributes of the Entity Relationship Models (ERM) of the two standard specifications.

Overall, CERIF is a much more extensive data model than the RCD: it contains a much larger number of time-stamped entities, related attributes and relationships (see the table of model metrics below for illustration). In contrast, the RCD focuses on an aggregated presentation of research information for reporting from the perspective of the reporting institution. This strong focus on reporting also predetermines the temporal dimensions of the collected basic elements. In summary, a large part of the basic elements defined in the RCD is available in CERIF. However, structured doctoral programs and spin-offs are not modeled in CERIF. The results of the mapping are also discussed in a recent manuscript [currently under review]. The mapping is aimed at technical experts (e.g. developers of RIS) who work with these data models.

CERIF and RCD model metrics

The RCD is built on the principle of data economy and focuses on a small set of versatile core data to be collected and reported by all German higher education institutions (HEI) and non-university research institutions. As a result, the RCD data model has far fewer entities than CERIF. In the RCD data model, attributes are attached directly to these entities. The CERIF data model, by contrast, also allows attributes to be expressed via relations held in link entities. The set of classification schemes used in CERIF is not fixed: schemes are optional and can be extended as part of CERIF's flexible semantic layer. The RCD data model instead relies on a smaller, fixed set of standardized classification schemes.
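
The difference between attributes attached to an entity and attributes expressed through link entities can be illustrated with a minimal Python sketch. It is not taken from either specification: the class names RcdPerson and CerifPersOrgUnit, the helper units_on and all values are hypothetical, and the CERIF side is reduced to a simplified set of identifier, classification and date fields.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RcdPerson:
    """RCD style: the attribute sits directly on the entity (hypothetical class)."""
    name: str
    organizational_unit: str  # "hat Organisationseinheit" as a plain attribute


@dataclass
class CerifPersOrgUnit:
    """CERIF style: the relation is a link entity of its own (simplified)."""
    cfPersId: str
    cfOrgUnitId: str
    cfClassSchemeId: str  # classification scheme the role term belongs to
    cfClassId: str        # role term within that scheme
    cfStartDate: date     # time stamp: link valid from
    cfEndDate: date       # time stamp: link valid until


# RCD: one record, one value (made-up example data).
rcd = RcdPerson(name="Jane Doe", organizational_unit="Institute A")

# CERIF: the same person can hold several typed, time-stamped links.
links = [
    CerifPersOrgUnit("pers-1", "org-A", "scheme-roles", "member",
                     date(2015, 1, 1), date(2018, 12, 31)),
    CerifPersOrgUnit("pers-1", "org-B", "scheme-roles", "member",
                     date(2019, 1, 1), date(2024, 12, 31)),
]


def units_on(day: date, person_id: str) -> list[str]:
    """Return the organizational units a person is linked to on a given day."""
    return [link.cfOrgUnitId for link in links
            if link.cfPersId == person_id
            and link.cfStartDate <= day <= link.cfEndDate]
```

In this sketch, a reporting view in the RCD sense would correspond to evaluating the links for a single reference date, for example units_on(date(2020, 1, 1), "pers-1").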




Comparison conditions: (a) elements are included in RCD and CERIF; (b) elements are in RCD, not in CERIF


Area Person (person)
 
RCD Attribute      Entity      Mapping with CERIF 
Altersgruppe (age group)      person      not modeled 
Geburtsdatum (birthdate)      person      cfPers.cfBirthdate 
Geschlecht (gender)      person      cfPers.cfGender 
Name (name)      person      cfPersName_Pers, cfPersName 
Staatsangehörigkeit (citizenship)      person      cfPers_Country 
hat Promotionsberechtigung aus (has doctoral eligibility from)      person      cfPers_Country 
Qualifikation (qualification)      employee      cfPers_Qual, cfQualTitle 
Besoldung (salary)      professors      not modeled 
Bezeichnung der Professur (professorial title)      professors      not modeled 
Gemeinsame Berufung (joint appointment)      professors      not modeled 
Abschlusszeitpunkt (completion date)      qualification procedure      not modeled 
Altersgruppe bei Abschluss (age group at graduation)      qualification procedure      not modeled 
Kooperationspartner bei kooperativer Promotion (cooperation partner in cooperative doctorate)      doctoral candidate      not modeled 
Start der Promotion - titelvergebende Einrichtungen (start of the doctorate - institution awarding the title)      doctoral candidate      not modeled 
Start der Promotion - nicht titelvergebende Einrichtungen (start of the doctorate - non-title awarding institution)      doctoral candidate      not modeled 
Zeitpunkt des Abschlusses des Promotionsverfahrens (time of completion of the doctoral procedure)      doctoral candidate      not modeled 
hat Organisationseinheit (has organizational unit)      person      cfPers_OrgUnit 
hat Qualifikationsverfahren (has qualification procedure)      person      not modeled 
hat Beschäftigung (has employment)      employee      not modeled 
hat Befristung (has limited term contract)      employment      not modeled 
hat Fach (has discipline)      employment      not modeled 
hat Finanzierungsform (has form of financing)      employment      not modeled 
hat Forschungsfeld (has research field)      employment      not modeled 
hat Personalkategorie (has staff category)      employment      not modeled 
hat Tätigkeitsart (has type of work)      employment      not modeled 
hat Bezeichnung (has name)      professors      not modeled 
hat gemeinsame Berufung (has joint appointment)      professors      not modeled 
hat Erstbetreuer (has supervisor)      doctoral candidate      not modeled 
hat Kooperationspartner (has cooperation partner)      doctoral candidate      not modeled 
hat Strukturiertes Promotionsprogramm (has structured doctoral program)       doctoral candidate      not modeled 


Area Drittmittelprojekt (third-party funded project)
 
RCD Attribute      Entity      Mapping with CERIF 
Bewilligungssumme (amount of funding)      third-party funded project      cfProj_Fund::cfAmount 
Drittmitteleinnahmen (third-party funding simple entry accountancy)      third-party funded project      not modeled 
Drittmittelerträge (third-party funding double entry accountancy)      third-party funded project      not modeled 
Förderkennzeichen (grant number)      third-party funded project      not modeled 
Koordinationsrolle (coordinating role)      third-party funded project      not modeled 
KoordinatorEinrichtung (coordinating institution)      third-party funded project      cfProj_OrgUnit 
Projektbeginn (start of project)      third-party funded project      cfProj.cfStartDate 
Projektende (end of project)      third-party funded project      cfProj.cfEndDate 
Titel des Projekts (title of project)      third-party funded project      cfProjTitle  
hat Fach (has discipline)      third-party funded project      cfProj_Class* 
hat Forschungsfeld (has research field)      third-party funded project      cfProj_Class* 
hat Mittelgeber (has type of funder)      third-party funded project      cfProj_Class* 
hat Organisationseinheit (has organizational unit)      third-party funded project      cfProj_OrgUnit 
hat übergeordnetes Projekt (has superordinate project)      third-party funded project      cfProj_Proj 


Area Strukturiertes Promotionsprogramm (structured doctoral program)
 
RCD Attribute      Entity      Mapping with CERIF 
Beteiligte Institutionen (participating institutions)      structured doctoral program      not modeled 
Länder der beteiligten Institutionen (country of participating institutions)      structured doctoral program      not modeled 
Titel des Promotionsprogramms (title of doctoral program)      structured doctoral program      not modeled 
hat Fach (has discipline)      structured doctoral program      not modeled 
hat Finanzierungsform (has form of financing)       structured doctoral program      not modeled 
hat Forschungsfeld (has research field)      structured doctoral program      not modeled 
hat Organisationseinheit (has organizational unit)      structured doctoral program      not modeled 
hat Sprecher (has speaker)      structured doctoral program      not modeled 
hat laufende Promotion (has ongoing doctorate)      structured doctoral program      not modeled 


Area Patent (patent)
 
RCD Attribute      Entity      Mapping with CERIF 
Datum der prioritätsbegründenden Erstanmeldung (date of first priority application)      patent      cfResPat.cfRegistrDate 
Titel des Patents (title of patent)      patent      cfResPatTitle 
Veröffentlichungsnummer (publication number)      patent      cfResPat.cfPatentNum 
Anzahl der Patentfamilien (number of patent families)      patent      not modeled 
Datum der erteilten Patente (date of granted patents)      patent      cfResPat.cfApprovDate 
hat Erfinder (has inventor)      patent      cfPers_ResPat 
hat Fach (has discipline)      patent      cfResPat_Class* 
hat Forschungsfeld (has research field)      patent      cfResPat_Class* 
hat Organisationseinheit (has organizational unit)      patent      cfOrgUnit_ResPat 
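
For readers who want to process the mapping programmatically, the following minimal Python sketch parses the rows of the patent table above and counts how many attributes have a CERIF counterpart. It assumes that columns are separated by runs of two or more spaces, as on this page; the names PATENT_ROWS and parse_row are illustrative only.

```python
import re

# Rows copied from the patent table above (three space-separated columns).
PATENT_ROWS = """\
Datum der prioritätsbegründenden Erstanmeldung (date of first priority application)      patent      cfResPat.cfRegistrDate
Titel des Patents (title of patent)      patent      cfResPatTitle
Veröffentlichungsnummer (publication number)      patent      cfResPat.cfPatentNum
Anzahl der Patentfamilien (number of patent families)      patent      not modeled
Datum der erteilten Patente (date of granted patents)      patent      cfResPat.cfApprovDate
hat Erfinder (has inventor)      patent      cfPers_ResPat
hat Fach (has discipline)      patent      cfResPat_Class*
hat Forschungsfeld (has research field)      patent      cfResPat_Class*
hat Organisationseinheit (has organizational unit)      patent      cfOrgUnit_ResPat
"""


def parse_row(line):
    """Split one flattened table row into (attribute, entity, cerif_targets)."""
    attribute, entity, mapping = re.split(r"\s{2,}", line.strip())
    if mapping == "not modeled":
        targets = []
    else:
        targets = [t.strip() for t in mapping.split(",")]
    return attribute, entity, targets


rows = [parse_row(line) for line in PATENT_ROWS.splitlines() if line.strip()]
modeled = [attribute for attribute, _, targets in rows if targets]
print(f"{len(modeled)} of {len(rows)} patent attributes have a CERIF counterpart")
```

The same function can be applied to the other area tables, since they share the three-column layout.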


Area Ausgründung (spin-off)
 
RCD Attribute      Entity      Mapping with CERIF 
Name der Ausgründung (name of spin-off)      spin-off      not modeled 
Datum der Ausgründung (date of spin-off)      spin-off      not modeled 
hat Fach (has discipline)      spin-off      not modeled 
hat Forschungsfeld (has research field)      spin-off      not modeled 
hat Organisationseinheit (has organizational unit)      spin-off      not modeled 


Area Publikation (publication)
 
RCD Attribute      Entity      Mapping with CERIF 
Förderer (funding organisation of publication)      publication      not modeled 
Förderkennzeichen (grant number of publication)      publication      not modeled 
Identifier (identifier)      publication      cfFedId, cfFedId_Class*, cfDCResourceIdentifier** 
ist Peer Reviewed (is peer-reviewed)      publication      cfResPubl_Class* 
Ressource (resource)      publication      cfDCResourceType** 
Sprachcode (language code)      publication      cfDCLanguage**  
Titel der Publikation (title of publication)      publication      cfResPublTitle 
Veröffentlichungsjahr (publication year)      publication      cfResPubl.cfResPublDate 
Zugangsrechte (access rights)      publication      cfDCRightsMMAccessRights** 
hat Dokumenttyp (has document type)      publication      cfResPubl_Class* 
hat Name der Quelle (has name of source)      publication      not modeled 
hat Fach (has discipline)      publication      cfResPubl_Class* 
hat Format (has format)      publication      cfResPubl_Event, cfEvent_Class*, cfResPubl.cfVol, cfResPubl.cfIssue, cfResPubl.cfStartPage, cfResPubl.cfEndPage 
hat Forschungsfeld (has research field)      publication      cfResPubl_Class* 
hat Organisationseinheit (has organizational unit)      publication      cfOrgUnit_ResPubl 
hat Publikationstyp (has publication type)      publication      cfResPubl_Class* 
hat Qualifikationsschrift (has thesis)      publication      cfResPubl_Class* 
hat Schöpfer (has creator)      publication      cfPers_ResPubl 
hat Unterstützung durch (has support by)      publication      not modeled 
hat Forschungsinfrastruktur (has research infrastructure)      publication      cfResPubl_Equip 
hat Verlag (has publisher)      publication      cfOrgUnit_ResPubl 
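
Several CERIF targets in the publication table carry asterisk markers, which are explained in the hints at the end of this page. The following Python sketch shows one possible way to separate the markers from the entity names when processing the table; the helper split_marker and the dictionary MARKER_NOTES are hypothetical, and the note texts paraphrase the hints.

```python
def split_marker(target: str) -> tuple[str, str]:
    """Separate a CERIF target name from its trailing asterisk marker."""
    name = target.rstrip("*")
    marker = target[len(name):]
    return name, marker


# Meaning of the markers, paraphrased from the hints section of this page.
MARKER_NOTES = {
    "": "regular entity or attribute",
    "*": "link entity storing classification schemes and terms",
    "**": "Dublin Core part, deprecated in CERIF 1.6",
}

# Mapping cell copied from the "Identifier" row of the publication table.
cell = "cfFedId, cfFedId_Class*, cfDCResourceIdentifier**"

for target in (t.strip() for t in cell.split(",")):
    name, marker = split_marker(target)
    print(f"{name}: {MARKER_NOTES[marker]}")
```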


Area Forschungsinfrastruktur (research infrastructure)
 
RCD Attribute      Entity      Mapping with CERIF 
Beschreibung der Forschungsinfrastruktur (description of research infrastructure)      research infrastructure      cfEquipDescr 
Bezeichnung der Forschungsinfrastruktur (name of research infrastructure)      research infrastructure      cfEquipName 
hat Art der Forschungsinfrastruktur (has kind of research infrastructure)      research infrastructure      cfEquip_Class* 
hat Betreiber (has operator)      research infrastructure      cfOrgUnit_Equip 
hat Organisationseinheit (has organizational unit)      research infrastructure      cfOrgUnit_Equip 
hat Betriebspersonal (has operating staff)      research infrastructure      cfPers_Equip 
hat Koordinator (has coordinator)      research infrastructure      cfOrgUnit_Equip 
hat Nutzung / Nutzungsintensität (has use intensity)      research infrastructure      not modeled 
hat Publikation (has publication)      research infrastructure      cfResPubl_Equip 
hat Typ der Forschungsinfrastruktur (has type of research infrastructure)      research infrastructure      cfEquip_Class* 
hat Zugangsart (has type of access)      research infrastructure      cfEquip_Class* 


Hints:

  • The RCD basic data model includes the definition of objects as well as their attributes (object-specific attributes) and relationships with one another (linkage and assignment attributes, e.g. "has ...").

  • * In CERIF version 1.6, this linking entity is used to store different classification schemes and terms. The CERIF classification scheme is available as an Excel file in version 1.5.

  • ** In CERIF version 1.6, the complete Dublin Core part is deprecated; the concept of federated identifier should be used instead.
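
The first asterisk note explains why several RCD attributes of one area (for example "hat Fach", "hat Forschungsfeld" and "hat Mittelgeber" for projects) map to the same CERIF link entity. A minimal Python sketch of this idea follows. The rows are a simplified stand-in for cfProj_Class, and every scheme and term identifier is a made-up placeholder, not an official CERIF or RCD value.

```python
from collections import defaultdict

# Simplified cfProj_Class rows: (cfProjId, cfClassSchemeId, cfClassId).
# All identifiers are hypothetical placeholders.
CF_PROJ_CLASS = [
    ("proj-1", "scheme-discipline", "term-physics"),
    ("proj-1", "scheme-research-field", "term-energy"),
    ("proj-1", "scheme-funder-type", "term-public"),
    ("proj-2", "scheme-discipline", "term-history"),
]

# Hypothetical assignment of RCD attributes to classification schemes.
RCD_ATTRIBUTE_TO_SCHEME = {
    "hat Fach": "scheme-discipline",
    "hat Forschungsfeld": "scheme-research-field",
    "hat Mittelgeber": "scheme-funder-type",
}


def rcd_view(project_id):
    """Collect the RCD attributes of one project from the shared link entity."""
    scheme_to_attribute = {s: a for a, s in RCD_ATTRIBUTE_TO_SCHEME.items()}
    view = defaultdict(list)
    for proj_id, scheme_id, class_id in CF_PROJ_CLASS:
        if proj_id == project_id and scheme_id in scheme_to_attribute:
            view[scheme_to_attribute[scheme_id]].append(class_id)
    return dict(view)
```

In this sketch the classification scheme identifier, not the link entity itself, determines which RCD attribute a row corresponds to.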



References:

    [1] Simons, E. (2014). EuroCRIS and CERIF. The Importance of an International Standard Metadata Model for Research Information. SK-CRIS Event, CVTI SR, Bratislava, Apr 2nd, 2014.

    [2] Institut für Forschungsinformation und Qualitätssicherung et al. (2015). Ergebnisbericht zum Projekt "Kerndatensatz Forschung", Berlin, 13.10.2015.

    [3] Quix, C., Riechert, M. (2017). Modelling national research information contexts based on CERIF. Procedia Computer Science, 106, 253-259.

    [4] Lezcano, L., Jörg, B., Lowe, B., & Corson-Rikert, J. (2013). Promoting International Interoperability of Research Information Systems: VIVO and CERIF. Journal of Universal Computer Science, 19(12), 1854-1867.

    [5] Walther, T., Hauschke, C., & Kasprzik, A. (2019). The Research Core Dataset (KDSF) in the Linked Data context. Procedia Computer Science, 146, 29-38.




    Mapping RCD with the CERIF Data Model generated by KDSF Helpdesk