How to identify the roots of broad research topics and fields? The introduction of RPYS sampling

Haunschild, R. ; Marx, W. ; Thor, A. ; Bornmann, L.

Journal of Information Science

2019 / 04

Other

Further information: https://doi.org/10.1177/0165551519837175

Abstract

How to identify the roots of broad research topics and fields? The introduction of RPYS sampling using the example of climate change research

Since the introduction of the reference publication year spectroscopy (RPYS) method and the corresponding programme CRExplorer, many studies have been published revealing the historical roots of topics, fields and researchers. The application of the method was restricted up to now by the available memory of the computer used for running the CRExplorer. Thus, many users could not perform RPYS for broader research fields or topics. In this study, we present various sampling methods to solve this problem: random, systematic and cluster sampling. We introduce the script language of the CRExplorer that can be used to draw many samples from the population data set. Based on a large data set of publications from climate change research, we compare RPYS results using population data with RPYS results using different sampling techniques. From our comparison with the full RPYS (population spectrogram), we conclude that the cluster sampling performs worst and the systematic sampling performs best. The random sampling also performs very well but not as well as the systematic sampling. The study therefore demonstrates the fruitfulness of the sampling approach for applying RPYS.
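
The three sampling schemes named in the abstract can be illustrated with a minimal Python sketch. This is not the CRExplorer script language and not the authors' implementation. The record layout (`pub_year`, `cited_years`), the toy population, and the choice of the citing publication year as the cluster key are all assumptions made purely for illustration. The spectrogram here is simply the number of cited references per reference publication year.

```python
import random
from collections import Counter


def spectrogram(publications):
    """Count cited references per reference publication year."""
    counts = Counter()
    for pub in publications:
        counts.update(pub["cited_years"])
    return counts


def shares(counts):
    """Normalise counts to shares so samples of different size are comparable."""
    total = sum(counts.values())
    return {year: c / total for year, c in counts.items()}


def random_sample(publications, n, seed=0):
    """Simple random sampling: n publications drawn without replacement."""
    rng = random.Random(seed)
    return rng.sample(publications, n)


def systematic_sample(publications, n, offset=0):
    """Systematic sampling: every k-th publication, k = population size // n."""
    step = len(publications) // n
    return publications[offset::step][:n]


def cluster_sample(publications, n_clusters, key, seed=0):
    """Cluster sampling: pick whole groups at random and keep all their members.

    The grouping key is an assumption of this sketch, not taken from the paper.
    """
    rng = random.Random(seed)
    clusters = {}
    for pub in publications:
        clusters.setdefault(pub[key], []).append(pub)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [pub for c in chosen for pub in clusters[c]]


# Hypothetical toy population: 20 citing publications, two cited references each.
population = [
    {"id": i, "pub_year": 2000 + i % 5, "cited_years": [1990 + i % 3, 1995]}
    for i in range(20)
]

full = shares(spectrogram(population))
for name, sample in [
    ("random", random_sample(population, 5)),
    ("systematic", systematic_sample(population, 5)),
    ("cluster", cluster_sample(population, 2, key="pub_year")),
]:
    sampled = shares(spectrogram(sample))
    # Largest absolute difference in share across all reference publication years.
    gap = max(abs(full[y] - sampled.get(y, 0.0)) for y in full)
    print(name, len(sample), round(gap, 3))
```

Comparing shares rather than raw counts keeps the sample spectrograms on the same scale as the population spectrogram. The toy data are far too small to say anything about which scheme performs best; the abstract's conclusions rest on the authors' climate change data set.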
