Scalable privacy-preserving linking of multiple databases using Counting Bloom filters

Vatsalan, D. ; Christen, P. ; Rahm, E.

Proc ICDM workshop on Privacy and Discrimination in Data Mining (PDDM)

2016 / 12

Other

Abstract

The integration, mining, and analysis of person-specific data can provide enormous opportunities for organizations, governments, and researchers to leverage today’s massive data collections. However, the use of personal or otherwise sensitive data also raises concerns about the privacy, confidentiality, and potential discrimination of people. Privacy-preserving record linkage (PPRL) is a growing research area that aims at integrating sensitive information from multiple disparate databases held by different organizations while preserving the privacy of the individuals in these databases by not revealing their identities and thereby preventing discrimination. PPRL approaches are increasingly required in many real-world application areas ranging from healthcare to national security. Previous approaches to PPRL have mostly focused on linking only two databases. Scaling PPRL to several databases is an open challenge since privacy threats as well as the computation and communication costs increase significantly with the number of databases involved. We thus propose a new encoding method of sensitive data based on Counting Bloom Filters (CBF) to improve privacy for multi-party PPRL (MP-PPRL). We investigate optimizations to reduce computation and communication costs for CBF-based MP-PPRL. Our empirical evaluation with real datasets demonstrates the viability of our approach in terms of scalability, linkage quality, and privacy.
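
The abstract does not describe the encoding in detail, so the following is only a minimal sketch of the general idea, not the authors' protocol. It assumes each party encodes a value's character bigrams into a Bloom filter, that a counting Bloom filter is the position-wise sum of the parties' filters, and that similarity is a multi-party Dice coefficient read off the counts. The function names, the filter length of 64, the four SHA-256-based hash functions, and the Dice formulation are all illustrative assumptions. The sketch also sums the filters in the clear, whereas a real protocol would need a secure summation step so that no party sees another party's filter.

```python
import hashlib


def qgrams(value, q=2):
    """Return the set of character q-grams of a string (assumed tokenisation)."""
    return {value[i:i + q] for i in range(len(value) - q + 1)}


def bloom_filter(value, length=64, num_hashes=4):
    """Encode a string as a bit list; length and num_hashes are illustrative."""
    bits = [0] * length
    for gram in qgrams(value.lower()):
        for seed in range(num_hashes):
            # Hypothetical hash family: SHA-256 over "seed:gram".
            digest = hashlib.sha256(f"{seed}:{gram}".encode()).hexdigest()
            bits[int(digest, 16) % length] = 1
    return bits


def counting_bloom_filter(filters):
    """Position-wise sum of equally long Bloom filters, one per party."""
    return [sum(column) for column in zip(*filters)]


def dice_from_cbf(cbf, num_parties):
    """Multi-party Dice coefficient computed from the counts alone.

    Assumed formulation: positions whose count equals num_parties are set in
    every party's filter, and the sum of all counts is the total number of
    set bits across the parties.
    """
    total = sum(cbf)
    if total == 0:
        return 0.0
    common = sum(1 for count in cbf if count == num_parties)
    return num_parties * common / total


if __name__ == "__main__":
    parties = [bloom_filter(name) for name in ("peter", "peter", "peter")]
    cbf = counting_bloom_filter(parties)
    print(dice_from_cbf(cbf, num_parties=3))  # identical values give 1.0
```

Under these assumptions, the similarity can be computed from the counts without inspecting any individual party's bit pattern, which is the intuition behind using counts rather than raw Bloom filters for the comparison.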