Skip to main content

User account menu

  • Log in
DBS-Logo

Database Group Leipzig

within the department of computer science

ScaDS-Logo Logo of the University of Leipzig

Main navigation

  • Home
  • Study
    • Exams
      • Hinweise zu Klausuren
    • Courses
      • Current
    • Modules
    • LOTS-Training
    • Abschlussarbeiten
    • Masterstudiengang Data Science
    • Oberseminare
    • Problemseminare
    • Top-Studierende
  • Research
    • Projects
      • Benchmark datasets for entity resolution
      • FAMER
      • HyGraph
      • Privacy-Preserving Record Linkage
      • GRADOOP
    • Publications
    • Prototypes
    • Annual reports
    • Cooperations
    • Graduations
    • Colloquia
    • Conferences
  • Team
    • Erhard Rahm
    • Member
    • Former employees
    • Associated members
    • Gallery

Analyzing Extended Property Graphs with Apache Flink

Breadcrumb

  • Home
  • Research
  • Publications
  • Analyzing Extended Property Graphs with Apache Flink

Junghanns, M. ; Petermann, A. ; Teichmann, N. ; Gomez, K. ; Rahm, E.

Analyzing Extended Property Graphs with Apache Flink

Proc. Int. SIGMOD workshop on Network Data Analytics (NDA)

2016 / 07

Paper

Abstract

Graphs are an intuitive way to model complex relationships between real world data objects. Thus, graph analytics plays an important role in research and industry. As graphs are often heterogeneous in terms of reflected domain data, their representation requires an expressive data model including the abstraction of graph collections, for example, to analyze communities inside a social network. Further on, answering complex analytical questions about such graphs entails combining multiple analytical operations. To satisfy these requirements, we developed the Extended Property Graph Model. Our model is semantically rich, schema-free and supports multiple distinct graphs. Based on this representation, it provides declarative and combinable operators to analyze both single graphs and graph collections. Our current implementation is based on the distributed dataflow framework Apache Flink. We present the results of a first experimental study showing the scalability of our implementation on social network data with up to 11 billion edges.

Recent publications

  • 2025 / 9: Generating Semantically Enriched Mobility Data from Travel Diaries
  • 2025 / 8: Slice it up: Unmasking User Identities in Smartwatch Health Data
  • 2025 / 6: SecUREmatch: Integrating Clerical Review in Privacy-Preserving Record Linkage
  • 2025 / 6: Leveraging foundation models and goal-dependent annotations for automated cell confluence assessment
  • 2025 / 5: Federated Learning With Individualized Privacy Through Client Sampling

Footer menu

  • Directions
  • Contact
  • Impressum