Database Group Leipzig

within the department of computer science

GRADOOP: Scalable Graph Data Management and Analytics with Hadoop

Duration

Since 2015

Description

Processing highly connected data as graphs is becoming increasingly important in many domains. Prominent examples are social networks such as Facebook and Twitter, information networks like the World Wide Web, and biological networks. The key similarity of these domain-specific datasets is their inherent graph structure, which makes them suitable for analysis with graph algorithms. They share two further characteristics: they are so large that processing them on a single machine is hard or even impossible, and they grow over time, which makes them temporal graphs. To analyze these large-scale temporal datasets, we started developing a framework called “Gradoop” (Graph Analytics on Hadoop®) with the following three main objectives:

  1. developing a temporal graph data model, including operators for defining analytical pipelines,
  2. integrating data from heterogeneous source systems into a single graph, and
  3. distributing and replicating data efficiently to optimize the execution of distributed graph operators.

Our prototype is built on top of the distributed dataflow framework Apache Flink™. The data model has been designed, and the operators have been implemented. A first use case is the BIIIG project for graph analytics in business information networks. In our ongoing work, we will look into different methods of operator tuning depending on the underlying dataflow system.
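
To make the first objective more concrete, the following is a minimal, single-machine sketch of what a temporal graph with composable operators could look like. It is not Gradoop's API and does not use Apache Flink; the names `TemporalEdge`, `TemporalGraph`, `snapshot`, `subgraph`, and `out_degrees`, as well as the sample data, are hypothetical and chosen only for illustration. Each edge carries a validity interval, and operators return new graphs so that they can be chained into a pipeline.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class TemporalEdge:
    """Hypothetical edge type: a labeled edge with a validity interval."""
    source: str
    target: str
    label: str
    valid_from: int  # first time unit at which the edge exists (inclusive)
    valid_to: int    # time unit at which the edge stops existing (exclusive)


@dataclass
class TemporalGraph:
    """Hypothetical graph type: a plain list of temporal edges."""
    edges: List[TemporalEdge]

    def snapshot(self, t: int) -> "TemporalGraph":
        # Keep only the edges whose validity interval contains time t.
        return TemporalGraph([e for e in self.edges if e.valid_from <= t < e.valid_to])

    def subgraph(self, keep: Callable[[TemporalEdge], bool]) -> "TemporalGraph":
        # Keep only the edges accepted by the predicate.
        return TemporalGraph([e for e in self.edges if keep(e)])

    def out_degrees(self) -> Counter:
        # Count outgoing edges per source vertex.
        return Counter(e.source for e in self.edges)


# Made-up sample data: a tiny social network with years as time units.
graph = TemporalGraph([
    TemporalEdge("alice", "bob", "follows", 2015, 2020),
    TemporalEdge("alice", "carol", "follows", 2018, 2022),
    TemporalEdge("bob", "carol", "follows", 2016, 2017),
    TemporalEdge("carol", "alice", "blocks", 2018, 2019),
])

# An analytical pipeline: state of the graph in 2018, restricted to
# "follows" edges, reduced to out-degrees per vertex.
degrees_2018 = (
    graph.snapshot(2018)
         .subgraph(lambda e: e.label == "follows")
         .out_degrees()
)
print(dict(degrees_2018))  # {'alice': 2}
```

Because every operator returns a graph, intermediate results can be reused or extended with further steps. A distributed system would evaluate such a pipeline over partitioned data rather than an in-memory list.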

Students

  • Philip Fritzsche
  • Timo Adameit
  • Lucas Schons

Project members

  • Prof. Dr. Erhard Rahm
  • Martin Junghanns
  • André Petermann
  • Dr. Christopher Rost
  • Kevin Gomez

Awards

  • Best Demo Award, BTW 2017

Source Code

GitHub

Funding / Cooperation

ScaDS.AI

Talks

  • GRADOOP - Scalable Graph Analytics with Apache Flink
  • GRADOOP - Scalable Graph Analytics with Apache Flink
  • Scalable Graph Analytics with GRADOOP and BIIIG
  • Scalable Graph Analytics
  • Skalierbare Graph-basierte Analyse und Business Intelligence
  • (Cypher)-[:ON]->(ApacheFlink)<-[:USING]-(Gradoop)
  • From Shopping Baskets to Structural Patterns
  • Scalable Graph Data Analytics with GRADOOP
  • Gut vernetzt: Skalierbares Graph Mining für Business Intelligence
  • Scalable Graph Analytics with GRADOOP
  • Distributed Graph Analytics with GRADOOP

Publications (31)

Files Cover Description Year
Distributed temporal graph analytics with GRADOOP
Rost, C. ; Gomez, K. ; Täschner, M. ; Fritzsche, P. ; Schons, L. ; Christ, L. ; Adameit, T. ; Junghanns, M. ; Rahm, E.
VLDB Journal 2021 Special Issue Paper
2021 / 5
Exploration and Analysis of Temporal Property Graphs
Rost, C. ; Gomez, K. ; Fritzsche, P. ; Thor, A. ; Rahm, E.
24th International Conference on Extending Database Technology (EDBT)
2021 / 3
Graph Sampling with Distributed In-Memory Dataflow Systems
Gomez, K. ; Täschner, M. ; Rostami, M. ; Rost, C. ; Rahm, E.
Proc. Datenbanksysteme für Business, Technologie und Web (BTW) 2021
2021 / 3
Analyzing Temporal Graphs with Gradoop
Rost, C. ; Thor, A. ; Rahm, E.
Datenbank-Spektrum 19(3)
2019 / 11
Distributed Graph Sampling with In-Memory Dataflow Systems
Gomez, K. ; Täschner, M. ; Rostami, M. ; Rost, C. ; Rahm, E.
Techn. Report, Univ. of Leipzig, arXiv:1910.04493, Oct 2019
2019 / 10
Evolution Analysis of Large Graphs with Gradoop
Rost, C. ; Thor, A. ; Fritzsche, P. ; Gomez, K. ; Rahm, E.
Proc. of Intl. Workshop on Advances in managing and mining large evolving graphs (LEG@ECML-PKDD)
2019 / 9
Graph data transformations in GRADOOP
Kricke, M. ; Peukert, E. ; Rahm, E.
Proc. BTW, March 2019
2019 / 3
Temporal Graph Analysis using Gradoop
Rost, C. ; Thor, A. ; Rahm, E.
Proc. BTW workshops, LNI
2019 / 3
BIGGR: Bringing Gradoop to Applications
Rostami, M. ; Kricke, M. ; Peukert, E. ; Kühne, S. ; Wilke, M. ; Dienst, S. ; Rahm, E.
Datenbank-Spektrum
2019 / 3
On Pattern Mining in Graph Data to Support Decision-Making
Petermann, A.
Dissertation, Univ. Leipzig
2019
