Duration
Description
Processing highly connected data as graphs is becoming increasingly important in many domains. Prominent examples are social networks such as Facebook and Twitter, information networks like the World Wide Web, and biological networks. What these domain-specific datasets have in common is their inherent graph structure, which makes them well suited to analysis with graph algorithms. They share two further properties: they are huge, making it hard or even impossible to process them on a single machine, and they grow over time, which makes them temporal graphs. To analyze these large-scale temporal datasets, we started developing a framework called “Gradoop” (Graph Analytics on Hadoop®) with the following three main objectives:
- developing a temporal graph data model, including operators for defining analytical pipelines,
- integrating data from heterogeneous source systems into a single graph, and
- efficient data distribution/replication to optimize the execution of distributed graph operators.
Our prototype is built on top of the distributed dataflow framework Apache Flink™. The data model has been designed, and the operators have been implemented. A first use case is the BIIIG project for graph analytics in business information networks. In our ongoing work, we will look into different methods of operator tuning depending on the underlying dataflow system.
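To make the idea of a temporal graph with composable operators more concrete, here is a minimal single-machine sketch in plain Python. It is not Gradoop's API or data model: the names `Edge`, `TemporalGraph`, `snapshot`, and `out_degrees`, the integer timestamps, and the sample data are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List

FOREVER = float("inf")  # marks an edge that is still valid


@dataclass(frozen=True)
class Edge:
    """Hypothetical edge with a validity interval [valid_from, valid_to)."""
    source: str
    target: str
    label: str
    valid_from: int
    valid_to: float = FOREVER


@dataclass
class TemporalGraph:
    """Hypothetical temporal graph: a list of time-annotated edges."""
    edges: List[Edge] = field(default_factory=list)

    def snapshot(self, t: int) -> "TemporalGraph":
        # Operator 1: keep only the edges that are valid at time t.
        return TemporalGraph(
            [e for e in self.edges if e.valid_from <= t < e.valid_to]
        )

    def out_degrees(self) -> Dict[str, int]:
        # Operator 2: count outgoing edges per source vertex.
        degrees: Dict[str, int] = {}
        for e in self.edges:
            degrees[e.source] = degrees.get(e.source, 0) + 1
        return degrees


# Illustrative data only.
graph = TemporalGraph([
    Edge("alice", "bob", "follows", valid_from=1),
    Edge("alice", "carol", "follows", valid_from=3, valid_to=5),
    Edge("bob", "carol", "follows", valid_from=4),
])

# A two-step "pipeline": take a snapshot, then aggregate on it.
degrees_at_2 = graph.snapshot(2).out_degrees()
degrees_at_4 = graph.snapshot(4).out_degrees()
```

Because each operator returns a new graph or a plain result, operators can be chained into a pipeline. A distributed system would run the same logical steps over partitioned data rather than an in-memory list.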
Students
- Philip Fritzsche
- Timo Adameit
- Lucas Schons
Awards
Source Code
Talks
- GRADOOP - Scalable Graph Analytics with Apache Flink
- GRADOOP - Scalable Graph Analytics with Apache Flink
- Scalable Graph Analytics with GRADOOP and BIIIG
- Scalable Graph Analytics
- Skalierbare Graph-basierte Analyse und Business Intelligence
- (Cypher)-[:ON]->(ApacheFlink)<-[:USING]-(Gradoop)
- From Shopping Baskets to Structural Patterns
- Scalable Graph Data Analytics with GRADOOP
- Gut vernetzt: Skalierbares Graph Mining für Business Intelligence
- Scalable Graph Analytics with GRADOOP
- Distributed Graph Analytics with GRADOOP
Publications (31)
Files | Cover | Description | Year |
---|---|---|---|
| | | Junghanns, M.; Petermann, A.; Rahm, E. Proc. Datenbanksysteme für Business, Technologie und Web (BTW) 2017 | 2017 / 3 |
| | | Kemper, S.; Petermann, A.; Junghanns, M. Proc. Datenbanksysteme für Business, Technologie und Web (BTW) 2017 (Workshops) | 2017 / 3 |
| | | Petermann, A.; Junghanns, M.; Kemper, S.; Gomez, K.; Teichmann, N.; Rahm, E. Proc. ICDM 2016 (Demo paper) | 2016 / 12 |
| | | | 2016 / 10 |
| | | Petermann, A.; Junghanns, M. it - Information Technology, Special Issue: Big Data Analytics, Vol. 58 (4), 2016, pp. 166–175 | 2016 / 8 |
| | | Junghanns, M.; Petermann, A.; Teichmann, N.; Gomez, K.; Rahm, E. Proc. Int. SIGMOD workshop on Network Data Analytics (NDA) | 2016 / 7 |
| | | Junghanns, M.; Petermann, A.; Gomez, K.; Rahm, E. Techn. Report, Univ. of Leipzig, arXiv:1506.00548, June 2015 | 2015 / 6 |
| | | Rahm, E. Proc. GI-Workshop Grundlagen von Datenbanksystemen (GvDB), Gommern, May 2015 (Invited Talk) | 2015 / 5 |
| | | | 2014 / 9 |
| | | Petermann, A.; Junghanns, M.; Müller, R.; Rahm, E. 5th Workshop on Big Data Benchmarking (WBDB 2014), LNCS 8991, 2015 | 2014 / 8 |