Junghanns, M.; Petermann, A.; Teichmann, N.; Gomez, K.; Rahm, E.

Analyzing Extended Property Graphs with Apache Flink

Proc. Int. SIGMOD workshop on Network Data Analytics (NDA)

2016 / 07

Abstract

Graphs are an intuitive way to model complex relationships between real-world data objects. Thus, graph analytics plays an important role in research and industry. As graphs are often heterogeneous in terms of reflected domain data, their representation requires an expressive data model including the abstraction of graph collections, for example, to analyze communities inside a social network. Furthermore, answering complex analytical questions about such graphs entails combining multiple analytical operations. To satisfy these requirements, we developed the Extended Property Graph Model. Our model is semantically rich, schema-free and supports multiple distinct graphs. Based on this representation, it provides declarative and combinable operators to analyze both single graphs and graph collections. Our current implementation is based on the distributed dataflow framework Apache Flink. We present the results of a first experimental study showing the scalability of our implementation on social network data with up to 11 billion edges.