Annotation graph datasets are a natural representation of scientific knowledge. They are common in the life sciences and health sciences, where concepts of interest such as genes, proteins or clinical trials are annotated with controlled vocabulary terms from multiple ontologies. Scientists are interested in analyzing or mining these annotations, in synergy with the literature, to discover patterns.
We present PAnG (Patterns in Annotation Graphs), a tool that combines two complementary techniques: graph summarization and dense subgraph detection. The elements of a graph summary correspond to a pattern. The visualization of the summary is meaningful to scientists and can provide an explanation of the underlying knowledge. Scientists can use PAnG either to explore a dataset or to analyze a dataset of interest and develop hypotheses.
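To make the notion of an annotation graph and of a dense subgraph concrete, the following minimal Python sketch may help. It is not PAnG's algorithm: the concept, ontology, and term names are hypothetical, and edge density (present edges divided by possible concept-term pairs) is assumed here as one simple way to measure how dense a subgraph is.

```python
from itertools import product

# Hypothetical annotations: (concept, ontology, controlled-vocabulary term).
# All identifiers are invented for illustration and come from no real dataset.
ANNOTATIONS = [
    ("gene_A", "ontology_1", "term_x"),
    ("gene_A", "ontology_1", "term_y"),
    ("gene_B", "ontology_1", "term_x"),
    ("gene_B", "ontology_1", "term_y"),
    ("gene_B", "ontology_2", "term_p"),
    ("gene_C", "ontology_2", "term_p"),
]


def build_edges(annotations):
    """Return the annotation graph as a set of (concept, (ontology, term)) edges."""
    return {(concept, (ontology, term)) for concept, ontology, term in annotations}


def bipartite_density(edges, concepts, terms):
    """Fraction of possible concept-term pairs that are actual annotation edges."""
    possible = len(concepts) * len(terms)
    if possible == 0:
        return 0.0
    present = sum((c, t) in edges for c, t in product(concepts, terms))
    return present / possible


EDGES = build_edges(ANNOTATIONS)
ALL_CONCEPTS = {concept for concept, _ in EDGES}
ALL_TERMS = {term for _, term in EDGES}

# Candidate subgraph: two genes that share both ontology_1 terms.
DENSE = bipartite_density(
    EDGES,
    {"gene_A", "gene_B"},
    {("ontology_1", "term_x"), ("ontology_1", "term_y")},
)
# Density of the whole graph, for comparison.
WHOLE = bipartite_density(EDGES, ALL_CONCEPTS, ALL_TERMS)

print(f"candidate subgraph density: {DENSE:.2f}")
print(f"whole graph density: {WHOLE:.2f}")
```

In this toy data the candidate subgraph is fully connected while the whole graph is not, which is the kind of contrast a dense subgraph method looks for. How PAnG actually scores subgraphs and builds summaries is not specified in this text.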