Studying Linked Data Accessibility Healthiness for the Long Tail of the Data Web
9th Workshop on Managing the Evolution and Preservation of the Data Web
Further information: https://ceur-ws.org/Vol-3565/MEPDaW2023-paper2.pdf
In this paper, we explore the accessibility healthiness of Linked Data within the context of the Data Web, focusing on the long tail of data sources. Unlike the traditional web, Linked Data lacks a driving infrastructure to enhance accessibility, which negatively affects data consumers, adoption, and the creation of large-scale infrastructures. We investigate issues such as link rot, unparseable content, downtime, and timeouts that hinder effective access to Linked Data. The study uses a novel Linked Data client that logs debugging information, giving insight into how efficiently and effectively Linked Data can be accessed. We also discuss our methods and approach, the handling of IRI identity mismatches, crawling results, and Linked Data parsing statistics. Through extensive analysis of HTTP response status codes and accessibility issues, the paper not only quantifies common problems but also proposes methods for enhancing Linked Data accessibility in order to retrieve consistent sub-graphs from the Data Web.
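As an illustration of the failure categories named in the abstract (link rot, unparseable content, downtime, timeouts), the following minimal Python sketch shows how logged access attempts might be grouped and counted. The function name `classify_access`, the record fields, the mapping from status codes to categories, and the sample records are all assumptions made for illustration; they are not taken from the paper or its client.

```python
from collections import Counter
from typing import Optional


def classify_access(status: Optional[int],
                    error: Optional[str] = None,
                    parsed: bool = False) -> str:
    """Map one dereferencing attempt to a coarse accessibility category.

    Hypothetical mapping, chosen for illustration only:
    - a client-side timeout is reported through error == "timeout"
    - status is None when no HTTP response was received at all
    """
    if error == "timeout":
        return "timeout"
    if status is None:
        return "downtime"      # e.g. DNS failure or connection refused
    if status in (404, 410):
        return "link_rot"      # resource is gone or was never there
    if status >= 500:
        return "downtime"      # server-side failure
    if 200 <= status < 300:
        # A successful response is only useful if the payload parses as RDF.
        return "ok" if parsed else "unparseable"
    return "other"             # remaining 1xx/3xx/4xx cases


# Hypothetical log records; the IRIs use the reserved example.org domain.
log = [
    {"iri": "http://example.org/a", "status": 200, "error": None, "parsed": True},
    {"iri": "http://example.org/b", "status": 404, "error": None, "parsed": False},
    {"iri": "http://example.org/c", "status": 200, "error": None, "parsed": False},
    {"iri": "http://example.org/d", "status": None, "error": "timeout", "parsed": False},
    {"iri": "http://example.org/e", "status": 503, "error": None, "parsed": False},
    {"iri": "http://example.org/f", "status": None, "error": "connection_refused", "parsed": False},
]

# Count how many attempts fall into each category.
summary = Counter(
    classify_access(r["status"], r["error"], r["parsed"]) for r in log
)

for category, count in sorted(summary.items()):
    print(f"{category}: {count}")
```

Keeping the classification as a pure function over logged records, separate from any network code, means the same log could be re-analysed under a different category mapping without crawling again.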