Dataset Characteristics Identification for Federated SPARQL Query

Nur Aini Rakhmawati, Lutfi Nur Fadzilah


Nowadays, the amount of data published in RDF format is increasing. Federated SPARQL query engines, which can query multiple distributed SPARQL endpoints, have been developed recently. The performance of one federated query engine usually differs from that of another. One factor that affects query engine performance is the set of characteristics of the accessed RDF dataset, such as the number of triples, classes, properties, subjects, entities, and objects, and the spreading factor of the dataset. The aim of this work is to identify the characteristics of RDF datasets and to create a query set for evaluating a federated engine. The study was conducted by identifying 16 datasets that were used by ten research papers in the Linked Data area.
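
As an illustration of the simpler characteristics listed above, the following minimal Python sketch counts them for a small in-memory dataset. It is not the authors' implementation. It assumes triples are given as (subject, predicate, object) string tuples, that "properties" means distinct predicates, and that "classes" means distinct objects of rdf:type statements. The function name dataset_characteristics and the ex: identifiers are hypothetical. The number of entities and the spreading factor are left out because their definitions are not given in this abstract.

```python
# Assumption: triples are plain (subject, predicate, object) string tuples.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def dataset_characteristics(triples):
    """Count basic characteristics of an RDF dataset (illustrative only)."""
    distinct_triples = set(triples)  # an RDF graph is a set, so drop duplicates
    subjects, properties, objects, classes = set(), set(), set(), set()
    for s, p, o in distinct_triples:
        subjects.add(s)
        properties.add(p)
        objects.add(o)
        if p == RDF_TYPE:
            # Assumption: a class is anything used as the object of rdf:type.
            classes.add(o)
    return {
        "triples": len(distinct_triples),
        "subjects": len(subjects),
        "properties": len(properties),
        "objects": len(objects),
        "classes": len(classes),
    }


# Hypothetical example data, not taken from any of the 16 datasets.
example = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:name", '"Alice"'),
]
stats = dataset_characteristics(example)
```

For a dataset behind a SPARQL endpoint, the same counts would more likely be obtained with aggregate queries than by loading every triple into memory.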


Federated SPARQL query, dataset, benchmark

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.