Dataset Characteristics Identification for Federated SPARQL Query

Nur Aini Rakhmawati; Lutfi Nur Fadzilah

doi:10.15294/sji.v6i1.17258

Nur Aini Rakhmawati⁽¹⁾, Lutfi Nur Fadzilah⁽²⁾

(1) Institut Teknologi Sepuluh Nopember Surabaya (ITS)
(2) Institut Teknologi Sepuluh Nopember Surabaya (ITS)

Abstract

Nowadays, the amount of data published in the RDF format is increasing. Federated SPARQL query engines, which can query multiple distributed SPARQL endpoints, have been developed recently. These engines usually differ from one another in performance. One factor that affects the performance of a query engine is the characteristics of the accessed RDF dataset, such as the number of triples, the number of classes, the number of properties, the number of subjects, the number of entities, the number of objects, and the spreading factor of the dataset. The aim of this work is to identify the characteristics of RDF datasets and to create a query set for evaluating a federated engine. The study was conducted by identifying 16 datasets that are used by ten research papers in the Linked Data area.
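
To make the listed characteristics concrete, the following minimal Python sketch counts most of them over a small in-memory set of triples. It is an illustration only, not the procedure used in the paper: the function name `dataset_characteristics`, the `http://example.org/` sample data, and the reading of "entities" as distinct subjects that carry an `rdf:type` are assumptions made here. The spreading factor is left out because the abstract does not define it.

```python
# Illustrative sketch only; the names and sample data below are hypothetical.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def dataset_characteristics(triples):
    """Count basic characteristics of an iterable of (subject, predicate, object) tuples."""
    triples = set(triples)  # an RDF graph is a set, so duplicate triples count once
    # Pairs (subject, class) taken from rdf:type statements.
    typed = [(s, o) for s, p, o in triples if p == RDF_TYPE]
    return {
        "triples": len(triples),
        "classes": len({o for _, o in typed}),
        "properties": len({p for _, p, _ in triples}),
        "subjects": len({s for s, _, _ in triples}),
        # Assumption: an entity is a distinct subject that has an rdf:type.
        "entities": len({s for s, _ in typed}),
        "objects": len({o for _, _, o in triples}),
    }


EX = "http://example.org/"  # hypothetical namespace
sample = [
    (EX + "alice", RDF_TYPE, EX + "Person"),
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "bob", RDF_TYPE, EX + "Person"),
    (EX + "bob", EX + "name", "Bob"),
    (EX + "paper1", EX + "author", EX + "alice"),
]

stats = dataset_characteristics(sample)
for name, value in stats.items():
    print(f"{name}: {value}")
```

For a real dataset the same counts would more likely be obtained by sending aggregate queries to the SPARQL endpoint than by loading every triple into memory; the in-memory version is used here only to keep the example self-contained.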

Keywords

Federated SPARQL query, dataset, benchmark

Scientific Journal of Informatics (SJI)
p-ISSN 2407-7658 | e-ISSN 2460-0040
Published By Department of Computer Science Universitas Negeri Semarang
Website: https://journal.unnes.ac.id/nju/index.php/sji

This work is licensed under a Creative Commons Attribution 4.0 International License.