— Summarizing RDF graphs —
RDF is the W3C's preferred format for representing Semantic Web data. In the line of semistructured graph data models, RDF allows describing large, complex, heterogeneous graphs of interconnected resources. Departing from these models, though, RDF comes equipped with an associated ontology language, namely RDFS, which allows enriching the characterization of a dataset with knowledge about its semantics. The presence of RDFS statements, in turn, leads to implicit RDF data, which is conceptually also part of the RDF database.
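To make the notion of implicit data concrete, here is a minimal Python sketch. It is not the speakers' system: it assumes triples are plain tuples of strings, applies only three standard RDFS entailment rules (subclass transitivity, type propagation along rdfs:subClassOf, and typing through rdfs:domain), and uses hypothetical ex: resource names invented for illustration.

```python
# Toy illustration of implicit RDF data; all ex: names are hypothetical.
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"


def saturate(triples):
    """Return the triples plus everything the three rules below entail."""
    graph = set(triples)
    while True:
        new = set()
        for s, p, o in graph:
            for s2, p2, o2 in graph:
                # (a subClassOf b), (b subClassOf c) => (a subClassOf c)
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))
                # (x type a), (a subClassOf b) => (x type b)
                if p == RDF_TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, RDF_TYPE, o2))
                # (x p y), (p domain c) => (x type c)
                if p2 == DOMAIN and p == s2:
                    new.add((s, RDF_TYPE, o2))
        if new <= graph:  # fixpoint reached: nothing new was derived
            return graph
        graph |= new


explicit = {
    ("ex:Novel", SUBCLASS, "ex:Book"),
    ("ex:Book", SUBCLASS, "ex:Publication"),
    ("ex:hasAuthor", DOMAIN, "ex:Book"),
    ("ex:doc1", "ex:hasAuthor", "ex:alice"),
    ("ex:doc2", RDF_TYPE, "ex:Novel"),
}

# Triples that are part of the database only implicitly.
implicit = saturate(explicit) - explicit
for triple in sorted(implicit):
    print(triple)
```

In this toy graph, ex:doc1 is never explicitly typed, yet the domain rule makes it an ex:Book. A summary that ignored such implicit triples would therefore describe only part of the database.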
We are currently working on the problem of summarizing RDF graphs, that is: given an input data graph G, find another graph G' which represents the structural information from G while being many orders of magnitude smaller. Summaries have numerous applications, ranging from a simple first-contact GUI that helps a user get acquainted with a dataset, to a condensed structure that can be used to optimize queries, the way DataGuides were for semistructured OEM data, but with particular care given to RDFS and implicit data. In the talk, we will present a set of interesting requirements which RDF summaries should satisfy, outline a few alternative summary models, and present preliminary evaluation results.
Joint work with Sejla Cebiric and Francois Goasdoue.
Ioana Manolescu is a senior researcher at Inria Saclay, and the lead of the joint team OAK between Inria and Universite de Paris Sud in Orsay, France. She has been a post-doctoral fellow and visiting professor at Politecnico di Milano, and obtained a PhD in 2001 from Universite de Versailles Saint-Quentin and Inria Rocquencourt. Her main research interests are algebraic and storage optimizations for semistructured data, in particular data models for the Semantic Web; novel data models and languages for complex data management; data models and algorithms for fact-checking; and distributed architectures for complex large data.
— Curious and Self-designing Systems: Towards Easy to use Data Systems Tailored for Exploration —
How far away are we from a future where a data management system sits in the critical path of everything we do? Already today we need to go through a data system to do several basic tasks, e.g., to pay at the grocery store, to book a flight, to find out where our friends are, and even to get coffee. Businesses and sciences are increasingly recognizing the value of storing and analyzing vast amounts of data. Beyond the expected growth in the number of data-driven businesses and scientific scenarios over the next few years, in this talk we also envision a future where data becomes readily available and its power can be harnessed by everyone. What both scenarios have in common is a need for new kinds of data systems which are tailored for data exploration, which are easy to use, and which can quickly absorb and adjust to new data and access patterns on the fly. We will discuss this vision and some of our recent efforts towards self-designing systems as well as "curious" systems tailored for automated exploration.
Stratos Idreos is an assistant professor of Computer Science at Harvard University where he leads DASlab, the Data Systems Laboratory@Harvard SEAS. Stratos works on data systems architectures with emphasis on designing systems for big data exploration. For his doctoral work on Database Cracking, Stratos won the 2011 ACM SIGMOD Jim Gray Doctoral Dissertation award and the 2011 ERCIM Cor Baayen award from the European Research Consortium for Informatics and Mathematics. In 2010 he was awarded the IBM zEnterprise System Recognition Award by IBM Research, and in 2011 he won the VLDB Challenges and Visions best paper award. In 2015 he received an NSF CAREER award and was awarded the 2015 IEEE TCDE Early Career Award from the IEEE Technical Committee on Data Engineering.