Our theory is falling way behind our technology!
The good news: a distributed knowledge base that describes hundreds of millions of items through tens of billions of relations between them, classifying them into hundreds of thousands of different classes, hosted on a web of thousands of different servers across the world, with fully distributed access and open to contributions from anybody. A knowledge base of this scale and such broad coverage would have been unthinkable 15 years ago, but it has now become reality under a variety of names such as the Semantic Web, the Linked Open Data cloud, or the Web of Data.
The bad news: despite this success, we actually understand very little of the structure of the Web of Data. Its formal meaning is specified in logic, but with its scale, context dependency and dynamics, the Web of Data has outgrown its traditional model-theoretic semantics. Is the meaning of a logical statement (an edge in the graph) dependent on the cluster (“context”) in which it appears? Does a more densely connected concept (node) contain more information? Is the path length between two nodes related to their semantic distance? Properties such as clustering, connectivity and path length are not described, much less explained, by model-theoretic semantics. Do such properties contribute to the meaning of a knowledge graph?
To properly understand the structure and meaning of knowledge graphs, we should no longer treat them as (only) sets of logical statements, but also properly as graphs. How to do this is far from clear. In this talk, we’ll report on our early results on some of these questions, but we’ll ask many more questions for which we don’t have answers yet.
Network Institute
VU University Amsterdam
Frank van Harmelen
Frank van Harmelen is professor in Knowledge Representation and Reasoning at the VU University Amsterdam. He has been involved in the Semantic Web research programme since its inception in the late ’90s. He is one of the co-designers of the W3C ontology representation language OWL, and was involved in the design of Sesame, one of the most widely used RDF repositories worldwide. He is co-author of the Semantic Web Primer, the first textbook on Semantic Web technologies, now translated into five languages. He was scientific director of the Large Knowledge Collider (LarKC), which aimed to build a platform for very large scale distributed reasoning. Besides research into fundamental questions such as inconsistency, scalability, heterogeneity, and dynamicity, he is also involved in a wide variety of applications of semantic technologies, among others in medicine, the pharmaceutical industry, scientific publishing and e-science. His work on the Sesame triplestore received the 2012 “ISWC 10 year impact award”, and he was elected a member of the European Academy of Science in 2014.