Exploring Linked Open Data with FactForge

The Web is full of data

The Web is full of data. Lots of data. Data of all kinds and flavours, huge amounts of which exist freely and openly, waiting to be explored and made sense of. And we are full of expectations that this data abundance will forge data-driven research, intelligent search, automated content processing and so much more.

The bad news is that a great part of these data are not connected. They sit in separate collections of datasets, unrelated to each other. Such data are not easy to browse, let alone navigate through questions, which ultimately is what our expectations are all about.

Without the proper tools for integrating these data, our attempts to make sense of them resemble the student in the statue Fons Sapientiae (Source of Wisdom) in Leuven, Belgium. He pours “wisdom” into his head while reading about the formula of happiness from a book in front of him. Just like him, we know there are exciting opportunities that data flows on the Web can pour into our business, but we are only aware of them theoretically; we have never actually taken advantage of them.

The good news is that there is a growing number of interrelated datasets on the Web – the so-called Linked Open Data (LOD) – which can allow us to tap into the mind-boggling potential the Web opens before us and to experience the benefits of the data abundance on the Web. Linked Open Data is the interconnected portion of data on the Web that can make a big difference – the starry data on the web of data.

Linked Open Data

As Tom Heath put it so clearly in Linked Data? Web of Data? Semantic Web? WTF?:

When people started weaving individual bits of data together with RDF triples (that expressed the relationship between these bits of data), we saw the emergence of a Web of data. Linked Data is no more complex than this – connecting related data across the Web using URIs, and RDF. Of course, there are many ways to have linked data, but in common usage Linked Data refers to the principles set out by Tim Berners-Lee in 2006.

This increasingly interconnected portion of data on the Web is our way out of the data confusion and into the data abundance. And it is being able to browse these data that will turn our expectations into real experiences. To get back to the Fons Sapientiae train of thought, we will not only be reading about the formula of happiness, we will be living it.
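To make the RDF triples from the quote above a little more concrete, here is a minimal Python sketch. It models each triple as a (subject, predicate, object) tuple of URIs and follows one link to the next. The three statements are illustrative examples written in the style of DBpedia identifiers, and the helper name objects_of is hypothetical; nothing here is taken from FactForge itself.

```python
# A triple is (subject, predicate, object); every part here is a URI.
# The statements below are illustrative, not an extract from any dataset.
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"

triples = [
    (DBR + "Leuven", DBO + "country", DBR + "Belgium"),
    (DBR + "Belgium", DBO + "capital", DBR + "Brussels"),
    (DBR + "Tim_Berners-Lee", DBO + "knownFor", DBR + "World_Wide_Web"),
]


def objects_of(subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# The object of one triple is the subject of another, so links can be chained.
country = objects_of(DBR + "Leuven", DBO + "country")[0]
capital = objects_of(country, DBO + "capital")[0]
print(capital)
```

Because both facts use the same URI for Belgium, the second lookup can start exactly where the first one ended. That shared identifier is what turns separate statements into connected data.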

FactForge – Open Data and News about People, Organizations and Locations

Meet FactForge: A Convenient Entry Point to the World of Linked Data

FactForge is an open-access platform that you can use as a convenient entry point to the web of interconnected data. It is Ontotext’s web application, based on GraphDB, that allows everyone to explore some of the most central Linked Open Data datasets alongside, and in connection with, news and newly published data such as the Panama Papers and Trump World Data (coming soon).

Formally, FactForge is a knowledge graph compiled from open data and news metadata. Offering access to more than a billion facts, FactForge includes datasets from DBpedia (the structured version of the Wikipedia encyclopedia), GeoNames (a worldwide geographical database containing over 10 million geographical names), WordNet (a semantic dictionary for English where words “are grouped into sets of cognitive synonyms”), WorldFacts (a dataset about countries, languages, currencies and other related information) and more. You can check the entire list of datasets in FactForge – Open Data and News about People, Organizations and Locations.

What makes FactForge different from other LOD services is that it is constantly updated with a live stream of news metadata, with the articles linked to entities and concepts tagged by Ontotext’s Publishing platform. To add some technical detail, the platform acts as a semantic search engine that demonstrates what GraphDB, Ontotext’s Smart Data maker, can do with data. FactForge uses the Financial Industry Business Ontology (FIBO) as an upper-level ontology and works with data available as an RDF graph stored in GraphDB.

 

Repository overview

Engine: GraphDB
Site: http://ontotext.com/factforge
Inference Ruleset: factforge.pie
Number of Statements: 1,552,635,261
Number of Explicit Statements: 1,239,333,784
Number of Implicit Statements: 313,301,477
Number of Entities: 332,509,188
Number of Bnodes: 22,670,702
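As a quick sanity check on these figures, the short Python sketch below confirms that the explicit and implicit statements add up to the total, and works out how much of the repository is inferred. The variable names are only illustrative; the numbers are copied from the overview.

```python
# Figures copied from the repository overview above.
total_statements = 1_552_635_261
explicit_statements = 1_239_333_784
implicit_statements = 313_301_477

# Explicit and implicit statements together account for every statement.
assert explicit_statements + implicit_statements == total_statements

implicit_share = implicit_statements / total_statements
print(f"Inferred share of all statements: {implicit_share:.1%}")
```

In other words, roughly one statement in five is not loaded from a dataset but derived by inference.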

 

Things to Do With FactForge

A free, publicly available service, FactForge can serve you as a useful index and entry point to the LOD cloud. With it you can search entities by name to find resources and facts based on the semantics of the data.

Just as web search engines index web pages and make them easier to use, FactForge makes it easier to access data represented as an RDF graph. It features some sample queries that demonstrate its unique capabilities for media monitoring of related entities as well as analysis of industry trends and company control patterns.

Being a huge collection of general purpose data, FactForge allows you to tap into the power of data integration and:

  • mine and navigate large scale general knowledge datasets, constructing your own queries;
  • find relevant answers to complex questions (for example: which European party families are most often mentioned alongside banks?);
  • retrieve results in a conveniently formatted table with the option to download results in various formats (SPARQL/XML, JSON, etc.).
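For readers who would rather fetch results from a script than from the web page, the result format is normally chosen through the HTTP Accept header of a SPARQL request. The Python sketch below only builds such a request and does not send it. The endpoint URL is a placeholder assumption, not FactForge's actual address, and the names ENDPOINT, RESULT_FORMATS and build_request are hypothetical; the media types are the standard ones for SPARQL query results.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint: an assumption for illustration, not FactForge's address.
ENDPOINT = "http://example.org/sparql"

# Standard media types for SPARQL SELECT results.
RESULT_FORMATS = {
    "json": "application/sparql-results+json",
    "xml": "application/sparql-results+xml",
    "csv": "text/csv",
}


def build_request(query, fmt="json"):
    """Build (but do not send) a GET request asking for `fmt` results."""
    url = ENDPOINT + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": RESULT_FORMATS[fmt]})


req = build_request("SELECT * WHERE { ?s ?p ?o } LIMIT 5", fmt="csv")
print(req.get_header("Accept"))
print(req.full_url)
```

Passing the request to urllib.request.urlopen would send it. Against a real endpoint, the same query could then be retrieved as JSON, XML or CSV simply by changing fmt.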

The platform features sample SPARQL queries, which you can easily modify to fit the things you are interested in. For instance, there is a query “People and organizations related to Google”, which you can use to find people and organizations related to any other company. You just replace Google by typing the beginning of the new name, e.g. dbr:Hew, and then hit Ctrl-Space to make use of the auto-complete. You can also choose whether you want to see only the explicit statements in the results for your query or also the implicit facts, inferred when interpreting the ontologies and the datasets with respect to the semantics of the data.
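The idea of swapping one entity for another can also be sketched in Python. The query below is a simplified stand-in written for illustration; it is not the actual “People and organizations related to Google” sample query, and the names QUERY_TEMPLATE and related_to are hypothetical. It simply lists resources linked to or from a given DBpedia entity.

```python
import re

# Simplified stand-in query: NOT the actual FactForge sample query.
# Doubled braces are literal braces; {name} is the slot for the entity.
QUERY_TEMPLATE = """\
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT DISTINCT ?related ?link WHERE {{
  {{ dbr:{name} ?link ?related }}
  UNION
  {{ ?related ?link dbr:{name} }}
  FILTER(isIRI(?related))
}}
LIMIT 100
"""


def related_to(name):
    """Return the stand-in query with `name` as the DBpedia entity."""
    # Accept only simple local names so the value cannot alter the query.
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        raise ValueError(f"unsupported entity name: {name!r}")
    return QUERY_TEMPLATE.format(name=name)


print(related_to("IBM"))
```

Calling related_to("Google") and related_to("IBM") produces two queries that differ only in the entity, which is the same edit you would make by hand in the query editor.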

Factforge SPARQL

Let us know in the comments if you have any questions you need answered and we will turn them into SPARQL queries.

By providing access to large amounts of open data and integrating data from different datasets, FactForge creates a user-friendly entry point into the LOD cloud, enabling users to see more connections and broaden their understanding of certain topics, people, organizations and locations. And it is in the connectivity and data integration that the real opportunities for a better experience across the Web of Linked Data lie.

Learn more about FactForge directly from Atanas Kiryakov speaking at his webinar FactForge Debuts: Trump World Data and Instant Ranking of Industry Leaders.

Teodora Petkova

Teodora is a philologist fascinated by the metamorphoses of text on the Web. Curious about our networked lives, she explores how the Semantic Web vision unfolds, transforming the possibilities of the written word.
