How the Self-Service Semantic Suite Is Helping a Linked Data MOOC

An “Introduction to Linked Data & the Semantic Web” MOOC

On April 11th, the University of Southampton started a new MOOC called “Introduction to Linked Data and the Semantic Web”. The MOOC is free to attend, and certificates will be available upon successful completion of the course.

The course covers important topics, such as:

  • Understanding what Linked Data is,
  • Understanding the importance of Linked Data and its application in various domains and use cases,
  • Getting practical experience with Linked Data, through step-by-step guides and examples.

The main tutor for the MOOC is Dr. Elena Simperl, a researcher & lecturer at the University of Southampton, UK, working in the area of Linked Data, Open Data, Semantic Web and Data Science.

Learning SPARQL with the MusicBrainz Dataset

As part of the Linked Data and Semantic Web MOOC, students will have a chance to learn SPARQL, the query language for Linked Data and semantic graph databases. Several practical examples are based upon the MusicBrainz dataset, the “open music encyclopedia”, providing information about a large number of music artists and albums.

Students will be able to experiment with queries such as “Titles & durations of Beatles tracks”.
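
For readers who prefer to script such a query rather than use the web UI, here is a minimal sketch of the “Titles & durations of Beatles tracks” example. It rests on several assumptions: the endpoint URL is a placeholder, and the `mo:`, `foaf:` and `dc:` terms follow the Music Ontology style often used for MusicBrainz Linked Data, so the vocabulary in the course dataset may differ. The helper names (`beatles_tracks_query`, `parse_bindings`, `run_query`) are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Placeholder endpoint (an assumption); use the SPARQL endpoint given in the course.
ENDPOINT = "http://example.org/repositories/musicbrainz"


def beatles_tracks_query(limit=10):
    """Build a SPARQL query for titles and durations of Beatles tracks.

    The class and property names are assumed, not taken from the course material.
    """
    return f"""
PREFIX mo: <http://purl.org/ontology/mo/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>

SELECT ?title ?duration WHERE {{
  ?artist a mo:MusicArtist ;
          foaf:name "The Beatles" ;
          foaf:made ?track .
  ?track a mo:Track ;
         dc:title ?title ;
         mo:duration ?duration .
}}
ORDER BY ?title
LIMIT {int(limit)}
"""


def parse_bindings(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    names = payload["head"]["vars"]
    return [
        {name: binding[name]["value"] for name in names if name in binding}
        for binding in payload["results"]["bindings"]
    ]


def run_query(query, endpoint=ENDPOINT):
    """POST the query as a form-encoded 'query' parameter and parse the JSON reply."""
    data = urlencode({"query": query}).encode("utf-8")
    request = Request(
        endpoint,
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request) as response:
        return parse_bindings(json.load(response))
```

Against a live endpoint, `run_query(beatles_tracks_query())` would return one dictionary per track with `title` and `duration` keys. Changing `ORDER BY ?title` to `ORDER BY ?duration` gives the variant sorted by duration.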

[Figures: “Beatles Tracks by Title” query; “Beatles Tracks by Duration” query]

To make it easy for students to get started with the SPARQL queries for MusicBrainz, we have configured a version of the GraphDB Workbench as a web app, so that pre-defined query templates are readily available for students to experiment with, and simple chart visualisations can be instantly created over the query results.

[Figures: pre-defined query templates; chart visualization based on a query]

The Role of the Self-Service Semantic Suite (S4)

The Self-Service Semantic Suite (S4) platform by Ontotext provides developers with instant access to services for text analytics, knowledge graphs and semantic graph database-as-a-service in the Cloud. The main capabilities of the S4 platform include:

  • Text analytics of unstructured content so that entities and the relations between them can be extracted from news, biomedical content and tweets,
  • Semantic graph database-as-a-service so that developers can add, interlink and query semantic facts from open knowledge graphs or extracted from text content via the text analytics services. The semantic graph DBaaS is based on the Ontotext GraphDB triplestore (an RDF database), and databases are instantly available to developers. The S4 platform takes care of DBA aspects such as configuration, operation, maintenance and scaling,
  • Access to large open knowledge graphs that enhance the semantic analysis process. The knowledge graph integrates different open knowledge graphs such as DBpedia, Wikidata & GeoNames.

The “Introduction to Linked Data and the Semantic Web” MOOC has specific requirements related to the SPARQL samples and exercises:

  • Access to a relatively large knowledge graph (almost 500 million explicit + inferred triples in the MusicBrainz dataset),
  • A mix of simple and complex queries,
  • Concurrent access by a potentially large number of students (hundreds at the same time),
  • High availability of the database storing the samples & exercises data.

The Self-Service Semantic Suite (S4) by Ontotext was a natural fit for the requirements of the MOOC because:

  • It provides fully managed and hosted semantic graph database-as-a-service instances (RDF triplestores), so that the tutors from the University of Southampton did not need to set up and maintain the databases and the underlying infrastructure on their own.
  • The DBaaS is designed for high availability, so that it can be instantly re-deployed and running on a new node in the Cloud if the original server experiences problems, without interrupting the work of the users and applications connected to the database.
  • The dataset is replicated on multiple DBaaS instances, so that the query load of a large number of users is load-balanced for improved throughput and response time.
  • New hardware resources can be automatically and instantly provisioned in order to adapt to a growing use of the system: new database nodes can be added, so that queries can be load-balanced among more database servers; new nodes hosting the web app with the SPARQL query UI can be instantly provisioned and activated in order to accommodate more concurrent users.
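
The replication and load-balancing idea in the list above can be illustrated with a small round-robin sketch. This is only an illustration of the general technique, not a description of how S4 is implemented: the `ReplicaPool` class, its method names and the endpoint labels are all hypothetical.

```python
class ReplicaPool:
    """Round-robin over replica endpoints, skipping any marked as down.

    Illustrative only; the actual S4 load-balancing mechanism is not described here.
    """

    def __init__(self, endpoints):
        self.endpoints = list(endpoints)
        self.healthy = set(self.endpoints)
        self._next = 0  # index of the next replica to try

    def add(self, endpoint):
        """Register a newly provisioned replica so it starts receiving queries."""
        self.endpoints.append(endpoint)
        self.healthy.add(endpoint)

    def mark_down(self, endpoint):
        """Stop routing queries to a replica that is experiencing problems."""
        self.healthy.discard(endpoint)

    def pick(self):
        """Return the next healthy replica in rotation."""
        for _ in range(len(self.endpoints)):
            endpoint = self.endpoints[self._next % len(self.endpoints)]
            self._next += 1
            if endpoint in self.healthy:
                return endpoint
        raise RuntimeError("no healthy replicas available")


pool = ReplicaPool(["replica-1", "replica-2"])
first, second, third = pool.pick(), pool.pick(), pool.pick()
# first == "replica-1", second == "replica-2", third == "replica-1"
```

Each incoming query would be sent to whatever `pick()` returns, so adding a replica with `add()` spreads the same load over more servers, and `mark_down()` takes a failing node out of rotation.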

An additional benefit of using the S4 platform for the MOOC is that it provides free semantic graph database instances for up to 10 million triples, so that students who learn about Linked Data and SPARQL can continue improving their skills or start building prototypes without the need to maintain their own semantic graph database (RDF triplestore).

Next Steps

If you are new to Linked Data and the Semantic Web, check out the new MOOC by the University of Southampton, learn about SPARQL, get your own free semantic graph database on S4, and start experimenting with Smart Data prototypes!

Marin Dimitrov

CTO at Ontotext
As the technological captain of Ontotext, he is leading the company on the right tech route and reserving our spot on the map of the world. His sharp mind can explain complex things in a simple way, making him an invaluable resource in semantics. Marin is a frequent speaker at semantic technology conferences, open data meetups and other technology-related events.