How the Self-Service Semantic Suite Is Helping a Linked Data MOOC

An “Introduction to Linked Data & the Semantic Web” MOOC

On Apr 11th the University of Southampton started a new MOOC called “Introduction to Linked Data and the Semantic Web”. The MOOC is free to attend, and certificates will be available upon successful completion of the course.


The course covers important topics, such as:

  • Understanding what Linked Data is,
  • Understanding the importance of Linked Data and its application in various domains and use cases,
  • Getting practical experience with Linked Data, through step-by-step guides and examples.

The main tutor for the MOOC is Dr. Elena Simperl, a researcher & lecturer at the University of Southampton, UK, working in the area of Linked Data, Open Data, Semantic Web and Data Science.

Learning SPARQL with the MusicBrainz Dataset

As part of the Linked Data and Semantic Web MOOC, students will have a chance to learn SPARQL, the query language for Linked Data and semantic graph databases. Several practical examples are based upon the MusicBrainz dataset, the “open music encyclopedia”, providing information about a large number of music artists and albums.

Students will be able to experiment with queries such as “Titles & durations of Beatles tracks”.
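As a rough illustration, a query of this kind might look like the sketch below. The vocabulary (the Music Ontology `mo:` terms, FOAF and Dublin Core) and the endpoint URL are assumptions made for this example and are not taken from the course material; the actual MusicBrainz RDF model may differ. The helper only builds a SPARQL 1.1 Protocol GET request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint URL: replace it with the SPARQL endpoint of your own repository.
ENDPOINT = "https://example.org/repositories/musicbrainz"

# Assumed vocabulary: Music Ontology (mo:), FOAF and Dublin Core.
# The real dataset may model artists, tracks and durations differently.
BEATLES_TRACKS = """
PREFIX mo:   <http://purl.org/ontology/mo/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>

SELECT ?title ?duration
WHERE {
  ?artist a mo:MusicArtist ;
          foaf:name "The Beatles" .
  ?track  a mo:Track ;
          foaf:maker ?artist ;
          dc:title ?title ;
          mo:duration ?duration .
}
ORDER BY ?title
LIMIT 50
"""


def build_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL 1.1 Protocol GET request."""
    # The protocol passes the query text in the URL-encoded "query" parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, BEATLES_TRACKS)
```

Passing `request` to `urllib.request.urlopen` would execute the query against a real endpoint; changing `ORDER BY ?title` to `ORDER BY DESC(?duration)` would list the longest tracks first.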

[Figures: “Beatles Tracks by Title” query; “Beatles Tracks by Duration” query]

To make it easy for students to get started with the SPARQL queries for MusicBrainz, we have configured a version of the GraphDB Workbench as a web app, so that pre-defined query templates are readily available for students to experiment with, and simple chart visualisations can be instantly created over the query results.

[Figures: pre-defined query templates; chart visualisation based on a query]
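To give a feel for what happens between a query and a chart, here is a minimal sketch that turns a response in the standard SPARQL 1.1 Query Results JSON format into rows and a crude text bar chart. It is not how the Workbench itself draws charts. The sample data is made up (placeholder titles and durations, not MusicBrainz values), and the assumption that durations are stored in milliseconds is ours.

```python
import json
from typing import List, Tuple

# Made-up sample in the SPARQL 1.1 Query Results JSON format.
# Titles and durations are placeholders, not real MusicBrainz data.
SAMPLE = """
{
  "head": {"vars": ["title", "duration"]},
  "results": {"bindings": [
    {"title": {"type": "literal", "value": "Track A"},
     "duration": {"type": "literal", "value": "150000"}},
    {"title": {"type": "literal", "value": "Track B"},
     "duration": {"type": "literal", "value": "240000"}},
    {"title": {"type": "literal", "value": "Track C"},
     "duration": {"type": "literal", "value": "90000"}}
  ]}
}
"""


def to_rows(results_json: str) -> List[Tuple[str, float]]:
    """Extract (title, seconds) pairs, longest track first."""
    data = json.loads(results_json)
    rows = []
    for binding in data["results"]["bindings"]:
        title = binding["title"]["value"]
        # Assumption: the duration literal is expressed in milliseconds.
        seconds = int(binding["duration"]["value"]) / 1000
        rows.append((title, seconds))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def text_chart(rows: List[Tuple[str, float]], width: int = 40) -> str:
    """Render one bar per row, scaled so the longest track fills `width`."""
    if not rows:
        return ""
    longest = max(seconds for _, seconds in rows)
    lines = []
    for title, seconds in rows:
        bar = "#" * round(width * seconds / longest)
        lines.append(f"{title:<10} {bar} {seconds:.0f}s")
    return "\n".join(lines)


print(text_chart(to_rows(SAMPLE)))
```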

The Role of the Self-Service Semantic Suite (S4)

The Self-Service Semantic Suite (S4) platform by Ontotext provides developers with instant access to services for text analytics, knowledge graphs and semantic graph database-as-a-service in the Cloud. The main capabilities of the S4 platform include:

  • Text analytics of unstructured content so that entities and the relations between them can be extracted from news, biomedical content and tweets,
  • Semantic graph database-as-a-service so that developers can add, interlink and query semantic facts from open knowledge graphs or extracted from text content via the text analytics services. The semantic graph DBaaS is based on Ontotext GraphDB triplestore (RDF database) and databases are instantly available to developers. The S4 platform takes care of DBA aspects such as configuration, operation, maintenance and scaling,
  • Access to large open knowledge graphs that enhance the semantic analysis process. The knowledge graph integrates different open knowledge graphs such as DBpedia, Wikidata & GeoNames.

The “Introduction to Linked Data and the Semantic Web” MOOC has specific requirements related to the SPARQL samples and exercises:

  • Access to a relatively large knowledge graph (almost 500 million explicit + inferred triples in the MusicBrainz dataset),
  • A mix of simple and complex queries,
  • Concurrent access by a potentially large number of students (hundreds at the same time),
  • High availability of the database storing the samples & exercises data.

The Self-Service Semantic Suite (S4) by Ontotext was a natural fit for the requirements of the MOOC because:

  • It provides fully managed and hosted semantic graph database-as-a-service instances (RDF triplestores), so that the tutors from the University of Southampton did not need to set up & maintain the databases & the underlying infrastructure on their own.
  • The DBaaS is designed for high availability, so that it can be instantly re-deployed and running on a new node in the Cloud, if the original server experiences problems, without interrupting the work of the users and applications connected to the database.
  • The dataset is replicated on multiple DBaaS instances, so that the query load of a large number of users is load-balanced for improved throughput and response time.
  • New hardware resources can be automatically and instantly provisioned in order to adapt to a growing use of the system: new database nodes can be added, so that queries can be load-balanced among more database servers; new nodes hosting the web app with the SPARQL query UI can be instantly provisioned and activated in order to accommodate more concurrent users.
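The replication and scaling ideas in the list above can be illustrated with a generic round-robin balancer. This is only a sketch of the general technique; how S4 actually distributes queries is not described here, and the class name, method names and replica URLs are hypothetical.

```python
from typing import Iterable, List, Set


class RoundRobinBalancer:
    """Rotate queries over replicas, skipping any that are marked as down."""

    def __init__(self, replicas: Iterable[str]) -> None:
        self._replicas: List[str] = list(replicas)
        self._down: Set[str] = set()
        self._next = 0  # position of the next replica to try

    def add_replica(self, url: str) -> None:
        # Scaling out: a newly provisioned node joins the rotation.
        self._replicas.append(url)

    def mark_down(self, url: str) -> None:
        self._down.add(url)

    def mark_up(self, url: str) -> None:
        self._down.discard(url)

    def pick(self) -> str:
        """Return the next healthy replica, trying each one at most once."""
        for _ in range(len(self._replicas)):
            candidate = self._replicas[self._next % len(self._replicas)]
            self._next += 1
            if candidate not in self._down:
                return candidate
        raise RuntimeError("no healthy replica available")


# Hypothetical replica URLs, used only for illustration.
balancer = RoundRobinBalancer([
    "https://node-1.example.org/sparql",
    "https://node-2.example.org/sparql",
])
balancer.mark_down("https://node-1.example.org/sparql")  # simulated failure
balancer.add_replica("https://node-3.example.org/sparql")  # simulated scale-out
target = balancer.pick()
```

Round-robin is the simplest policy to reason about; a production setup would more likely add health checks and weigh nodes by current load.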

An additional benefit of using the S4 platform for the MOOC is that it provides free semantic graph database instances for up to 10 million triples, so that students who learn about Linked Data and SPARQL can continue improving their skills or start building prototypes without the need to maintain their own semantic graph database (RDF triplestore).

Next Steps

If you are new to Linked Data and the Semantic Web, check out the new MOOC by the University of Southampton, learn about SPARQL, get your own free semantic graph database on S4, and start experimenting with Smart Data prototypes!

Marin Dimitrov

CTO at Ontotext
As the technological captain of Ontotext, he is leading the company on the right tech route and reserving our spot on the map of the world. His sharp mind can explain complex things in a simple way, making him an invaluable resource in semantics. Marin is a frequent speaker at semantic technology conferences and open data meetups.
