How the Self-Service Semantic Suite Powers up an Open Data Platform (Part 1)

European Data Forum 2015

The European Data Forum takes place this week in Luxembourg, on Nov 16th and 17th. EDF is one of the leading industry forums for discussing the challenges related to Big Data, Linked Data, Open Data and the emerging “data economy.”


The DaPaaS project, in which Ontotext is a technology partner, is also presenting at EDF2015. The topic of the talk is the DataGraft platform for Open Data publishing, which the DaPaaS project has been developing over the last two years.

DataGraft Platform for Open Data Publishing

Even though the Open Data movement has seen growing adoption among government organisations as well as SMEs, the current Open Data landscape consists mostly of static files with tabular data, which are often difficult to access, integrate and interlink with other data, and in most cases are not enriched with important metadata. Due to these issues, such tabular data files are easy to publish but not easy to reuse. The market needs better tools and approaches that enable publishing and hosting of live data services for improved access to and reuse of Open Data.

The goal of the DataGraft platform is to enable Open Data publishers, who may lack the technical expertise, infrastructure or resources required, to push the data publishing process beyond static tabular files and into live data services, which can be easily and reliably accessed by 3rd party applications and data mashups.


The differentiating capabilities of the DataGraft platform include:

  • Moving beyond tabular data and adopting Linked Data as the means to publish, interlink and reuse data. Data is easily query-able and not just available as a static data dump.
  • Providing frontend tools that support data transformation & data quality processes. Data transformations are reusable and can be shared among different users.
  • Simple and well documented APIs for developers, for accessing key platform services and the Open Data services.
  • Scalable data transformations and hosting, even for a large number of live data services and concurrent users (applications).

The major components of the DataGraft platform include:

  • Grafter – an open source suite of tools for tabular data transformation & processing. Grafter can be used to transform tabular data formats to tabular data formats (for data quality / cleanup purposes), or to Linked Data format (for publishing a live data service on the platform). A key feature of Grafter is that the resulting transformations can be serialised, shared between users and repeatedly executed over data (e.g. data transformation services).
  • Grafterizer – the frontend framework for the Grafter suite. It assists data publishers with creating (Grafter) data transformations and mappings to Linked Data ontologies and vocabularies.
  • The Database-as-a-Service layer which turns the RDF-ised legacy data into live data services, easily accessible and query-able by developers and applications. A key feature of the data layer is scalability and reliability.
  • The Data Portal which provides a catalogue of various datasets (data services) as well as reusable data transformation services.
  • The Personalized and Localized Urban Quality Index (PLUQI) application, which is a Smart City demonstrator on top of the DataGraft platform. PLUQI uses DataGraft to integrate various open datasets (e.g. transportation, weather, crime statistics, financial indicators, etc.) and computes various indexes on well-being and sustainability of cities, and visualizes them through a fancy GUI.
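To make the Grafter idea more concrete, here is a minimal sketch of a reusable tabular-to-Linked-Data transformation. It is written in plain Python and is not Grafter's actual API: the `http://example.org/` namespace, the step names, the `city` resource type and the sample rows are all assumptions made up for illustration. The pipeline itself is just a JSON list of step names, which is one simple way a transformation could be serialised, shared and re-executed over new data.

```python
import csv
import io
import json
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, not a real DataGraft URL


def clean_rows(rows):
    """Tabular-to-tabular step: trim whitespace and drop rows without an id."""
    for row in rows:
        cleaned = {
            key.strip(): (value or "").strip()
            for key, value in row.items()
            if key is not None  # ignore surplus cells that have no header
        }
        if cleaned.get("id"):
            yield cleaned


def escape_literal(value):
    """Escape the characters that N-Triples string literals cannot contain raw."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def rows_to_ntriples(rows):
    """Tabular-to-Linked-Data step: emit one N-Triples line per non-empty cell."""
    for row in rows:
        subject = f"<{BASE}city/{quote(row['id'])}>"
        for column, value in row.items():
            if column == "id" or value == "":
                continue
            predicate = f"<{BASE}prop/{quote(column)}>"
            yield f'{subject} {predicate} "{escape_literal(value)}" .'


# Registry of named steps, so a pipeline can be stored as plain JSON.
STEPS = {"clean_rows": clean_rows, "rows_to_ntriples": rows_to_ntriples}


def run_pipeline(pipeline_json, csv_text):
    """Apply a serialised list of step names to CSV text, in order."""
    data = csv.DictReader(io.StringIO(csv_text))
    for step_name in json.loads(pipeline_json):
        data = STEPS[step_name](data)
    return list(data)


# A shareable transformation: cleanup first, then conversion to triples.
PIPELINE = json.dumps(["clean_rows", "rows_to_ntriples"])

# Made-up sample data with untidy whitespace, a row without an id, and an empty cell.
SAMPLE = "id,name,population\n1, Exampleville ,1000\n,row without id,5\n2,Sampletown,\n"

if __name__ == "__main__":
    for triple in run_pipeline(PIPELINE, SAMPLE):
        print(triple)
```

Because the pipeline is only data, the same `PIPELINE` string can be applied to any CSV file with the same columns, which mirrors the "data transformation services" idea described above.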

The key benefit that DataGraft provides to data workers and application developers is faster publishing of new datasets and updating of existing ones, through a sound methodology and an integrated toolset that support the full linked open data lifecycle.

The Role of the Self-Service Semantic Suite (S4)

The Self-Service Semantic Suite (S4) provides an integrated platform for on-demand Smart Data management. With S4, developers get instant access to various capabilities for text analytics, knowledge graphs and an RDF graph database-as-a-service in the cloud. By providing an easily and instantly accessible set of services, the S4 platform enables faster and cheaper prototyping of applications for Smart Data analytics.

One of the core capabilities of the S4 platform – the RDF graph database-as-a-service (DBaaS) – plays an important role in the DataGraft platform too. It enables instant deployment of RDF databases in the cloud, which are used to host Open Data that has been transformed from tabular files into Linked Data. The RDF databases make it possible to publish Open Data as live data services, accessible for querying via SPARQL and via a simple RESTful API, as opposed to the legacy static data files.
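As an illustration of what "accessible for querying via SPARQL" can look like from an application, here is a minimal Python sketch that follows the standard SPARQL 1.1 Protocol (a GET request with a `query` parameter) and reads the standard SPARQL JSON results format. The endpoint URL is a placeholder and not a real DataGraft or S4 address; the actual URLs and any authentication requirements are described in the platform documentation.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint, used only for illustration.
ENDPOINT = "https://example.org/repositories/demo"

QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_sparql_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_bindings(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    document = json.loads(payload)
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in document["results"]["bindings"]
    ]


def run_query(endpoint, query):
    """Send the query over HTTP and return the parsed result rows."""
    request = build_sparql_request(endpoint, query)
    with urllib.request.urlopen(request) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

Calling `run_query(ENDPOINT, QUERY)` against a real SPARQL endpoint would return one dictionary per result row, keyed by variable name. Keeping request building and result parsing in separate functions makes both easy to check without network access.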

The DBaaS infrastructure of the S4 platform, which is also integrated with the DataGraft platform, takes care of provisioning, deployment, availability, scalability, monitoring and operations, so that the partners operating the DataGraft platform won’t need to spend time on such tasks. Additionally, DataGraft benefits from automated upgrades to newer and improved versions of the RDF database used (GraphDB™ by Ontotext). The DBaaS infrastructure of S4 is also designed for elasticity, so that it automatically adapts to a growing number of hosted datasets or to a growing query/API request rate.

Final Words

DataGraft provides an Open Data platform which makes data publishing and consumption easier, faster and cheaper. The Self-Service Semantic Suite (S4) platform by Ontotext plays a key role in DataGraft by providing an RDF graph database-as-a-service in the cloud, for hosting a large number of Open Data sets (transformed into Linked Data), and by enabling live data services via simple RESTful APIs for 3rd party applications.

More information on DataGraft and S4 is available here:

  • DataGraft documentation
  • Low-cost Open Data As-a-Service in the Cloud: paper & video on the RDF graph database-as-a-service available via S4
Marin Dimitrov

CTO at Ontotext
As the technological captain of Ontotext, he is leading the company on the right tech route and reserving our spot on the map of the world. His sharp mind can explain complex things in a simple way, making him an invaluable resource in semantics. Marin is a frequent speaker at semantic technology conferences, open data meetups and other technology events.
