How the Self-Service Semantic Suite Powers up an Open Data Platform (Part 2)

Back in November 2015 we published a short blog post on the DataGraft platform for Open Data publishing, powered by the Self-Service Semantic Suite (S4). DataGraft will also be presented and demonstrated at the upcoming European Semantic Web Conference (ESWC) in May 2016, so that developers and Open Data enthusiasts will have a chance to learn more about the platform and interact with the international team behind it.

The 13th European Semantic Web Conference

ESWC 2016 will take place from May 29th to June 2nd. The DataGraft platform will be presented and demonstrated in two of its sessions: the Developers Hackshop on Monday and the demo session on Thursday.

Short summaries of the platform and the demo scenario are available in the conference and workshop papers.

Easy Open Data Publishing with DataGraft

The Open Data movement has grown in popularity in recent years, but publishing and consuming such data on a larger scale still face significant obstacles:

  • the high technical complexity of preparing Open Data for publication;
  • the considerable cost of publishing data and providing reliable access to it;
  • the poorly maintained and fragmented supply of Open Data.

DataGraft is a cloud-based platform with the goal of making Open Data (and Linked Data) publishing and reuse easier and cheaper. Its main capabilities are:

  • Interactive design of data transformations: publishers get immediate feedback on how each transformation changes the data;
  • Repeatable transformations: data transformation processes often need to be re-executed as new data arrives;
  • Reusable transformations: reusing and extending data transformations created by other developers further lowers the cost of data publication;
  • Reliable data access: reliable data provisioning is key for third-party data services and applications that use Open Data.
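To make the "repeatable" and "reusable" ideas above concrete, here is a minimal sketch in Python. It is not DataGraft or Grafter code: the step functions (`drop_empty`, `rename`), the `run` helper and the sample CSV data are all hypothetical names invented for illustration. The sketch only assumes that a transformation can be modelled as an ordered list of small steps.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, str]
Step = Callable[[Iterable[Row]], Iterator[Row]]


def drop_empty(column: str) -> Step:
    """Build a step that skips rows whose `column` is blank (hypothetical helper)."""
    def step(rows: Iterable[Row]) -> Iterator[Row]:
        for row in rows:
            if row.get(column, "").strip():
                yield row
    return step


def rename(old: str, new: str) -> Step:
    """Build a step that renames one column in every row (hypothetical helper)."""
    def step(rows: Iterable[Row]) -> Iterator[Row]:
        for row in rows:
            row = dict(row)  # copy so the input row is left untouched
            row[new] = row.pop(old)
            yield row
    return step


def run(steps: List[Step], rows: Iterable[Row]) -> Iterator[Row]:
    """Chain the steps lazily; rows are pulled through one at a time."""
    for step in steps:
        rows = step(rows)
    return iter(rows)


# A published transformation is modelled here as an ordered list of steps.
base_pipeline: List[Step] = [drop_empty("price")]

# "Forking" it: reuse the published steps and append one more.
forked_pipeline: List[Step] = base_pipeline + [rename("price", "price_eur")]

# Repeatable: the same pipeline runs unchanged on each new batch of data.
january = io.StringIO("id,price\n1,100\n2,\n")
february = io.StringIO("id,price\n3,250\n")

jan_rows = list(run(forked_pipeline, csv.DictReader(january)))
feb_rows = list(run(forked_pipeline, csv.DictReader(february)))
```

Because every step is a generator, rows flow through the pipeline one at a time, so the same definition can be applied to a small preview sample or to a full dataset without changes.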

DataGraft Components

The key components of the DataGraft platform include:

  • Grafter, an open source framework of reusable components for complex data transformations. The main advantages of Grafter over similar ETL frameworks include: 1) efficient support for very large datasets (due to its streaming approach for data processing); 2) its highly modular and extensible design; 3) the ability to serialize and execute transformations as services in a sandboxed environment.
  • Grafterizer, an open source web-based tool for data cleaning and transformation built on top of Grafter. It provides an interactive UI for: 1) forking existing data transformations; 2) creating complex data transformation workflows; and 3) previewing a data transformation live over sample data.
  • The Semantic Graph Database, which manages the Linked Data hosted on the platform. With a database-as-a-service (DBaaS) solution, data publishers do not need to deal with administrative overheads such as installation, upgrades, maintenance and provisioning. For both data publishers and data consumers, the DBaaS provides standard APIs and endpoints for Linked Data access, querying, and management. The Semantic Graph Database of the platform is based on the Self-Service Semantic Suite (S4) and is already hosting hundreds of free database instances in the Cloud.
  • The Open Data portal, which integrates the components in a simple wizard-like web interface. The platform also provides simple visualization widgets: tables, charts and maps.
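As an illustration of what "standard APIs and endpoints" for querying Linked Data can look like, the sketch below builds an HTTP request under the assumption that the database exposes a SPARQL 1.1 Protocol endpoint. The post does not name the protocol or any URL: the endpoint address and the `build_sparql_request` helper are hypothetical, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint URL; a real one would be provided by the database service.
ENDPOINT = "https://example.org/repositories/my-dataset"

# A generic query that lists up to ten triples from the dataset.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL 1.1 Protocol query via HTTP GET."""
    # The protocol passes the query text in the `query` URL parameter.
    url = endpoint + "?" + urlencode({"query": query.strip()})
    # Ask for results in the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)
```

Against a real endpoint, the request could be sent with `urllib.request.urlopen(request)` and the response body parsed with `json.loads`. Authentication, if the service requires it, is left out of this sketch.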

DataGraft + ProDataMarket

DataGraft will be demonstrated at ESWC 2016 with sample data and a use case from the ProDataMarket research project, which aims to build a data marketplace for property-related data, so that innovative applications and services can easily be built on top of such data.


Meet Us at ESWC 2016

If you are attending the event, you should swing by the Developers Hackshop on Monday or the demo session on Thursday, and meet with the international team behind the DataGraft platform to learn how it can help you publish Open Data!

Marin Dimitrov

CTO at Ontotext
As the technological captain of Ontotext, he leads the company on the right tech route and secures our spot on the map of the world. His sharp mind can explain complex things in a simple way, making him an invaluable resource in semantics. Marin is a frequent speaker at semantic technology conferences and open data meetups.
