How the Self-Service Semantic Suite Powers up an Open Data Platform (Part 2)
Back in November 2015 we published a short blog post on the DataGraft platform for Open Data publishing, powered by the Self-Service Semantic Suite (S4). DataGraft will also be presented and demonstrated at the upcoming European Semantic Web Conference (ESWC) in May 2016, so that developers and Open Data enthusiasts will have a chance to learn more about the platform and interact with the international team behind it.
The 13th European Semantic Web Conference
The ESWC will take place from May 29th to June 2nd, 2016. The DataGraft platform will be presented and demonstrated in two of its sessions: the Developers Hackshop on Monday and the demo session on Thursday.
Short summaries of the platform and the demo scenario are available in the conference and workshop papers.
Easy Open Data Publishing with DataGraft
The Open Data movement has grown in popularity in recent years, but publishing and consuming such data on a larger scale still faces significant obstacles:
- the high technical complexity of preparing Open Data for publication;
- the considerable cost of publishing data and providing reliable access to it;
- the poorly maintained and fragmented supply of Open Data.
DataGraft is a cloud-based platform with the goal of making Open Data (and Linked Data) publishing and reuse easier and cheaper. Its main capabilities are:
- Interactive design of data transformations: publishers get immediate feedback on how each transformation changes the data;
- Repeatable transformations: data transformation processes often need to be repeatedly executed as new data arrives;
- Reusable transformations: reusing and extending data transformations created by other developers further lowers the data publication cost;
- Reliable data access: reliable data provisioning is essential for third-party data services and applications that use Open Data.
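To make the first three capabilities more concrete, here is a minimal sketch in Python. It is not DataGraft or Grafter code (Grafter has its own API); the names `drop_empty`, `rename`, `run` and `preview` are hypothetical and exist only for this illustration. The sketch assumes a transformation can be modelled as an ordered list of row-level steps, which makes it repeatable (run it again on new data), reusable (extend a copy of someone else's list) and previewable (show a small sample after each step).

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, str]
Step = Callable[[Iterator[Row]], Iterator[Row]]


def drop_empty(column: str) -> Step:
    """Build a step that discards rows whose `column` is missing or blank."""
    def step(rows: Iterator[Row]) -> Iterator[Row]:
        return (row for row in rows if row.get(column, "").strip())
    return step


def rename(old: str, new: str) -> Step:
    """Build a step that renames column `old` to `new`."""
    def step(rows: Iterator[Row]) -> Iterator[Row]:
        for row in rows:
            yield {(new if key == old else key): value for key, value in row.items()}
    return step


def run(steps: List[Step], rows: Iterable[Row]) -> Iterator[Row]:
    """Apply the steps in order, lazily, one row at a time."""
    stream = iter(rows)
    for step in steps:
        stream = step(stream)
    return stream


def preview(steps: List[Step], sample: Iterable[Row]) -> List[List[Row]]:
    """Return the sample as it looks after 0, 1, 2, ... steps of the pipeline."""
    sample = list(sample)
    return [list(run(steps[:n], sample)) for n in range(len(steps) + 1)]


# A transformation published by one developer ...
base = [drop_empty("address")]
# ... and a "fork" that reuses it and adds one more step.
forked = base + [rename("address", "street_address")]

# Hypothetical sample data, made up for this illustration.
sample = [{"id": "1", "address": "1 Main St"}, {"id": "2", "address": " "}]
stages = preview(forked, sample)
result = list(run(forked, sample))
```

Because `run` works on iterators, the same pipeline can be pointed at a newly arrived file without loading it fully into memory, while `preview` gives the kind of step-by-step feedback described in the list above.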
The key components of the DataGraft platform include:
- Grafter, an open source framework of reusable components for complex data transformations. The main advantages of Grafter over similar ETL frameworks include: 1) efficient support for very large datasets (due to its streaming approach for data processing); 2) its highly modular and extensible design; 3) the ability to serialize and execute transformations as services in a sandboxed environment.
- Grafterizer, an open source web-based tool for data cleaning and transformation built on top of Grafter. It provides an interactive UI for: 1) forking existing data transformations; 2) creating complex data transformation workflows; and 3) previewing a data transformation live over sample data.
- The Semantic Graph Database, which manages the Linked Data hosted on the platform. With a database-as-a-service (DBaaS) solution, data publishers do not need to deal with administrative overheads such as installation, upgrades, maintenance and provisioning. From the point of view of a data publisher or a data consumer, the DBaaS provides standard APIs and endpoints for Linked Data access, querying, and management. The Semantic Graph Database of the platform is based on the Self-Service Semantic Suite (S4) and already hosts hundreds of free database instances in the Cloud.
- The Open Data portal integrates the components together in a simple wizard-like web interface. The platform also provides simple visualization widgets: tables, charts and maps.
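As an illustration of what "standard APIs and endpoints" for Linked Data querying usually means in practice, the sketch below builds a query request following the W3C SPARQL 1.1 Protocol (an HTTP GET with a `query` parameter). This is an assumption about how a data consumer might talk to such a database, not documented DataGraft or S4 usage: the endpoint URL is a placeholder and `build_sparql_request` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint, NOT a real DataGraft or S4 URL.
ENDPOINT = "https://example.org/repositories/demo/sparql"

# A generic SPARQL query: the first ten triples in the dataset.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL 1.1 Protocol query via HTTP GET."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
```

Passing `request` to `urllib.request.urlopen` would send it; the sketch stops short of that because the endpoint here is only a placeholder.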
DataGraft + ProDataMarket
DataGraft will be demonstrated at ESWC 2016 with sample data and a use case from the ProDataMarket research project, which aims to build a data marketplace for property-related data, so that innovative applications and services can easily be built on top of such data.
Meet Us at ESWC 2016
If you are attending the event, you should swing by the Developers Hackshop on Monday or the demo session on Thursday, and meet with the international team behind the DataGraft platform to learn how it can help you publish Open Data!
Marin Dimitrov, CTO at Ontotext
As the technological captain of Ontotext, Marin leads the company on the right tech route and reserves our spot on the world map. His sharp mind can explain complex things in a simple way, making him an invaluable resource in semantics. He is a frequent speaker at semantic technology conferences, open data meetups and other technology-related events.