A few weeks ago I gave a presentation titled “Enabling low-cost Open Data Publication and Reuse” at the Data Summit Brussels. The presentation was based on ongoing work in one of the EC-funded research projects that Ontotext participates in: DaPaaS, which aims to develop a platform for Open Data publishing and access.
In recent years, Open Data initiatives have been growing at a rapid pace worldwide. More and more data (mostly from government organizations) has been made available for open access. Organizations such as the Open Data Institute have made it their mission to educate government organizations and SMEs on how to publish, utilize and monetize Open Data. Open Data has significant potential to improve the transparency and quality of public services, as well as to optimize costs and foster innovation in various industry sectors. In the McKinsey report titled “Open data: Unlocking innovation and performance with liquid information”, analysts wrote:
“Our research suggests that seven sectors alone could generate more than $3 trillion a year in additional value as a result of open data, which is already giving rise to hundreds of entrepreneurial businesses and helping established companies to segment markets, define new products and services, and improve the efficiency and effectiveness of operations”
At the same time, there are some challenges to the wider adoption of Open Data, and the quality of data is a significant one. Many organizations are obligated to open up their data, but lack the expertise, tooling, resources or sustainability plans needed to make this data useful, to improve its quality and to maintain it with regular updates and enhancements. Most of the Open Data available these days is really just plain CSV files of questionable quality, which are difficult to access and use. While some organizations will cite the hundreds of thousands of open datasets available as proof of the success of the Open Data movement, very few have taken a more critical look at the quality, usage statistics, and value that most of these datasets provide. A further limiting factor is that many organizations lack the expertise, resources and commitment to make data available as live data services and APIs that are easily accessible to third-party applications.
DaPaaS is an EC-funded research project that aims to make it easier to publish and reuse Open Data. The partners in the project are Ontotext (Bulgaria), SINTEF (Norway), Swirrl (UK), Open Data Institute (UK), and Sirma Mobile (Bulgaria), as well as an associated partner from South Korea: Saltlux.
The DaPaaS project has chosen the Linked Data paradigm as a way to publish and consume Open Data, so that the data can be better described, interlinked and queried in ways that are not possible with the traditional approaches of CSV files or very simple Web APIs providing access to Open Data. The key building blocks of the DaPaaS platform include:
The DaPaaS Open Data platform will soon be open to the general public. More information is available via Twitter and email. For details on my presentation, view the slide deck from my talk on SlideShare.