Springer Nature Uses LOD to Create a Rich Database for Scientists to Work Together

Springer Nature advances discovery by publishing robust and insightful research, supporting the development of new areas of knowledge, making ideas and information accessible around the world, and leading the way on open access. As a research publisher, it is home to trusted brands including Springer Nature Research, BioMed Central, Palgrave Macmillan and Scientific American. Springer Nature is also a leading educational and professional publisher, providing quality content through a range of innovative platforms, products and services.

In 2017, the scientific publisher chose Ontotext’s semantic graph database GraphDB™ to power its new Linked Open Data platform, Springer Nature SciGraph, aggregating data sources from Springer Nature and key partners from the scholarly domain.

The Goal

Springer Nature’s archive goes back to 1815 and currently amounts to 13 million publications. There are about 500 000 documents published every year (articles, chapters, journals, books, etc.) and this figure is expected to increase significantly in the future.

But the long-term goal is not just to create a repository of publications where Springer Nature can make available the content it creates and maintains. The new platform for scientists will be richly interlinked at both the user and the content level and will be powered by a knowledge graph representing the concepts that Springer Nature’s readers care about.

The Challenge

In research communities, as in many enterprises, there are numerous islands of content where knowledge is locked. The challenge is to make all the pieces of that knowledge talk to each other to allow better discoverability of information and ultimately its optimal use.

Some of the biggest problems when content is scattered and disconnected are:

  • content integration issues;
  • no opportunity for content analytics;
  • redundancies in production workflows;
  • lack of discoverability;
  • underutilized archival material.

For example, a site organized around articles, journals and issues does little for scientists who are interested in answering questions about real-world things. Simply put, search engines do not know how to find the wealth of scientific content about objects in the real world in order to provide users with what they are looking for.

The Solution: An LOD Platform Powered by GraphDB

At a high level of abstraction, there are three areas of knowledge that scientists are most concerned about: the science they are interested in, the documents they read and write, and the people who carry out scientific research. This simplified picture can be further broadened with, for example, the institutions people work for, the organizations funding their research, the conferences they attend, the research groups they belong to, and many more. Such a picture is best modeled in a knowledge graph, which represents real-world objects and the relationships between them.

[Figure: Springer Nature use case]

As illustrated in this example, the knowledge graph enables users to easily follow the relationships researchers have with everything they do and it does it in a way that closely resembles human thinking.
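The idea of following relationships through a knowledge graph can be sketched in a few lines of Python. This is only an illustration, not how SN SciGraph or GraphDB is implemented: the identifiers (`person:alice`, `article:a1`, and so on), the predicate names, and the helper functions `build_index` and `follow` are all hypothetical and chosen for the example.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples; every name is illustrative.
TRIPLES = [
    ("person:alice", "authorOf", "article:a1"),
    ("person:alice", "memberOf", "org:uni_x"),
    ("article:a1", "about", "topic:graphene"),
    ("article:a1", "publishedIn", "journal:j1"),
    ("grant:g1", "funds", "person:alice"),
    ("org:funder_y", "awards", "grant:g1"),
]


def build_index(triples):
    """Index outgoing edges by subject for quick lookup."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[s].append((p, o))
    return index


def follow(index, start, path):
    """Follow a sequence of predicates from `start`; return the nodes reached."""
    frontier = {start}
    for predicate in path:
        frontier = {
            o
            for node in frontier
            for p, o in index.get(node, [])
            if p == predicate
        }
    return frontier


index = build_index(TRIPLES)

# Which topics does this (hypothetical) researcher write about?
print(sorted(follow(index, "person:alice", ["authorOf", "about"])))

# Which institutions does this (hypothetical) funder ultimately support?
print(sorted(follow(index, "org:funder_y", ["awards", "funds", "memberOf"])))
```

Each question becomes a short path of relationships rather than a join across separate content silos, which is the property the article attributes to the knowledge graph.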

SN SciGraph, which is projected to contain 1.5 to 2 billion triples, makes use of such a knowledge graph database, GraphDB, thanks to its ability to handle massive loads, querying, and inferencing in real time.
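To give a feel for what inferencing over triples means, here is a minimal sketch of naive forward chaining over two RDFS-style rules (transitive `subClassOf`, and propagating `type` to superclasses). It is a toy under stated assumptions: the class names and `doc:42` are made up, the `infer` function is hypothetical, and it says nothing about how GraphDB actually performs or optimizes inference.

```python
def infer(triples):
    """Apply two RDFS-style rules repeatedly until no new facts appear."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p not in ("subClassOf", "type"):
                continue
            for s2, p2, o2 in facts:
                if p2 == "subClassOf" and s2 == o:
                    # subClassOf is transitive; instances of a class are
                    # also instances of each of its superclasses.
                    new.add((s, p, o2))
        if new <= facts:  # fixpoint reached: nothing new was derived
            return facts
        facts |= new


# Hypothetical schema and data, for illustration only.
facts = infer([
    ("ReviewArticle", "subClassOf", "Article"),
    ("Article", "subClassOf", "Publication"),
    ("doc:42", "type", "ReviewArticle"),
])

# The fact that doc:42 is a Publication was never stated, only derived.
print(("doc:42", "type", "Publication") in facts)
```

A query for all publications would then also return `doc:42`, even though the source data only described it as a review article.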

By seamlessly integrating disparate silos of content, GraphDB allows Springer Nature’s LOD platform to comprise metadata from journals and articles, books and chapters, organizations, institutions, funders, research grants, patents, clinical trials, substances, conference series, events, citations and reference networks, Altmetrics, and links to research datasets.

[Figure: Springer Nature SciGraph diagram]

Ontotext’s technology helps SN SciGraph to collate high-quality content from trusted and reliable sources across the research landscape. This high-quality data provides a rich semantic description of how information is related and visualizes the scholarly domain in interesting new ways.

Some of the main benefits of the new LOD platform powered by GraphDB are:

  • overcoming internal and external content silos in research communities;
  • broadening users’ perspective by revealing semantic relations visually;
  • encouraging developers to reuse Springer Nature’s datasets;
  • easily accessing high quality content from trusted and reliable sources;
  • finding optimal content for analysis and recommendation tools for funders, librarians, conference organizers, etc.;
  • increasing the discoverability of publications by making large parts of the datasets freely accessible (CC BY-NC 4.0 license).

Springer Nature has a longstanding commitment to making science more accessible and to helping scientists work together. Thanks to SN SciGraph, the company now belongs to the vanguard of LOD providers and has assumed a leading role among open data publishers and open research supporters.

