Semantic Search: The Paradigm Shift from Results to Relationships

“Sorry, no content matched your criteria” is probably one of the most frustrating messages we can get after a search, at a time when more and more of the world’s information is supposed to be at our fingertips, seemingly a tap or a click away. If we look behind that all-too-frequent frustration, though, we’ll see an important reminder from today’s data-driven world:

The potential of data for knowledge discovery is only as big as the capacity to intelligently search through these data.

Understanding Semantic Search Through a Verse From Antiquity

Semantic search is what opens the door for such intelligent information retrieval.

When applied to machines, the term intelligent raises more than a few eyebrows. The debate that stirs whenever machines are said to understand the meaning of content, however, subsides the moment one is faced with a pressing need to find the nearest pharmacy or to speed up a research project. Not surprisingly, in such cases we gladly use an elaborate computational procedure to sift through data and pick the most relevant results for us, as quickly and as accurately as possible.

Semantic search does exactly that. It is an approach to querying data that seeks to understand (i.e. to compute) the intent and the context around a query in order to retrieve the resources most pertinent to the particular information request.

To grasp the semantic approach to search, it is useful to look a couple of millennia back, where an unexpected perspective on the process of understanding emerges.

Launched on the void, assail it not as yet
With keen-edged sickle, but let the leaves alone
Be culled with clip of fingers here and there.

Verg. G. 2.366

The verb “cull” in the last verse is a translation of the Latin verb “intellegere”. This Latin word literally means “to choose from”; it is formed from inter ‘between’ and legere ‘choose’. With time, intellegere came to denote “understand” and went on to develop more elaborate meanings. Its present participle (intelligens, i.e. discerning) is the origin of the word intelligent.

That explained, the verse above can be read as another reminder, this time from the distant past, restating the fundamentals of understanding: the ability to pick from.

Bye Bye Keywords, Hello Networks of Relationships

It is our capacity to tell noise from actionable information that moves us forward on our quest for the right information and, ultimately, answers. What the semantic approach to information retrieval does is enhance this capacity by using the analytical power of computers to dig through huge amounts of data and surface their interconnections for us.

Neither text nor any other type of content can be exhaustively defined by its exact textual representation. Both are better seen as a fabric of relationships: not a mere sum of exact words and phrases, but a network of connected entities. It is by analyzing the relational aspects of these entities that semantic search is able to address complex queries, foster knowledge discovery and take information retrieval to the next level: from a list of results based solely on keyword matching to a set of connections pertinent to the intent and the context of the specific query.

Leveraging Semantic Web technologies, specifically, data represented in RDF and organized in formal collections of related entities (ontologies), semantic search turns the process of looking for information from “dumb” term matching into asking questions and getting answers.

By definition, semantic search reaches out beyond keywords and seeks to understand the semantics of the search query. It improves search accuracy by looking at both data and their connections. Instead of more links, which are only a single kind of relation, the algorithm presents you with a networked view of relations, facts and information you might not have known existed.
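The contrast between term matching and querying relationships can be made concrete with a minimal sketch. It assumes a toy in-memory set of subject-predicate-object triples standing in for RDF data; the documents, entity names, predicates and the helper functions `keyword_search`, `match` and `works_by` are all invented for illustration. A real system would more likely keep its data in a triplestore and query it with SPARQL.

```python
# Hypothetical documents and triples, invented purely for illustration.
DOCUMENTS = {
    "doc1": "Virgil wrote the Georgics in Latin.",
    "doc2": "The Aeneid is an epic poem about Aeneas.",
}

TRIPLES = {
    ("Virgil", "wrote", "Georgics"),
    ("Virgil", "wrote", "Aeneid"),
    ("Georgics", "writtenIn", "Latin"),
    ("Aeneid", "writtenIn", "Latin"),
    ("Aeneid", "hasCharacter", "Aeneas"),
}


def keyword_search(term):
    """Return ids of documents whose text contains the term (case-insensitive)."""
    return sorted(
        doc_id for doc_id, text in DOCUMENTS.items()
        if term.lower() in text.lower()
    )


def match(subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        (s, p, o) for (s, p, o) in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    )


def works_by(author):
    """Answer 'what did this author write?' by following the 'wrote' relation."""
    return [o for (_, _, o) in match(subject=author, predicate="wrote")]


# Keyword matching only finds the document that literally mentions the name.
print(keyword_search("Virgil"))
# The relationship query also surfaces the Aeneid, which doc2 never links to Virgil.
print(works_by("Virgil"))
```

The keyword search misses the second document because the author's name never appears in it, while the relationship query reaches both works through an explicit connection.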

Semantic Search Across the Web: The Inevitable Shift

“Search is key to making the Web useful, creating order out of its chaotic data, and making it navigable”, writes author David Amerland in his book Google Semantic Search (op. cit., p. 13).

The adoption of semantic technologies is inevitable because of their potential to model real-world complexity and manage resources and their interrelations in a machine-readable format. In contrast to search based on the occurrence of words in documents, querying interconnected pieces of data whose chain of relations can be followed allows for a deeper and broader search experience.
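What "following a chain of relations" might look like can be sketched in a few lines. This is a minimal illustration, assuming a small list of made-up triples and a hypothetical `find_chain` helper that runs a breadth-first search; it is not how any particular search engine or triplestore works.

```python
from collections import deque

# Hypothetical triples, invented purely for illustration.
TRIPLES = [
    ("Alice", "worksFor", "Acme"),
    ("Acme", "locatedIn", "Sofia"),
    ("Sofia", "capitalOf", "Bulgaria"),
    ("Bob", "worksFor", "Globex"),
]


def find_chain(start, goal):
    """Breadth-first search for the shortest chain of relations from start to goal.

    Returns the list of triples along the chain, or None if no chain exists.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for s, p, o in TRIPLES:
            if s == node and o not in visited:
                visited.add(o)
                queue.append((o, path + [(s, p, o)]))
    return None


# Print each hop that connects Alice to Bulgaria.
for step in find_chain("Alice", "Bulgaria"):
    print(step)
```

No single triple connects the two entities, and no document needs to mention both; the answer emerges from walking the relations one hop at a time.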

On the Web, semantic search profoundly changes the landscape of SERPs. Google, Bing, Yahoo! and Yandex, to mention the major search engines, constantly optimize their algorithms so as to be able to return richer results and even answers to search queries. In their effort to enable Web-scale exchange of structured data through Schema.org (a collaborative, community activity creating, maintaining, and promoting schemas for structured data on the Internet, sponsored by Google, Microsoft, Yahoo and Yandex), search engines also enable the publishing of more and more semantic web data. This in turn makes semantic search more precise and reliable. The more data an algorithm is presented with, the better the chances it can accurately assess and verify them.
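To give a sense of what such structured data can look like, here is a minimal sketch that describes this very article with the Schema.org `Article` and `Person` types and serializes it as JSON-LD. The `article_jsonld` helper is a hypothetical name, and which properties a given search engine actually reads is not something this sketch assumes.

```python
import json


def article_jsonld(headline, author_name):
    """Build a Schema.org Article description and return it as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
    }
    return json.dumps(data, indent=2)


markup = article_jsonld(
    "Semantic Search: The Paradigm Shift from Results to Relationships",
    "Teodora Petkova",
)
print(markup)
```

Embedded in a page, a description like this states explicitly that the text is an article and that its author is a person, rather than leaving a crawler to infer it from the words alone.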

In short, semantic search across the web is becoming smarter, and its potential to satisfy the need for relevant results, save time and provide a better user experience is growing. This is what keeps search engines in business. The same applies to enterprise semantic search: more and more companies realize the dire need for better information retrieval systems and more agile ways of managing knowledge across their structures.

Enterprise Semantic Search: Four Benefits and a Challenge

Within organizations and closed enterprise systems, implementing semantic search translates into efficient use of enterprise content. Semantic search built on top of an existing content management system brings new dimensions to extracting usable information out of huge amounts of heterogeneous data. This solves one of the big problems many organizations face: massive volumes of dark data that is hard for content creators to discover.

By enabling the navigation of semantically integrated data, semantic search uncovers hidden relationships, gathers information beyond keywords and saves employees hours of fishing for disparate data scattered across multiple resources.

Despite these advantages, incorporating semantic search technology is still a challenge for many organizations. Semantic technologies, semantic search included, gain traction slowly: businesses are still unsure what these technologies can buy them, or are unwilling to invest long-term. Looking for concrete short-term benefits, they overlook the opportunities that interlinking their data, content and the web opens up.

The Quest for Meaning

The good news is that major web search engines and larger organizations are already paving the road to a more meaningful web and more efficient enterprise content management systems, establishing good semantic search practices that others can take advantage of.

Steadily, the algorithms that understand the semantics of our searches are becoming smarter. With a smarter and more precise information retrieval approach, more correlations are being found, more clues are being presented and, ultimately, more breakthroughs are being made. Semantic search proves to be not only a tool for exploring and retrieving information, but also a powerful way to cull knowledge out of data and to really help us put the world’s information at our fingertips.

Teodora Petkova

Teodora is a philologist fascinated by the metamorphoses of text on the Web. Curious about our networked lives, she explores how the Semantic Web vision unfolds, transforming the possibilities of the written word.
  • For more on “relationships at the heart of semantics and the semantic web”: http://www.slideshare.net/apsheth/relationship-web-trailblazing-analytics-and-computing-for-human-experience

  • Thank you for the link!

  • Christopher Courington

    I LOVE this post by Ms. Petkova. Two questions come to mind. First, not from antiquity, but from the early 20th century: the Semiotics of Charles Sanders Peirce would seem to offer a potentially robust and exciting structure to facilitate the capture of context and intent through a matrix of mathematically deduced signifiers (I am not the person who can create such equations, however). Second, however, I am concerned about a far more nuanced aspect of knowledge discovery: the fortuitous or unexpected accident, which usually emerges as a result of laborious elimination of many dead ends. It would seem that no matter the capability of semantic search, it would eliminate the often marvelous unexpected discovery that is a fundamental part of all scientific progress. Do you share this concern, or am I making much ado about nothing?

  • Thank you, Christopher Courington, for reading and loving 🙂 Your concern is not much ado about nothing at all. This is the concern of search “designers” too. Serendipity is much sought after, as it is clear that knowledge is by far not only logic and analysis (plus, knowledge in a bubble is not knowledge). You might like this paper: Discovery Is Never by Chance: Designing for (Un)Serendipity – http://goo.gl/snA8jP.
    As for the semiotics part, capturing context and intent is exactly what semantic search is all about: context = relationships (linked data), intent = personalization (based on previous history of interactions).
