Text Mining in the Cloud: Ontotext Releases “S4” – The Self Service Semantic Suite

Text Mining in the Cloud – Enterprise Technology for the Mid Market

If you thought that enterprise semantic technology was just for Fortune 2000 companies, think again. The very same semantic technology is now available in the Amazon Cloud for a fraction of the price. It’s called “Ontotext S4,” which stands for “The Self Service Semantic Suite.” S4 provides text mining in the cloud along with other semantic technology – all in a hosted environment. Before we discuss the components, let’s talk about why businesses use semantics.

Over the last 5-10 years, businesses and government entities have focused on storing documents in content stores, allowing them to search more effectively. This is helpful since the search retrieves a set of relevant documents based on titles and the metadata that describes the document. But the vast majority of the real intelligence – the people, places, organizations, events AND relationships to one another – is stored in the free flowing text WITHIN the document. It’s these semantic facts that need to be extracted and searched – along with the document – to isolate the most salient facts. Most people call this “text mining”, “text analysis” or “natural language processing”. It’s so much more. When users do this, they uncover real meaning. They understand how people are connected. They get right to the heart of the matter which, in turn, allows them to make decisions faster. Up until now, this high performance technology was only available using enterprise pricing and deployment models. Organizations needed to buy the software licenses, provision the hardware, and install and maintain the applications. With S4, a new and refreshingly easy approach is possible.

S4  has been created for developers and system integrators interested in building applications.  S4 provides on-demand and inexpensive building blocks to create these applications. What types of applications can developers build?    The list is long but here are some ideas:

  1. Law firms can analyze news to see if people associated with a workman’s compensation case are playing soccer on weekends.
  2. Life Sciences companies can extract meaning from biomedical research or search clinical trials to match a treatment strategy to a diagnosis.
  3. Media companies can mine the facts inside historical news articles, annotate them and store results inside a graph database.  Authors working on articles can be prompted with relevant content making them more efficient while also creating more relevant news stories.
  4. Banks and brokerages can analyze text hidden in financial instruments used to describe investments and other products.  They can make the documents discoverable online.
  5. Consumer packaged goods companies can measure trends in sentiment for new or existing products.
  6. Retailers can mine product descriptions, customer comments and text on competitive websites.  Facts can be stored and used in search and discovery applications.
  7. Manufacturers can analyze the text in service manuals allowing service departments and distribution partners to quickly retrieve a very precise section of the manual that needs to be reviewed.
  8. Healthcare organizations and associations can analyze the text in specialty focused journals, combine the results with patient record analysis and lab results providing access to targeted information physicians can use to improve treatment outcomes.


What’s included?    S4 provides a scalable and resilient set of cloud based services that a developer can use to build a semantic application. These tools include:

  • Text mining services for news, life science & bio-medical texts and social media that allow you to extract valuable meaning and insights from your content.
  • On-demand access to key Linked Open datasets, such as DBpedia, Freebase and GeoNames. With tens of billions of semantic facts stored in this semantic knowledge base, organizations can enrich their semantic analysis, leading to improved search results and more insights.
  • An enterprise class triplestore, GraphDB™, which is now available as-a-service in either a self-managed or a fully-managed way. This allows you to store results from your text analysis in your private triplestore on the cloud, load linked open data (GeoNames, Freebase, and DBpedia), and search and update everything. With GraphDB™ the semantic facts are stored and synchronized with the original documents, saving precious compute resources while also enhancing discoverability. With the fully-managed version of GraphDB™ running in the cloud, you don’t need to worry about hardware provisioning, installation and updates, or maintenance and backups. We take care of that for you.
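
To make the idea of storing text analysis results as semantic facts more concrete, here is a minimal sketch in Python. It is not the S4 or GraphDB™ API: the `ToyTripleStore` class, the annotation dictionaries and the `mentions` and `type` predicates are all hypothetical names chosen for illustration. It only shows the general pattern of turning extracted entities into subject-predicate-object facts and then searching them by pattern.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class ToyTripleStore:
    """In-memory stand-in for a triplestore (hypothetical, not the GraphDB API)."""

    def __init__(self) -> None:
        self._triples = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add(Triple(subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return all triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t
            for t in self._triples
            if subject in (None, t.subject)
            and predicate in (None, t.predicate)
            and obj in (None, t.obj)
        )


def store_annotations(store: ToyTripleStore, annotations: List[dict]) -> None:
    """Record which document mentions which entity, and each entity's type."""
    for a in annotations:
        entity = "entity:" + a["text"].replace(" ", "_")
        store.add(a["doc"], "mentions", entity)
        store.add(entity, "type", a["type"])


# Hypothetical annotations, shaped like what a text mining step might produce.
annotations = [
    {"doc": "doc:article-1", "text": "Lionel Messi", "type": "Person"},
    {"doc": "doc:article-1", "text": "Germany", "type": "Location"},
]

store = ToyTripleStore()
store_annotations(store, annotations)

# Which entities are people, and which documents mention a given entity?
people = [t.subject for t in store.match(predicate="type", obj="Person")]
docs = [t.subject for t in store.match(predicate="mentions", obj="entity:Lionel_Messi")]
print(people, docs)
```

A real triplestore would use RDF identifiers and a query language such as SPARQL instead of this pattern matcher, but the shape of the data is the same: facts extracted from text sit next to a pointer to the document they came from.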


Why do the techies like it?    Here are a few of the reasons:

  • S4 is based on enterprise grade technology by Ontotext including the leading RDF triplestore (GraphDB™), Semantic Biomedical Tagger tuned for high performance text analysis over biomedical texts, the Semantic Publishing platform which is tuned for content enrichment of news sources and FactForge, a knowledge base providing efficient mechanisms for querying billions of facts from key open data sources.
  • The Ontotext technology has been successfully applied in various domains such as media & publishing, life sciences, cultural heritage, and compliance & document management. With S4 you can instantly get access to advanced semantic analysis and database management capabilities, which can be applied in any domain. Text mining in the cloud has never been easier or more affordable.


Where’s the value?    For starters, this service is always on and managed by us. You don’t pay enterprise prices. You pay as you go, on demand. In other words, all the power at a fraction of the cost. On top of that, S4 provides a free tier for all services. Economic drivers aside, users get intelligence to enhance the productivity of their business. Now you can compete using “text analysis!”

What does this look like?    Let’s start with a fun example and then one that is more complex. The article below was written the morning before the World Cup was played. I wanted to find out who the sports news pundits were predicting would win the World Cup and why. I searched for a news article and found one from ESPN. As it appeared on the web, it looked like this:


  [Image: S4-News-0]

There was a bit more to the article than this. Once I submitted it through the S4 News Analysis Services, the results looked like this:
S4 has identified people, places, organizations and relationships. Through this analysis, it determined that one keyword is “defense” and highlighted it. As a reader, I focused on this and quickly found out that the left side of Germany’s defense is vulnerable. Will Lionel Messi exploit this side and win the World Cup? We shall see. In any event, you get the idea.

Now, let’s give S4 a harder example: a biomedical article on the popular drug “atorvastatin”, which is used to help keep the arteries in your heart free of plaque. The article is filled with technical terms. Take a look at the results from S4:
The results include qualitative concepts, quantitative concepts, social behavior, references to steroids, references to therapeutic or preventive procedures and much more. The meaning in this document is interpreted by S4, taking a lot of the labor out of reading and analysis. Most importantly, the facts analyzed can be stored inside of GraphDB™ and surfaced later when users search for very specific topics. In essence, this provides microscopic analysis of your text that analysts can quickly scan to find exactly what they are looking for.

How can you try S4 for free?    Go to the S4 website, try out the various services, register for an account and start using the S4 services within the free tier. You will also find documents and a link to sign up for the services, which is quick and easy. Developers interested in building applications that analyze text and create practical, useful search applications using an on-demand, cost effective service should experiment with S4. If you need to satisfy the needs of the business and want to implement a ground breaking semantic technology in the cloud, S4 is the right set of services, which include text mining in the cloud and so much more.

Milena Yankova

Director Global Marketing at Ontotext
A bright lady with a PhD in Computer Science, Milena's path started in the role of a developer, passed through project and quickly led her to product management. For her a constant source of miracles is how technology supports and alters our behaviour, engagement and social connections.