How BERT is Changing the Chatbot Landscape

Last year, I wrote a post about conversational AI wondering if it could ever shed its dunce cap. I am happy to report that major strides are being made and NLP is on the cusp of huge change.


NLP’s Evolution from Dumb to Smart

First, to understand why things are changing so fast, we need a quick review of NLP's history.

Before the 1980s, most NLP systems were rules-based, grounded in the work of Noam Chomsky, who believed that the rules of grammar (transformational-generative grammar) could be used to understand semantic relations and thus lead machines to an understanding of speech. In the late 1980s, however, machine learning algorithms became increasingly popular and the shift from rules to statistical models began.

The next big NLP leap took place in 2013 with the introduction of word embeddings such as Word2vec, GloVe, and FastText. Word embeddings attempt to encapsulate the "meaning" of a word in a vector after reading massive amounts of text and analyzing how each word appears in various contexts across a dataset. The idea is that words with similar meanings will have similar vectors. The biggest drawback of these first-generation word embeddings was that each word had only one vector, when it can in fact have multiple meanings (for example, Mercury is a planet, a metal, a car, or a Roman god). This limitation stems from the fact that early word embedding models train with a small neural network (shallow training) for efficiency reasons. However, with Google's release of BERT, we are indeed at an inflection point.

What Makes BERT So Amazing?

Three things:

1. BERT is a contextual model, which means that word embeddings are generated based on the context of the word's use in a sentence, and thus a single word can have multiple embeddings. For example, BERT would produce different embeddings for Mercury in the following two sentences: "Mercury is visible in the night sky" and "Mercury is often confused with Hermes, the fleet-footed messenger of Greek gods."

2. BERT enables transfer learning. This has been called "NLP's ImageNet moment." Google has pre-trained BERT on Wikipedia, and this pre-trained model can now be applied to other, more specific datasets, such as a customer support bot for your company. Remember, pre-training is expensive, and it is a step you can now skip. So your starting point is a smart model (trained on general human language), not just an algorithm in need of training.

3. BERT can be fine-tuned cheaply and quickly on a small set of domain-specific data, and it will yield more accurate results than training on the same domain-specific datasets from scratch.
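The contextual-embedding idea in point 1 can be sketched numerically. This is a toy illustration only: the 3-d vectors are hand-made and the "context mixing" is a simple average, whereas real BERT computes it with many layers of attention.

```python
import numpy as np

# Hand-made 3-d vectors, NOT real BERT weights.
# A static model (Word2vec-style) keeps exactly one vector per word.
base = {
    "mercury": np.array([0.2, 0.9, 0.1]),
    "planet":  np.array([0.8, 0.1, 0.0]),
    "god":     np.array([0.1, 0.2, 0.9]),
}

def contextual_embed(words):
    """BERT-style idea in miniature: blend each word's base vector with
    the sentence average, so the same word comes out different in
    different sentences."""
    context = np.mean([base[w] for w in words], axis=0)
    return {w: 0.5 * base[w] + 0.5 * context for w in words}

astro = contextual_embed(["mercury", "planet"])["mercury"]
myth = contextual_embed(["mercury", "god"])["mercury"]
print(np.allclose(astro, myth))  # False: one word, two embeddings
```

A static model would return the same `base["mercury"]` vector in both sentences; the contextual function does not, which is the whole point.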

Holy smokes!

BERT – Coming to an Application Near You

What's even more exciting is that, while many of these changes in AI and machine learning are happening behind the scenes, much of this next-generation NLP is already being used in consumer products that you and I use every day.

If you use Gmail you know exactly what I am talking about.

  • Suggested replies to emails – BERT.

  • Suggestions for the next word in the sentence – BERT.

I love these capabilities in Gmail, and BERT is now being utilized in many conversational AI applications. So, your chatbot should be getting smarter.

Data is Still King

Note that two critical elements enabled Google to build BERT. The first is Google's trove of data and its ability to continuously refine BERT. Let's go back to the Gmail example of auto-suggesting the next word. Every time you accept a suggestion and use that word, you are training the model. Every time you keep typing and use a different word from the suggestion, you are training the model. If using Gmail makes Google the smartest BERT practitioner on the planet, how can the little guy/gal ever catch up?

Moore’s Law is Alive and Well

The second key element enabling advances such as BERT is the continuing increase in the speed and capability of computers, especially NVIDIA's GPUs and Google's TPUs. Remember, early word embedding models had to be highly efficient due to the state and cost of computing. BERT is far less efficient, but computing power has more than caught up. In fact, NVIDIA has just announced that it is supporting BERT and now claims that its AI platform has the fastest BERT training capabilities available. NVIDIA also claims to achieve very fast predictions (responses), which are needed in real-time chat applications, and it has created the Inception Program to help conversational AI startups.

In closing

BERT, and models like it, are game-changers in NLP. Computers can better understand speech and respond intelligently in real-time. If NLP’s dunce cap hasn’t been fully swept away, it will be soon.

What are your thoughts on the state of NLP and BERT?  Drop us an email.

Kicking off 2019 with a Bit of Bot Humor

First, I would like to thank our customers, partners and associates for making 2018 a great year for Informatics4AI.

Throughout 2018 we discussed how to make bots smarter with natural language processing (NLP), so for a change of pace we thought we’d kick-off 2019 on the lighter side with some bot humor.


First up, I’ll share a spoof on Bots that I found hilarious.

Keaton Patti claims to have made a bot watch 1,000 hours of Christmas movies and then write its own script. The script has been edited to protect trademarks.


Please note, this is humor and was most assuredly NOT WRITTEN BY A BOT. After all, how does a bot watch a movie? (Answer: two ways. Either a) use a voice-to-text API, or b) use the closed-caption text; either output can be fed to the bot.)

Last year we discussed some actual bot humor and here is some funny stuff written by a real bot.

Janelle C. Shane got the following recipe when she trained a neural network on a database of about 30,000 recipes and then asked the machine to produce a new one:


2 pkg hershey’s can be prepared in unpeeled

1  smaller

½ cup yellow onions you may

1 cup egg; chilled, coursely chopped

½ lb bacon, chopped

1 ½ cup sugar, grated

4 oz square oil

Halve the finely chopped fresh garlic salt and pepper. Break the meat into the pineapples and pat them, scraping the room off the skillet. Add ghees and beer and bring to a boil; cover and simmer, uncovered, on High for 20 to 30 minutes or until the onion thickens.

Bots can definitely be funny, but usually not on purpose. In another experiment, Janelle C. Shane trained a bot on a dataset of jokes. Here are a few jokes the bot generated:

What did the new ants say after a dog?

It was a pirate.


 Why did the monsters change a lightbulb?

And a cow the cough.


What do you call a pastor cross the road?

He take the chicken.


Why was six afraid of seven?

Because he doesn’t have a birthday?


These are hilariously absurd, though the humor is unintentional. Humor and jokes teach us the same lesson about bots that other examples do: unfortunately, bots are not yet ready to replace humans in understanding full text or responding to conversations. But NLP can help a lot!

Here's to a great 2019. Cheers.


Can Conversational AI Shed its Dunce Cap?

Conversational AI is everywhere (e.g. Alexa, Siri, and Google Assistant, as well as thousands of lesser-known chatbots). However, as discussed in the last post, these systems currently function well only with simple tasks, because the bots cannot understand normal everyday conversation the way a human assistant does. As a result, we've largely given up using them for complex tasks.

Current chatbots are not very smart


At this year's Google I/O conference, Google demonstrated that we are on the cusp of change: using several new techniques, AI is now capable of understanding complex conversations and responding in natural and intelligent ways.

The most popular technique for helping machines understand conversational speech is word embeddings, which attempt to encapsulate the "meaning" of a word in a vector after reading massive amounts of text and analyzing how each word appears in various contexts across a dataset. The idea is that words with similar meanings will have similar vectors. Word2vec and GloVe are currently the most popular word embedding algorithms. But as Sebastian Ruder, Research Scientist at AYLIEN, notes, "learning word vectors is like {an image recognition system} only learning edges." It works where the problem is simple or straightforward, but not when things are complex.
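The "similar meaning, similar vector" idea is usually measured with cosine similarity. Here is a minimal sketch using hand-made 3-d vectors purely for illustration; real Word2vec or GloVe vectors have hundreds of dimensions learned from large corpora.

```python
import numpy as np

# Illustrative hand-made vectors, not real Word2vec/GloVe output.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.7, 0.2]),
    "apple": np.array([0.1, 0.2, 0.9]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0 means unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors["king"], vectors["queen"]))  # high: related words
print(cosine(vectors["king"], vectors["apple"]))  # low: unrelated words
```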

Let’s look at several new techniques that attempt to move beyond Word2Vec’s shallow approach and embed text with meaning in a richer way.

1) Narrow the Scope and Train Intensively

While this technique works, it is more of an “idiot savant” approach, where the bot will be able to converse across a narrow domain quite well but will be really dumb about everything else. This is okay in some situations, but when using this approach, it is especially important that the users know the chatbot is a computer, so that when the bot says something silly the user knows why.

This was a core technique used by Google in its I/O conference demo, when Google Assistant booked an appointment at a hair salon, and then made a restaurant reservation. As Google explained, the training was intensive and narrow in scope. But what would have happened if the human had decided to make small talk and asked, “How about them Red Sox?” Google noted that Google Assistant was not ready to “carry out general conversations,” so the response would probably have been hilarious or embarrassing.

2) Next Generation Word Embeddings

A paradigm shift is occurring within word embeddings, driven by new techniques such as ELMo, ULMFiT, and the OpenAI transformer. As Sebastian Ruder puts it, "if learning word vectors {e.g. Word2vec} is like only learning edges, these approaches are like learning the full hierarchy of features, from edges to shapes to high-level semantic concepts." In essence, these new techniques build a much richer semantic representation of words and sentences and thus enable the bot to understand words in a deeper way.

Possibly even more exciting is the idea that with these newer systems we may be able to build transferable pre-trained universal word/sentence embeddings that we can use with virtually any bot and achieve excellent comprehension and results, which sounds a lot like human intelligence!

3) Use an Ontology and Sentiment Detection to Label the Text for Meaning

While word embeddings are one way to embed a text dataset with meaning, data labeling is the tried and true method used in other AI domains such as image recognition. (See our white paper "Data Labeling Full-Text Datasets for AI Predictive Lift" for a more comprehensive treatment of this topic.) The problem with data labeling is that in the past it has been done by humans, which is very expensive. But automated data labeling for text is now a possibility, using an entity ontology and sentiment detection.

An entity ontology is like a dictionary and a thesaurus; its job is to define the meaning of words by: a) encoding commonalities between concepts in a specific domain (e.g. both "yellow fever" and "malaria" are "diseases spread by mosquitoes"), and b) encoding how words relate to concepts when the relationship varies with context (e.g. that Mercury is sometimes a "metal," sometimes a "planet" and sometimes a "Greek god"). Entity ontologies can be created and used to label a dataset with meaning at great cost using humans. But now these tasks can be fully automated: high-quality ontologies can be generated using NLP and AI techniques, further edited by domain experts ("human-in-the-loop"), and then used to label datasets in bulk or in real time (e.g. streaming).

Understanding text also requires a nuanced, fine-grained understanding of sentiment. Document-level or even sentence-level sentiment is essentially useless for AI. For example: "My neighbor's garden is awesome, the vegetables are really fresh, but they also attract deer, which is how I got Lyme disease." The bot needs to see the first part of the sentence as positive (the fresh vegetables produced by my neighbor's garden are excellent) and the second part as negative (getting Lyme disease because of my neighbor's garden is awful), rather than scoring the whole sentence as neutral (half good plus half bad).

Labeling datasets with ontologies and sentiment often results in a better chatbot than using word embeddings alone, as the ontology and sentiment detection capture additional meaning, allowing the bot to achieve a more human-like understanding of the text.

Losing the Dunce Cap in 2019?

While I cannot be sure these newer techniques will make bots super-smart next month or next year, we do know that they are making conversational AI systems smarter all the time. If you are using these new techniques, we’d love to hear about how it’s working. Or if you want help moving your bot to the head of the class – give us a call.

Why Conversational AI's Great Expectations Met Dumb and Dumber

Conversational artificial intelligence is surrounded by a lot of hype and promise. The seeds of these great expectations were sown in the 1960s by Star Trek (1966) and 2001: A Space Odyssey (1968). We all want to ask the computer to help us with almost any imaginable task, just like they do on TV and in the movies. Think back to the first time you talked to Alexa or Siri or Google Assistant. You were hoping for HAL-like human conversation, and maybe a little afraid of falling in love, as Theodore Twombly (Joaquin Phoenix) does in "Her." But then reality smacked you in the face (cue the sound of a car crash).

Conversational AI’s roller coaster ride


As the Nielsen Norman Group aptly points out, "people are learning that 'intelligent' assistants are actually not that smart," and that "{people} simply avoid usability agony by limiting their use to a subset of simple features." Yup, I think that sums it up pretty well. Siri, Alexa and Google Assistant are pretty good at simple tasks, but falling in love? Not gonna happen.

So why is reality so far from our expectations? To be blunt (as with human relationships that just did not work out), it's because they just cannot understand us. Why is it that IBM's Watson can defeat top Jeopardy! players, yet for the clue "What do grasshoppers eat?" answered "Kosher"? And as Gregory Barber of Wired points out, while we've made great progress in AI image recognition, "understanding language…has proved elusive."

But this is beginning to change, as we saw at this year’s Google I/O conference, where Google demonstrated a flawless Google Assistant booking an appointment at a hair salon, and then making a restaurant reservation. The humans on the other end of the call had no idea they were talking to a chatbot. But, and there is a big but, Google explained that this was only possible because the scope of the problem was narrow and the training was intensive. Google noted that Google Assistant was not ready to “carry out general conversations.”

So, through daily experience, people have had their expectations surrounding virtual assistants reset. We've learned to converse only around simple tasks. I call this hitting rock bottom. The good news is that, just like when a roller coaster hits bottom, we are now starting the long-expected ride up to the top, where the fun can really begin.

In the next post, we will take a look at how cutting edge technologies are enabling machines to understand us better and why the hype around virtual assistants will eventually meet our expectations.

Conversational Systems / Chatbots: The Best Ways to Achieve Success

Or “the times they are a-changin….” (Bob Dylan)

Conversational systems (e.g. chatbots) divide into two camps:

  • Rules-based chatbots – the old guard, which makes minor use of AI (they currently dominate the market)
  • Intelligent chatbots – the emerging new guard, which makes intensive use of AI and strives for human-like intelligence


Lorie Shaull (Wikimedia)

Let’s examine how to achieve success in each camp.

Rules-based Chatbots

Rules-based chatbots are the reigning world champs of conversational systems and are the Rocky Balboas of the industry. They achieve their capabilities through lots of hard work, time and effort. They have no actual smarts but mimic intelligence via rules and programming. Some of these systems can function at a very acceptable level, but they are rarely confused with a person. In fact, informing the human upfront that they are dealing with a robot is the generally accepted practice because it’s a real embarrassment to trick someone at the beginning of the conversation and then to have the robot fall flat on its face when asked an unexpected question.

The weakness of a rules-based chatbot is ultimately its greatest strength. If you are not trying to be a human, then it’s okay to admit it and focus on what you do well.

  Achieving success with rules-based conversational systems:

a) Use the 80/20 rule to focus the chatbot. Understand your call volume and focus the chatbot on your high-volume, simple requests. If you can off-load 10% to 30% of total call volume by answering the repetitive, low-value questions, it's a huge win.

b) Admit defeat and keep the humans happy. Do not worry about handling the tough problems. People do this well and chatbots (especially rules-based ones) do this poorly. So as soon as your confidence level on a response drops below a reasonably high level – route the person to a service rep. This makes them happy and keeps the chatbot from getting embarrassed.
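The hand-off rule in (b) can be a one-line check. This sketch is purely illustrative: the 0.8 threshold and the intent names are invented, and a real system would tune the cutoff against its own confidence scores.

```python
# Route to a human as soon as the bot's confidence drops below a
# threshold. The 0.8 cutoff and intents are hypothetical examples.
CONFIDENCE_THRESHOLD = 0.8

def route(intent, confidence):
    """Return who should handle this request: the bot or a human rep."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return f"bot answers: {intent}"
    return "escalate to human service rep"

print(route("recycling_schedule", 0.95))  # bot handles the easy FAQ
print(route("zoning_dispute", 0.40))      # human takes the hard case
```

The payoff of this simple rule is exactly what the post describes: the human stays happy, and the chatbot never gets embarrassed answering a question it doesn't understand.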

A colleague of mine just built an Alexa skill for a municipality. The chatbot was focused on the frequently asked, but easy to answer questions (such as “What is the recycling schedule?”). It will be a huge win for both residents and the town.

Intelligent Chatbots

We are now on the verge of a sea change in conversational systems, from rules to understanding. Think of this as the Dick Fosbury of chatbots: a whole new approach to conversational systems. Rather than mimic understanding with rules, intelligent chatbots attempt to use AI and machine learning to understand a domain deeply enough to handle questions and provide high-quality responses without rules and programming.

An early example of this shift was on display at this year’s Google I/O conference. At the conference, Google CEO Sundar Pichai demonstrated Google Assistant running Duplex, Google’s experimental AI voice system. The demonstration consisted of two parts: first, Google Assistant booked an appointment at a hair salon, and then made a restaurant reservation. The demonstration was flawless and the humans on the other end of the call were clearly fooled and had no idea they were talking to a chatbot.

In their AI blog, Google explained that “one of the key research insights was to constrain Duplex to closed domains, which are narrow enough to explore extensively. Duplex can only carry out natural conversations after being deeply trained in such domains. It cannot carry out general conversations.”

Informatics4AI is working with a number of customers focused on intelligent conversational systems, and we fully agree with Google’s engineers.

Achieving success with intelligent conversational systems:

  • Focus the chatbot and constrain the domain. Do not attempt to train your chatbot over an open/expansive domain. Focus the system on a specific topic and you may be able to get the intelligence you are seeking. For some customers with open domains, we are experimenting with the creation of subdomains to enable learning on focused topics and a triage-bot to help direct humans to the appropriate subtopic.
  • Better data makes a better model. You will need a high-quality dataset for training the chatbot. Labeling your full-text for meaning will assist the required deep training and you will achieve better results. (See this post for more information of labeling full-text for meaning.)

Are you thinking about a truly intelligent bot? Let us know your experiences as well as your successes and failures. We also welcome your questions.

Secrets to Achieving Predictive Lift When Using AI on Full-Text

As stated in the previous post in this series “Extraction of meaning — or more specifically, semantic relations between words in free text — is a complex task.” Complex enough that machines can find full-text hard to understand and therefore building good models is difficult. Standard NLP techniques (such as tokenization, stemming, parts-of-speech tagging, parsing, named entity recognition, etc.) can improve the model but often prove inadequate.

The following advanced techniques go well beyond standard NLP. They are examples of the steps that can be applied to unstructured full-text datasets to add structure and meaning, producing better models and significantly increasing predictive lift. This post focuses on three examples:

  • Labeling nouns and noun phrases for meaning
  • Extracting sentiment (most often) from adverbs and adjectives
  • Extracting intent from verbs

These techniques are like giving the machine a compass or GPS: they provide a way to navigate and understand the text.


1) Labeling nouns and noun phrases for meaning

Much of the meaning in text is stored in nouns and noun phrases. Unfortunately, machines don’t know that dogs and cats are animals often living with humans as pets. However, the machine can learn these types of things via the creation of a semantic ontology of entities.

The purpose of the semantic ontology is to: a) encode commonalities between concepts in a specific domain (e.g. both "ticks" and "mosquitoes" are "disease spreading insects"), and b) encode that certain words relate to different concepts depending upon the context (e.g. that "mercury" is sometimes a "metal," sometimes a "planet" and sometimes a "Greek god"). The entity ontology is a semantically unambiguous dictionary that enables the machine to learn much faster and more accurately than simply processing the ambiguous raw text.
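A minimal sketch of such an ontology, using the post's own "mercury" example. The concept names and context cues here are hand-coded purely for illustration; the post's point is that a real system would generate them automatically.

```python
# Toy entity ontology: each surface term maps to one or more concepts,
# and context words in the sentence pick the right sense.
# All terms, concepts, and cue words are illustrative, not a real ontology.
ontology = {
    "ticks":      [{"concept": "disease spreading insects"}],
    "mosquitoes": [{"concept": "disease spreading insects"}],
    "mercury": [
        {"concept": "metal",     "cues": {"toxic", "element", "thermometer"}},
        {"concept": "planet",    "cues": {"orbit", "sky", "sun"}},
        {"concept": "greek god", "cues": {"hermes", "messenger", "myth"}},
    ],
}

def label(term, sentence_words):
    """Return the concept for a term, disambiguated by context words."""
    senses = ontology.get(term, [])
    for sense in senses:
        if sense.get("cues", set()) & sentence_words:
            return sense["concept"]
    # No cue matched (or the term is unambiguous): fall back to first sense.
    return senses[0]["concept"] if senses else None

print(label("mercury", {"visible", "in", "the", "night", "sky"}))  # planet
print(label("ticks", {"bite"}))  # disease spreading insects
```

Labeling "mercury" as "planet" rather than leaving the raw ambiguous token is exactly the unambiguous-dictionary effect described above.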

For example, take full-text medical records involving a cardiologist, Dr. Smith. Across thousands of medical records, Dr. Smith frequently appears in records about heart attacks. When processing raw medical records, the AI model will often "overfit" and may conclude that Dr. Smith is a cause of heart attacks, rather than the attending doctor. The entity ontology can prevent this and help the machine understand the difference between doctors and the factors that can cause heart attacks.

A key problem with building ontologies is that to date most have been created by humans, which is both costly and time-consuming. This is no longer true. Advanced NLP systems can now create a semantic entity ontology in an automated way, simplifying, speeding and reducing the cost of this critical step.

A word of caution: some systems/vendors create what is known as an orthogonal ontology, in which nouns and noun phrases are placed in only one concept. While this may be acceptable in some applications, in others it may be highly problematic. Please review the type of ontology being created and how it will be used before you invest heavily in its creation.

2) Extracting sentiment (most often) from adverbs and adjectives

Sentiment analysis is the process of determining whether a piece of writing is positive, negative or neutral. Standard NLP systems claim to enable sentiment detection, but most fall far short of the kind needed to truly help machines learn in all but the most straightforward situations. When using sentiment analysis for AI, document-level sentiment is essentially useless; AI requires sentence-level and often phrase-level sentiment. For example: "The x-ray revealed good news regarding tumor reduction, but also, unfortunately, revealed advanced pneumonia." For AI, we want to see the first part of the sentence as positive and the second part as negative, and we want to identify that "advanced" is an intensifier and therefore really negative. Lastly, we do not want the sentence classified as neutral (half good plus half bad); we want to see it as two distinct thoughts, one positive and one negative.
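The clause-splitting idea can be sketched with a tiny hand-made lexicon and intensifier list, both invented for illustration; production systems use far richer resources and real parsing rather than splitting on "but".

```python
import re

# Toy phrase-level sentiment. Lexicon scores and intensifiers are
# hypothetical, chosen only to cover the x-ray example sentence.
LEXICON = {"good": 1, "pneumonia": -1}
INTENSIFIERS = {"advanced", "severe"}

def score_clause(clause):
    """Score one clause: +/- per lexicon word, doubled after an intensifier."""
    words = re.findall(r"[a-z]+", clause.lower())
    score, boost = 0, 1
    for w in words:
        if w in INTENSIFIERS:
            boost = 2                     # next sentiment word counts double
        elif w in LEXICON:
            score += boost * LEXICON[w]
            boost = 1
    return score

sentence = ("The x-ray revealed good news regarding tumor reduction, "
            "but also, unfortunately, revealed advanced pneumonia.")
clauses = sentence.split(", but ")
print([score_clause(c) for c in clauses])  # [1, -2]: positive, then very negative
```

Note how "advanced" doubles the negativity of "pneumonia", and how the two clauses keep separate scores instead of cancelling out to neutral.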

Properly detecting sentiment is tricky, but it can change the entire meaning of a sentence and therefore it is often critical in AI applications. For example, a sentence stating “we saw evidence of…” and “we saw scant evidence of” are nearly identical. Yet the entire meaning of the sentence hinges on this one word – “scant.” Note, “scant” would not be labeled by a semantic entity ontology, as scant is an adjective. So while nouns contain most of the meaning, advanced sentiment analysis, e.g. how the nouns (or verbs) are modified, can be critical to the machine truly understanding the full text. Armed with superior sentiment analysis, data scientists can vastly improve the accuracy of their model.

3) Extracting intent from verbs

Datasets are usually about a specific domain (e.g. medical, financial, tech support, e-commerce, etc.). Most domains have a unique and specific set of actions or intents; as Townsend and Bever (2001) remarked, in everyday life, "most of the time what we do is what we do most of the time." For example, in the weather domain, the number of user intents is quite low: "I want to know the current weather" is by far the most common, so training a machine to understand a phrase such as "What is the weather in London?" is easy. In other domains, the number of intents can be quite large and the NLP must be trained more carefully. For example, in e-commerce the intent "a return" might be phrased many ways, e.g. "take back," "want my money back," "don't want," "don't need," etc.
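A minimal sketch of collapsing many surface phrasings onto one canonical intent. The phrase lists are hand-coded here for illustration, whereas the point below is that a real system should learn these mappings from labeled examples rather than rely on hand-built rules.

```python
# Hypothetical intent-to-phrase mapping; in practice these would be
# learned from labeled utterances, not hand-coded.
INTENT_PHRASES = {
    "return_item": ["take back", "want my money back",
                    "don't want", "don't need"],
    "check_weather": ["what is the weather", "how hot is it"],
}

def detect_intent(utterance):
    """Return the first canonical intent whose trigger phrase appears."""
    text = utterance.lower()
    for intent, phrases in INTENT_PHRASES.items():
        if any(p in text for p in phrases):
            return intent
    return "unknown"

print(detect_intent("I don't want this sweater, take it back"))  # return_item
print(detect_intent("What is the weather in London?"))           # check_weather
```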

The following capabilities are important when evaluating intent detection:

  • Can you train your NLP to understand a domain-specific set of intents?  (By this I mean, not just build a hand-coded set of intents, but build an ML assisted understanding of intents.)
  • Can you build rules with whitelisted or blacklisted words to supplement the learned intents?
  • Does your system have a way to disambiguate intents between the intender and the intendee?

Standard NLP libraries cannot do the above. However, if your NLP system has all of these capabilities, you will be able to build a robust understanding of user intents and thus build a better model.

4) Putting it all together

Let's take a look at a few complex sentences.

“The image showed a reduction in the size of the mass. This news came as a devastating blow, as it was understood by the patient that she was in total remission.”

In summary, with the proper pre-processing and NLP feature extraction, the model will be able to understand that:

  • The tumor is getting smaller, which is diagnostically very good news.

Rather than possibly getting tricked into believing that:

  • The patient is in total remission, and the sentiment is neutral.

Let's examine a few more details:

  • An entity ontology will help the machine understand that "mass" is the same as a tumor, even if the word "mass" is rarely used in the dataset.
  • Sentiment analysis would help the machine understand that we have marginally good medical news (about tumor size) and that the patient is upset.

Advanced NLP techniques such as these are critical to creating better data, from which machines can build better models. Is your model finding full-text hard to understand? Give us a call, we have tools and techniques that can help.

Why Do Machines (AI) Find Full-Text Hard to Understand?

In this multi-post topic, we examine the problem and reveal the secrets to successfully training AI models on full-text datasets. First, let's understand how hard this is, and why.


The following statement by Indrek Vainu, CEO of AlphaBlues, an enterprise chatbot company, summarizes the current situation: "Extraction of meaning — or more specifically, semantic relations between words in free text — is a complex task. The complexity is mostly due to the rich web of relations between the conceptual entities the words represent." He goes on to say that machine learning is "largely clueless when fed unstructured data, such as free text."

IMImobile, another chatbot company states that “Machine learning is a powerful technology and promises an exciting future where machines can come to understand our needs and our intent, perhaps better than we do ourselves. However, at this moment in time we only recommend machine learning for scenarios where there is little scope for ambiguity, and where vectorisation (converting non-numeric input to numeric inputs) is straightforward.” 

A recent customer engagement at Informatics4AI supports these statements. Our customer was working with a dataset composed of unstructured doctor's notes. They found their machine learning efforts created a model that was highly effective for straightforward diagnostic situations (e.g. a patient passing a common screening test). But when fed notes relating to complex tests and multiple patient conditions, the model did not produce predictions with the accuracy they needed.

As an illustration of the difficulty that AI has with full text (and for a bit of fun) let's take a look at the results that Janelle C Shane got when she trained a neural network on a database of about 30,000 recipes and then asked the machine to produce a new recipe: 


2 pkg hershey’s can be prepared in unpeeled

1 smaller

½ cup yellow onions you may

1 cup egg; chilled, coursely chopped

½ lb bacon, chopped

1 ½ cup sugar, grated

4 oz square oil

Halve the finely chopped fresh garlic salt and pepper. Break the meat into the pineapples and pat them, scraping the room off the skillet. Add ghees and beer and bring to a boil; cover and simmer, uncovered, on High for 20 to 30 minutes or until the onion thickens.

To be fair and to clarify, this model was built by an AI enthusiast and not an AI professional, but I think it illustrates the issue: the machine has no clue what a recipe is really all about.

However, all is not lost when trying to apply machine learning to full text. The key is adding structure and meaning to the raw data, and by doing so enabling the machine to understand the text and thus begin to learn. We will review these techniques in the next blog post.

Insight Engines (Next Generation Search Engines using AI)

This is part three of a three-part blog post. (See part-one and part-two here.)

“Insight Engine” is a new term popularized by research firms such as Gartner and Forrester, and represents the next step in information discovery. The general idea of an Insight engine is to dramatically improve user access to relevant content and to make access to that content frictionless. While definitions vary, I define these engines as using AI techniques to offer three core capabilities generally lacking in search engines:

  • Increasing user access to all your content no matter where it resides
  • Improving the relevance of search results by using personal relevancy algorithms
  • Enabling next generation push/alerts

These capabilities have the ability to move enterprise search/knowledge management from a state where users spend significant time every day trying to find information to one where information is almost magically at one's disposal. Let's take a closer look.

Access to Content

Insight engines dramatically change the dynamics of accessing content through the use of natural language processing. As we have all experienced, every search engine has its own unique syntax. Sometimes that difference is subtle, such as Google vs Bing. But other times the query syntax is incomprehensible to most users (e.g. native SQL queries) and thus the content becomes inaccessible. Insight engines tackle this problem by:

  1. Allowing users to enter a query using natural language. For example - “How many units did we sell today?”
  2. Parsing the query (using natural language processing) and then reformulating it in the native query syntax of the repository.
  3. Returning the results in a format useful to the user (e.g. a spreadsheet).
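As a toy illustration of those three steps, here is a sketch with a hypothetical `sales` table and a deliberately trivial rule-based parser standing in for real natural language processing:

```python
import re
import sqlite3
from datetime import date

def parse_query(question: str) -> str:
    """Step 2: map a natural-language question to native SQL (toy rule-based parser)."""
    q = question.lower()
    if re.search(r"how many units.*sell.*today", q):
        return "SELECT SUM(units) FROM sales WHERE sale_date = ?"
    raise ValueError("query not understood")

# Tiny in-memory repository standing in for the enterprise database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, units INTEGER)")
today = date.today().isoformat()
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(today, 12), (today, 30), ("2001-01-01", 99)])

# Step 1: the user's natural-language query; Step 3: return a usable result.
sql = parse_query("How many units did we sell today?")
(total,) = conn.execute(sql, (today,)).fetchone()
print(total)  # 42
```

A real Insight engine would use an NLP model rather than regular expressions, but the pipeline shape (parse, reformulate, execute, return) is the same.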

With this type of power, users can query any repository for any data, giving them a true 360-degree view of the available content and thus enabling better decision making.

In the knowledge management world, this is a home run.

Improving Relevance

As we discussed in the previous post, context can be a game changer in relevancy as demonstrated by Google’s use of PageRank to dramatically improve the relevancy of internet search. Unfortunately, PageRank does not apply to enterprise search, and so to date, enterprise search engines have largely ignored context. Insight engines change that by using AI to analyze user behavior (e.g. previous queries, previous downloads, time spent on various articles, etc.) and then incorporating this user-specific context when computing search relevance. The results can be stunning, as each user now has a personal relevancy algorithm, one that is built on an analysis of data they have consumed in the past. This is similar to when you search for “cameras” on the internet and then later in the day you get personalized camera ads, but on steroids.

For you geeks out there, this is a form of Learning to Rank (LTR) where relevancy is customized by an AI model. The training data is your past search and content consumption behavior and the model seeks to predict what content you will like the best.
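A minimal sketch of that idea, with hand-set weights standing in for a trained LTR model and made-up behavioral features:

```python
# Toy "learning to rank": score documents with a linear model whose weights
# would normally be learned from the user's past clicks and downloads
# (hand-set here for illustration).
features = {
    # doc_id: (text_match_score, times_user_opened_similar_docs, recency)
    "quarterly-report": (0.6, 9, 0.8),
    "press-release":    (0.9, 1, 0.2),
    "old-whitepaper":   (0.7, 0, 0.1),
}
weights = (1.0, 0.2, 0.5)  # learned offline from this user's behavior log

def score(f):
    return sum(w * x for w, x in zip(weights, f))

# Rank documents by the personalized score, best first.
ranked = sorted(features, key=lambda d: score(features[d]), reverse=True)
print(ranked[0])  # quarterly-report
```

Because the weights differ per user, two users issuing the same query can see entirely different rankings.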

In the knowledge management world, this is a second home run.

Next Generation Push / Alerts

Insight engines also tackle another huge issue. A great deal of research documents that knowledge workers spend a significant portion of the day looking for information and often come up empty. As if that isn't bad enough, imagine how much information is missed altogether because we can't spend all day, every day, updating our searches and watching newsfeeds for changes in our industry.

Insight engines seek to significantly reduce the need for users to search by enabling next-generation push/alerts. They use AI to analyze a user's past content consumption and then, in the background, find new relevant content and deliver it to the user. This means users get awesome, up-to-date, targeted content without even asking for it.
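A rough sketch of such a background alerting pass, using simple keyword overlap as a stand-in for a real AI similarity model (the document names and threshold are invented):

```python
# Build a profile from the user's past reading, then flag new documents
# that overlap with it enough to be worth pushing as an alert.
past_reading = ["lithium battery supply chain", "battery cell manufacturing"]
profile = set(" ".join(past_reading).split())

new_documents = {
    "doc-1": "new lithium battery plant announced",
    "doc-2": "quarterly travel expense policy update",
}

def relevance(text):
    """Fraction of the user's profile terms that appear in the document."""
    return len(profile & set(text.split())) / len(profile)

alerts = [d for d, text in new_documents.items() if relevance(text) >= 0.3]
print(alerts)  # ['doc-1']
```

A production system would run this continuously against incoming feeds and use a far richer model of "similar", but the push pattern is the same: the engine searches so the user doesn't have to.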

In the knowledge management world, this is a third home run.

A Bit of Caution

Insight engines are definitely here. But please know that:

  • Not all these capabilities are available from all Insight engine vendors.
  • Not all of these capabilities are as far along as the hype would lead you to believe.

But if you are looking for that next leap forward, you should certainly look at Insight engines from vendors including Coveo, Attivio, Sinequa, and Lucidworks, as well as offerings from the big boys: Microsoft, IBM, and HP.

Have you looked at Insight engines? If so, drop me an email and tell me your story.

Modern Search Engines

This is part two of a three-part blog post. (See part one here.)

As we discussed in the previous post, index engines were developed to make searching across large textual repositories fast. But once high-speed retrieval was achieved, a new problem emerged: users were unable to find the most relevant documents within a large set of search results. The obvious solution was to rank the documents by relevance and present the most relevant results first.

As we saw in the previous post, indexing products often have problems with relevancy, giving birth to modern search engines, which improve relevancy by using two key techniques. Let’s take a look.


Enhancing Relevance with Context

Google is by far the best example of using context to improve relevance. Early internet search engines largely ranked results by counting the number of times the search terms appeared on the page. Google took a new approach by introducing the page's importance (called PageRank) into the ranking of search results. (PageRank is loosely based on how many other websites link to that page.) The inclusion of context in relevance ranking had a dramatic effect, enabling Google to leapfrog competitors like Yahoo and Lycos, largely because users found Google's results so much better.

While PageRank is an awesome piece of context on the web, it does not work inside an organization. So enterprise search engines developed other techniques to improve relevance.

Enhancing Relevance with Tuning

Modern enterprise search engines attempt to address the problem of poor relevance by making the relevance calculation tunable for an organization's specific circumstances. For example, Elasticsearch (a widely used open source search engine based on Lucene) has many options for tuning relevancy. A few examples (all of which were critical to the organization described in the previous post) include:

Commonly Used Adjustments/Techniques

  • Field Boosting – used to boost the relevance of documents when the search term is in a field such as “Title” as opposed to buried on page 24.
  • Time Boosting – used to make more recent items more relevant and therefore is very helpful in applications like news or research.
  • Search Term Frequency Saturation – used to ensure that a large document does not dominate all others just because it contains more search terms.

Specialized Adjustments/Techniques

  • Location Boosting – used to increase the relevance of items that are geographically close to the user.
  • Price Boosting – used to increase or decrease relevancy based on price (often critical for e-commerce applications).
  • Boosting by Popularity – used to increase relevance based on data from another field, such as a popularity rating (as with Google, context is being used to increase relevance).
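For the curious, here is roughly how two of these knobs look in Elasticsearch's query DSL: field boosting via `multi_match` (title weighted 3x) and time boosting via a `gauss` decay function. The field names (`title`, `body`, `published`) are hypothetical; the query body is shown as a Python dict:

```python
import json

# Sketch of an Elasticsearch query body combining two tuning techniques:
# field boosting (a title match counts 3x a body match) and time boosting
# (a gaussian decay that favors documents published within the last year).
query = {
    "query": {
        "function_score": {
            "query": {
                "multi_match": {
                    "query": "battery supply chain",
                    "fields": ["title^3", "body"],  # field boosting
                }
            },
            "functions": [
                # time boosting: score decays with distance from "now"
                {"gauss": {"published": {"origin": "now", "scale": "365d"}}}
            ],
        }
    }
}
print(json.dumps(query, indent=2))
```

This body would be POSTed to the index's `_search` endpoint; every knob in it can be adjusted per application, which is exactly the tunability indexing products lack.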

Modern search engines are much more tunable than indexing engines, and therefore often produce a much better search experience for the user. We recommend that when organizations adopt a search engine, they include relevancy tuning as part of the project to ensure a custom fit for their needs.

However, once the low-hanging fruit has been picked (e.g. boosting the Title field), we strongly recommend against the desire some organizations have to keep tweaking relevance daily. As Elastic notes, "relevancy tuning is a rabbit hole that you can easily fall into and never emerge." We advise organizations to revisit tuning regularly but infrequently, and only when they have the proper instrumentation and monitoring in place to know whether relevance is improving or degrading. Again, according to Elastic, you should monitor relevance by keeping track of items such as "how often your users click the top result, the top 10, and the first page; how often they execute a secondary query without selecting a result first; how often they click a result and immediately go back to the search results, and so forth." With these objective measures in hand, you can clearly understand how relevance tuning is affecting users' search experience.
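As a small sketch of that kind of instrumentation, here is how two of the suggested measures could be computed from a hypothetical search log:

```python
# Compute relevance-health metrics from a (hypothetical) search log, where
# each entry records the rank of the result the user clicked, or None if
# the user abandoned the search without selecting anything.
search_log = [
    {"clicked_rank": 1},     # user clicked the top result
    {"clicked_rank": 4},     # user clicked a lower result
    {"clicked_rank": None},  # user selected nothing (abandonment)
    {"clicked_rank": 1},
]

total = len(search_log)
top_click_rate = sum(1 for s in search_log if s["clicked_rank"] == 1) / total
abandon_rate = sum(1 for s in search_log if s["clicked_rank"] is None) / total
print(top_click_rate, abandon_rate)  # 0.5 0.25
```

Tracked over time, a rising top-click rate and a falling abandonment rate indicate that a tuning change actually helped users rather than hurt them.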

The final post in this three-part series will discuss next generation search and why some organizations are already reaping big benefits from Insight engines.

Indexing to Search to Insight (& Artificial Intelligence)

What a Long Strange Trip it’s Been

We just completed a few assignments for clients seeking to improve the performance of search in key applications within their organizations. These assignments reminded me of the long and winding road search has been on for the past 30 years, together with some valuable lessons we’ve learned along the way.

windy road.jpg

I thought it would be useful to share these lessons in a series of blog posts where we will compare and examine the following:

  • Indexing engines (the first generation of search engines)
  • Modern search engines
  • Insight engines (the next generation search engine using artificial intelligence)


First Generation Search: Indexing

If you can believe it, back in the good old days before the internet (i.e. prior to the release of Netscape Navigator in 1994), full-text search capabilities were not pervasive. Search was available in online information systems (e.g. Dialog), but inside organizations, databases still relied on string matching (e.g. a SQL LIKE statement), which meant word search was very slow, especially when the database was very large.

This slowness gave rise to a series of products that solved the problem by indexing words. Over time these indexing capabilities were embedded in database products, enabling fast keyword retrieval over databases of virtually any size.
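The core trick is the inverted index: a map from each word to the records containing it, so a keyword lookup becomes a dictionary hit instead of the full table scan a SQL LIKE forces. A minimal sketch:

```python
from collections import defaultdict

# Build an inverted index over a tiny record set: each word maps to the
# set of record IDs containing it, so lookups are constant-time regardless
# of how large the repository grows.
records = {
    1: "annual steel industry handbook",
    2: "copper mining quarterly report",
    3: "steel prices and market report",
}

index = defaultdict(set)
for rec_id, text in records.items():
    for word in text.split():
        index[word].add(rec_id)

print(sorted(index["steel"]))   # [1, 3]
print(sorted(index["report"]))  # [2, 3]
```

Real indexing products add tokenization, stemming, and compressed on-disk structures, but this dictionary is the essential idea.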

We recently worked with a customer using an indexing product to maintain a document repository of about 100,000 PDFs. The application had been doing its job for many years, but more recently users and the content administrators (information professionals in the research center) had noticed a few problems. Most importantly, they were frustrated that the relevance ranking of the search results seemed poor.

How is Relevance Determined?

Indexing products (including advanced ones like MS SQL Server full-text search) generally compute relevance using a combination of the following:

  • The frequency of the search terms within a record (how often the search terms occur in a document; more occurrences means more relevant).
  • The ‘density’ of the search terms within a record (e.g. the number of search terms divided by the total number of words in a record).
  • How common the search terms are in the database as a whole (matching on a rare word is more important than matching on a common word).
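A toy version of how these three signals can combine into a single score (the formula here is illustrative, not any product's actual algorithm):

```python
import math

# Two tiny "records": a huge handbook padded with filler, and a short memo.
docs = {
    "handbook": "steel steel steel copper " + "filler " * 996,
    "memo": "steel prices rose",
}

def relevance(term, doc_id):
    words = docs[doc_id].split()
    tf = words.count(term)                        # frequency within the record
    density = tf / len(words)                     # search terms / total words
    df = sum(term in d.split() for d in docs.values())
    idf = math.log(len(docs) / df)                # rare terms weigh more
    return (tf * density) * (idf + 1)

# The density factor keeps the thousand-word handbook from outranking the
# short memo just because it mentions "steel" more times.
print(relevance("steel", "memo") > relevance("steel", "handbook"))  # True
```

Note that when such a formula is baked in and untunable, pathologies like the ones the customer describes below have no remedy.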

This type of relevance ranking algorithm works pretty well, and when combined with Boolean search operators and parametric/fielded search, can yield excellent results. But it is far from perfect.

Problems with Relevance

In fact, the customer cited three issues they were experiencing with relevance:

  • Problem #1 - Big documents (like the thousand-page annual industry handbook) always seemed to be at the top of the search results.
  • Problem #2 - Documents where the search term was in the title field were not deemed more relevant than documents where the search term was on page 24.
  • Problem #3 - More recent documents were not deemed more relevant than much older documents on the same topic.

A general problem with indexing products is that the built-in relevancy algorithm is a take-it-or-leave-it proposition; it cannot be tuned or adjusted. This is a major difference between indexing products and modern search engines.

When we discussed next steps with the client, we quickly moved into a discussion of search engines, which is the subject of the next blog post.