As stated in the previous post in this series, “Extraction of meaning — or more specifically, semantic relations between words in free text — is a complex task.” It is complex enough that machines find full text hard to understand, which makes building good models difficult. Standard NLP techniques (such as tokenization, stemming, part-of-speech tagging, parsing, and named entity recognition) can improve the model but often prove inadequate.
The advanced techniques below go well beyond standard NLP. They are examples of the steps that can be applied to unstructured full-text datasets to add structure and meaning, in order to produce better models and significantly increase predictive lift. This post focuses on three of them:
- Labeling nouns and noun phrases for meaning
- Extracting sentiment (most often) from adverbs and adjectives
- Extracting intent from verbs
Using these techniques is like handing a traveler a compass or GPS: they give the machine a way to navigate and understand the text.
1) Labeling nouns and noun phrases for meaning
Much of the meaning in text is stored in nouns and noun phrases. Unfortunately, machines don’t know that dogs and cats are animals that often live with humans as pets. However, the machine can learn these types of things through a semantic ontology of entities.
The purpose of the semantic ontology is to: a) encode commonalities between concepts in a specific domain (e.g. both “ticks” and “mosquitoes” are “disease vectors”), and b) encode that certain words relate to different concepts depending upon the context (e.g. that “mercury” is sometimes a “metal,” sometimes a “planet” and sometimes a “Roman god”). The entity ontology is a semantically unambiguous dictionary that enables the machine to learn much faster and more accurately than it can by simply processing the ambiguous raw text.
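As a rough illustration of both purposes, the sketch below uses a tiny hand-written ontology in Python. The `ONTOLOGY` entries, the context hint words, and the `label_term` helper are all assumptions made up for this example; they are not the output of any particular NLP system, and a real ontology would be far larger and would use much richer context than simple word overlap.

```python
from typing import Dict, List, Set

# Hypothetical toy ontology: each term maps to one or more concepts,
# and each concept lists context words that hint at that meaning.
ONTOLOGY: Dict[str, Dict[str, Set[str]]] = {
    "tick": {"disease vector": {"bite", "lyme", "skin"}},
    "mosquito": {"disease vector": {"bite", "malaria", "swarm"}},
    "mercury": {
        "metal": {"thermometer", "toxic", "liquid", "exposure"},
        "planet": {"orbit", "sun", "solar", "venus"},
        "Roman god": {"myth", "messenger", "temple", "jupiter"},
    },
}


def label_term(term: str, sentence: str) -> List[str]:
    """Return the concept(s) for `term`, narrowed by context words when possible."""
    concepts = ONTOLOGY.get(term.lower(), {})
    words = {w.strip(".,;:!?\"'").lower() for w in sentence.split()}
    # Score each candidate concept by how many of its hint words appear.
    scores = {concept: len(hints & words) for concept, hints in concepts.items()}
    best = max(scores.values(), default=0)
    if best == 0:
        # No contextual evidence: keep every candidate rather than guess.
        return sorted(concepts)
    return sorted(c for c, s in scores.items() if s == best)


print(label_term("mercury", "Mercury completes an orbit around the sun quickly."))
print(label_term("tick", "The tick was found after the hike."))
```

Here “tick” and “mosquito” share a concept, while “mercury” keeps all three candidate concepts until the surrounding words point to one of them.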
For example, take full-text medical records involving a cardiologist, Dr. Smith. Across thousands of medical records, Dr. Smith's name frequently occurs in records about heart attacks. When processing the raw records, the AI model will often “overfit” and may conclude that Dr. Smith is a cause of heart attacks, rather than the attending doctor. The entity ontology can prevent this and help the machine understand the difference between doctors and the factors that can cause heart attacks.
A key problem with building ontologies is that to date most have been created by humans, which is both costly and time-consuming. That no longer has to be the case. Advanced NLP systems can now create a semantic entity ontology in an automated way, which simplifies and speeds up this critical step and reduces its cost.
A word of caution: some systems/vendors create what is known as an orthogonal ontology, in which each noun or noun phrase is placed in only one concept. While this may be acceptable in some applications, in others it may be highly problematic, because an ambiguous term such as “mercury” can then carry only one of its meanings. Please review the type of ontology being created and how it will be used before you invest heavily in its creation.
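One way to picture how an ontology guards against this kind of overfitting is to replace each known entity with its concept label before the text reaches the model, so the model sees a generic physician rather than one specific name. The sketch below is only an illustration under that assumption: `ENTITY_LABELS`, its entries, and the `label_entities` helper are hypothetical stand-ins for a real ontology lookup.

```python
import re
from typing import Dict

# Hypothetical entity-to-concept entries; a real ontology would be far larger.
ENTITY_LABELS: Dict[str, str] = {
    "Dr. Smith": "PHYSICIAN",
    "Dr. Jones": "PHYSICIAN",
    "smoking": "RISK_FACTOR",
    "hypertension": "RISK_FACTOR",
}


def label_entities(text: str) -> str:
    """Replace each known entity with its concept label (longest names first)."""
    for name in sorted(ENTITY_LABELS, key=len, reverse=True):
        # Lookarounds keep the match from starting or ending mid-word.
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        text = re.sub(pattern, f"<{ENTITY_LABELS[name]}>", text, flags=re.IGNORECASE)
    return text


record = "Dr. Smith noted a history of smoking and hypertension."
print(label_entities(record))
```

After labeling, records written by different cardiologists all contain the same physician token, so the model has less opportunity to tie one doctor's name to the outcome.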
2) Extracting sentiment (most often) from adverbs and adjectives
Sentiment analysis is the process of determining whether a piece of writing is positive, negative or neutral. Standard NLP systems claim to enable sentiment detection, but most fall far short of the type needed to truly help machines learn in all but the most straightforward situations. When using sentiment analysis for AI, document-level sentiment is essentially useless. AI requires sentence-level and often phrase-level sentiment. For example: “The x-ray revealed good news regarding tumor reduction, but also, unfortunately, revealed advanced pneumonia.” For AI, we want to be able to see the first part of the sentence as positive and the second part as negative, and we want to identify that “advanced” is an intensifier, which makes the second part strongly negative. Lastly, we do not want the sentence to be classified as neutral (half good plus half bad); we want to see the sentence as two distinct thoughts, one positive and one negative.
Properly detecting sentiment is tricky, but it can change the entire meaning of a sentence and therefore it is often critical in AI applications. For example, the phrases “we saw evidence of…” and “we saw scant evidence of” are nearly identical, yet the entire meaning hinges on one word: “scant.” Note that “scant” would not be labeled by a semantic entity ontology, as it is an adjective. So while nouns contain most of the meaning, advanced sentiment analysis, i.e. how the nouns (or verbs) are modified, can be critical to the machine truly understanding the full text. Armed with superior sentiment analysis, data scientists can vastly improve the accuracy of their models.
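The x-ray sentence can be used to sketch what phrase-level scoring might look like. The example below is a deliberately simple, lexicon-based illustration and not how any production sentiment engine works: the `POLARITY` and `INTENSIFIERS` word lists, their weights, and the `clause_sentiment` helper are all assumptions invented for this post.

```python
import re
from typing import List, Tuple

# Hypothetical toy lexicons; a real system would learn or curate these.
POLARITY = {"good": 1.0, "unfortunately": -1.0, "pneumonia": -1.0}
INTENSIFIERS = {"advanced": 2.0, "very": 1.5, "extremely": 2.0}


def clause_sentiment(sentence: str) -> List[Tuple[str, float]]:
    """Split on contrastive conjunctions and score each clause separately."""
    clauses = re.split(r",?\s+\b(?:but|however|although)\b\s*", sentence)
    results = []
    for clause in clauses:
        tokens = re.findall(r"[a-z\-]+", clause.lower())
        score, boost = 0.0, 1.0
        for tok in tokens:
            if tok in INTENSIFIERS:
                boost = INTENSIFIERS[tok]  # applies to the next polar word
            elif tok in POLARITY:
                score += POLARITY[tok] * boost
                boost = 1.0
        results.append((clause.strip(), score))
    return results


sentence = ("The x-ray revealed good news regarding tumor reduction, "
            "but also, unfortunately, revealed advanced pneumonia.")
for clause, score in clause_sentiment(sentence):
    print(f"{score:+.1f}  {clause}")
```

Scoring the two clauses separately keeps the positive and negative thoughts apart instead of averaging them into a misleading neutral, and the intensifier makes the second clause more negative than the first is positive.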
3) Extracting intent from verbs
Datasets are usually about a specific domain (e.g. medical, financial, tech support, e-commerce, etc.). Most domains have a unique and specific set of actions or intents. As Townsend and Bever (2001) remarked, in everyday life, “most of the time what we do is what we do most of the time.” For example, in weather, the number of user intents is quite low: “I want to know the current weather” is by far the most common intent. So training a machine to understand a phrase such as “What is the weather in London?” is easy. In other domains, the number of intents can be quite large and the NLP must be trained more carefully. For example, in e-commerce the intent “a return” might be expressed in many ways, e.g. “take back,” “want my money back,” “don’t want,” “don’t need,” etc.
The following capabilities are important when evaluating intent detection:
- Can you train your NLP to understand a domain-specific set of intents? (By this I mean not just building a hand-coded set of intents, but building an ML-assisted understanding of intents.)
- Can you build rules with whitelisted or blacklisted words to supplement the learned intents?
- Does your system have a way to disambiguate intents between the intender and the intendee?
Standard NLP libraries cannot do the above. However, if your NLP system has all of these capabilities, you will be able to build a robust understanding of user intents and thus build a better model.
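To make the first two capabilities concrete, here is a minimal sketch that combines a learned intent scorer with whitelist and blacklist rules. Everything in it is an assumption for illustration: the `TRAINING` utterances, the rule tables, and the simple word-count scorer (a stand-in for a real trained model) are hypothetical, and the sketch does not attempt the third capability of telling the intender from the intendee.

```python
import re
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical labeled utterances standing in for real training data.
TRAINING: Dict[str, List[str]] = {
    "return": ["i want my money back", "please take back this item",
               "i don't need this anymore", "i want to return this"],
    "track_order": ["where is my order", "when will my package arrive"],
}
STOPWORDS = {"i", "my", "is", "this", "to", "the", "a", "your", "what", "please"}
WHITELIST = {"refund": "return"}           # phrase always forces this intent
BLACKLIST = {"return": ["return policy"]}  # phrase vetoes this intent


def tokenize(text: str) -> List[str]:
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]


# "Training": count how often each word appears under each intent.
WORD_COUNTS = {intent: Counter(w for u in utts for w in tokenize(u))
               for intent, utts in TRAINING.items()}


def detect_intent(utterance: str) -> Optional[str]:
    text = utterance.lower()
    for phrase, intent in WHITELIST.items():
        if phrase in text:
            return intent  # a whitelist rule overrides the learned scores
    tokens = tokenize(utterance)
    scores = {intent: sum(counts[t] for t in tokens)
              for intent, counts in WORD_COUNTS.items()}
    for intent, phrases in BLACKLIST.items():
        if any(p in text for p in phrases):
            scores[intent] = 0  # a blacklist rule vetoes this intent
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


for u in ["I want my money back", "Can I get a refund?",
          "What is your return policy?", "Where is my order?"]:
    print(u, "->", detect_intent(u))
```

The learned scores handle the many phrasings of a return, while the rules cover cases the training data misses: “refund” never appears in the examples, and “return policy” is a question about returns rather than a request to make one.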
4) Putting it all together
Let's take a look at a few complex sentences.
“The image showed a reduction in the size of the mass. This news came as a devastating blow, as it was understood by the patient that she was in total remission.”
With the proper pre-processing and NLP feature extraction, the model will be able to understand that:
- The tumor is getting smaller, which is diagnostically very good news.
Rather than possibly getting tricked into believing that:
- The patient is in total remission, and the sentiment is neutral.
Let's examine a few more details:
- An entity ontology will help the machine understand that “mass” is the same as a tumor, even if the word “mass” is rarely used in the dataset.
- Sentiment analysis would help the machine understand that we have marginally good medical news (about tumor size) and that the patient is upset.
Advanced NLP techniques such as these are critical to creating better data, from which machines can build better models. Is your model finding full text hard to understand? Give us a call; we have tools and techniques that can help.