Context-Based Automated Conversational System Using a Pretrained Model

Follow this guide to create a conversational system with a pretrained LLM in Paperspace.

By Adrien Payong


Introduction

After-sale support is crucial for any company looking to provide a 5-star customer care experience. Until recently, it was enough to provide a trustworthy customer care contact center to resolve customers' concerns and issues. Today's customers, however, expect faster service and greater ease of use, given the prevalence of increasingly sophisticated technological options. Limited personnel poses significant challenges to the speed and quality of service customers receive. Chatbots, if developed correctly, can resolve most of these issues.

Definition and statistics

To begin, what exactly is a chatbot? A chatbot (chat + robot) is an artificial intelligence (AI) program designed to replicate human dialogue: it converses with users in natural language through audio or text, mimicking human conversational behavior.

We can read from Finance Digest:

Indeed, Servion predicts that, by 2025, AI will power 95% of all customer interactions, including live telephone and online conversations that will leave customers unable to ‘spot the bot’

And we can also read from Grand View Research

The global chatbot market size was valued at USD 525.7 million in 2021 and is expected to expand at a compound annual growth rate (CAGR) of 25.7% from 2022 to 2030. The market is expected to be driven by the increasing adoption of customer service activities among enterprises in order to reduce operating costs.
Various innovations carried out in artificial intelligence and machine learning technologies are expected to enhance the features of chatbots. This, in turn, is expected to drive market growth in the coming years.

Different types of Chatbots

A chatbot may be built in a variety of ways. The right approach depends on the nature of the problem it is meant to solve and the data at hand. In light of these factors, we can group chatbots into the following categories:

Chatbots that follow a set of predefined rules

These bots rely heavily on predefined rules, and any question asked outside of those parameters is met with a predetermined fallback answer, exposing the bot's inability to process the user's intent. The rules cover terms we have already told the bot to look for, and they must be stated explicitly when coding with regular expressions or another text-analysis tool. Simple as this approach is, it resolves most routine requests such as purchase cancellations and refunds.
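
To make this concrete, here is a minimal sketch of what such a rule-based bot might look like in Python; the patterns and canned responses below are invented purely for illustration.

import re

# Illustrative rules: each regular expression maps to a canned response (hypothetical examples)
rules = [
    (re.compile(r"cancel.*(order|purchase)", re.IGNORECASE),
     "Sure, I can help you cancel your order. Please share your order number."),
    (re.compile(r"refund", re.IGNORECASE),
     "Refunds are processed within 5-7 business days after approval."),
]

def rule_based_reply(message):
    # return the response of the first rule whose pattern matches the message
    for pattern, response in rules:
        if pattern.search(message):
            return response
    # fallback when no rule matches, illustrating the limits of this approach
    return "Sorry, I did not understand that. Let me connect you to an agent."

print(rule_based_reply("I want to cancel my purchase"))
## output Sure, I can help you cancel your order. Please share your order number.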

Generative Chatbots

These chatbots are state-of-the-art applications of deep learning: they learn to understand the conversation and generate an appropriate response. Although there are no template sentences, a well-trained generative bot can provide adequate responses to most questions. Because of the many open challenges in this field, we are not yet able to create a flawless chatbot, but this is a lively area of research, and we can expect better results in the future.

On the basis of their function, we can further categorize chatbots into two distinct categories.

  • Horizontal chatbots: In this context, a "horizontal" chatbot is open-ended and broad in scope. These chatbots are useful for broad, overarching tasks but are not yet capable of the fine-grained work required by specific domains. They form the foundation on which most specialized bots operate.
  • Vertical chatbots: In contrast to horizontal chatbots, which may be useful across a variety of sectors, vertical chatbots are limited to a single industry. For example, a chatbot built to help physicians answer questions about available medical supplies would not be suitable for use in the IT sector.

Given the variety of chatbot categories, there is a wide range of possible implementations. Both vertical and horizontal chatbots can be built with the help of existing frameworks, but let's take it a step further and study how to build these chatbots with NLP from the ground up. We will set aside rule-based chatbots, which are widely available and simple to implement.

Conversational AI relies on a pretrained model to understand context

A pretrained model is versatile because it has been trained for a long time on a large dataset using Graphics Processing Units (GPUs). Suppose, however, that we don't have the means to do this ourselves. This is where transfer learning comes in: we simply use a model that has already been pretrained.

Hugging Face Transformers

Let's do this with Hugging Face's Transformers, one of the most widely used NLP libraries available today. Transformers is an open-source library of pretrained models that can be easily downloaded and used in downstream projects. It's simple to use, and the results are excellent.

Bring this project to life

So let’s install:

pip install transformers
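
As a quick illustration of how little code the library needs, the high-level pipeline API can answer a question in a few lines. This is only a sketch: the distilbert-base-cased-distilled-squad checkpoint and the toy question/context strings are assumptions chosen for the example, and any SQuAD-finetuned question-answering model hosted on Hugging Face could be used instead.

from transformers import pipeline

# the pipeline downloads the specified question-answering model and its tokenizer
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

result = qa(question="Who painted the Mona Lisa?",
            context="The Mona Lisa was painted by Leonardo da Vinci.")
print(result["answer"])
## expected output: something like Leonardo da Vinci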

Next up is model selection. The BERT architecture is well known for producing excellent contextual results, so let's have a look at one of the pretrained BERT models available on Hugging Face.

First, import the model and tokenizer classes.

#import model and tokenizer
from transformers import AutoTokenizer
from transformers import AutoModelForQuestionAnswering
import torch

Now, we can load the model. This pretrained model can be easily swapped out for any other available on the Hugging Face website.

# load the pretrained model
## return_dict=True: when set, the model returns a ModelOutput instead of a plain tuple
model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad", return_dict=True)
# load the tokenizer
## AutoTokenizer automatically instantiates a tokenizer class matching the model architecture
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
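
Before wiring the model into a question-answering function, it can help to peek at what the tokenizer produces. The short sketch below, using a made-up question and context purely for illustration, encodes a question/passage pair and prints the resulting keys and tokens.

# encode a question together with a short context passage (illustrative strings)
sample = tokenizer.encode_plus("Who painted the Mona Lisa?",
                               "Leonardo da Vinci painted the Mona Lisa.",
                               return_tensors="pt")

# input_ids, token_type_ids and attention_mask tensors
print(sample.keys())
# [CLS] question tokens [SEP] context tokens [SEP]
print(tokenizer.convert_ids_to_tokens(sample["input_ids"][0]))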

A question-answering model obviously needs a passage of text to answer questions about. Here is the passage we will use, taken from Wikipedia.

text="""Leonardo di ser Piero da Vinci[b] (15 April 1452 – 2 May 1519) was an Italian polymath of the High Renaissance who was active as a painter, draughtsman, engineer, scientist, theorist, sculptor, and architect.[3] While his fame initially rested on his achievements as a painter, he also became known for his notebooks, in which he made drawings and notes on a variety of subjects, including anatomy, astronomy, botany, cartography, painting, and paleontology. Leonardo is widely regarded to have been a genius who epitomized the Renaissance humanist ideal,[4] and his collective works comprise a contribution to later generations of artists matched only by that of his younger contemporary, Michelangelo.[3][4]

Born out of wedlock to a successful notary and a lower-class woman in, or near, Vinci, he was educated in Florence by the Italian painter and sculptor Andrea del Verrocchio. He began his career in the city, but then spent much time in the service of Ludovico Sforza in Milan. Later, he worked in Florence and Milan again, as well as briefly in Rome, all while attracting a large following of imitators and students. Upon the invitation of Francis I, he spent his last three years in France, where he died in 1519. Since his death, there has not been a time where his achievements, diverse interests, personal life, and empirical thinking have failed to incite interest and admiration,[3][4] making him a frequent namesake and subject in culture.

Leonardo is identified as one of the greatest painters in the history of art and is often credited as the founder of the High Renaissance.[3] Despite having many lost works and less than 25 attributed major works-including numerous unfinished works-he created some of the most influential paintings in Western art.[3] His magnum opus, the Mona Lisa, is his best known work and often regarded as the world's most famous painting. The Last Supper is the most reproduced religious painting of all time and his Vitruvian Man drawing is also regarded as a cultural icon. In 2017, Salvator Mundi, attributed in whole or part to Leonardo,[5] was sold at auction for US$450.3 million, setting a new record for the most expensive painting ever sold at public auction."""

Let's create a function that accepts a question from the user, runs it against the passage, and returns the predicted answer.

## function that answers a user's question against the passage
def chat_ans(input_question):
    # tokenize the question and passage together with encode_plus
    ## return_tensors="pt" returns PyTorch tensors
    input_token = tokenizer.encode_plus(input_question, text, return_tensors="pt")
    # obtain the start and end scores for each token
    ## passing return_dict=False forces the model to return a tuple
    rep_str, rep_en = model(**input_token, return_dict=False)
    # locate the answer span
    ## the argmax of the start scores gives the most likely start of the answer
    pos_start = torch.argmax(rep_str)
    ## the argmax of the end scores gives the most likely end of the answer
    pos_end = torch.argmax(rep_en) + 1
    # convert the token ids of the answer span back to tokens with convert_ids_to_tokens()
    rep_token = tokenizer.convert_ids_to_tokens(input_token["input_ids"][0][pos_start:pos_end])
    # join the tokens into the final answer string
    return tokenizer.convert_tokens_to_string(rep_token)

To test it out, let's ask a few questions.

question = "when did Leonardo di ser Piero da Vinci born"
chat_ans(question)
## output 15 april 1452
question = "where does he receive its education"
chat_ans(question)
## output Florence
question = "who was Leonardo di ser Piero da Vinci"
chat_ans(question)
## output an italian polymath of the high renaissance
question = "what are his achievements"
chat_ans(question)
## output  painter, draughtsman, engineer, scientist, theorist, sculptor, and architect

These are some good responses, right? We can test out several pre-trained models and evaluate how they perform against one another.
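
As a rough sketch of such a comparison, the same logic can be parameterized by model and tokenizer and rerun with a second checkpoint; the distilbert-base-cased-distilled-squad name below is an assumption, and any other SQuAD-finetuned question-answering model on Hugging Face could be substituted.

# load a second, smaller SQuAD-finetuned checkpoint for comparison (assumed available on Hugging Face)
model2 = AutoModelForQuestionAnswering.from_pretrained("distilbert-base-cased-distilled-squad", return_dict=True)
tokenizer2 = AutoTokenizer.from_pretrained("distilbert-base-cased-distilled-squad")

def chat_ans_with(input_question, model, tokenizer):
    # same logic as chat_ans, but the model and tokenizer can be swapped
    input_token = tokenizer.encode_plus(input_question, text, return_tensors="pt")
    rep_str, rep_en = model(**input_token, return_dict=False)
    pos_start = torch.argmax(rep_str)
    pos_end = torch.argmax(rep_en) + 1
    rep_token = tokenizer.convert_ids_to_tokens(input_token["input_ids"][0][pos_start:pos_end])
    return tokenizer.convert_tokens_to_string(rep_token)

for q in ["who was Leonardo di ser Piero da Vinci", "what are his achievements"]:
    print(q, "->", chat_ans_with(q, model2, tokenizer2))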

Conclusion

In this tutorial, we investigated one approach to developing a chatbot using the capabilities of natural language processing. The tutorial covers the chatbot's back end; the reader may wish to explore integrating it with a front end.

Research in this area is ongoing. Chatbots have already shown their worth, and their potential user base is massive. It will continue to grow in the years to come as more and more vertical and horizontal applications simplify the experience for the end user.


Reference

This tutorial was inspired by the section "Context-based Chatbot Using a Pretrained Model" from the book Natural Language Processing Projects by Akshay Kulkarni.
