A Guide to Building a Medical Chatbot Using MedAlpaca

In this post, we will build a Streamlit web application that wraps a large language model (LLM), MedAlpaca, in an interactive chatbot interface. The guide walks through the application's code section by section. Before we start, let’s cover a few key concepts:

LangChain

LangChain is a Python framework for building applications on top of large language models. It provides tools for connecting to models, defining prompts, handling callbacks, and chaining these steps together.

The key features are:

Language Model Interaction: LangChain facilitates interaction with various language models, allowing developers to integrate them into their applications seamlessly.
Prompt Templating: The library offers functionality for defining prompt templates, enabling users to structure inputs and customize interactions with language models.
Callback Handling: LangChain includes utilities for handling callbacks, so you can intercept and process streaming output from a language model in real time.
Integration with Streamlit: LangChain integrates with Streamlit, a popular Python library for building web applications, making it easy to develop interactive chatbot interfaces and other NLP-powered applications.

MedAlpaca

MedAlpaca is a family of open-source language models hosted on the Hugging Face model hub. The models are built on LLaMA and fine-tuned for medical question answering and dialogue, and they are available for public use.

Its key characteristics are:

Model Architecture: MedAlpaca is built on the LLaMA architecture, a decoder-only transformer that relies on attention mechanisms.
Model Size: MedAlpaca is a large model with billions of parameters (this guide uses the 13B variant), which lets it capture subtle patterns in natural language.
Fine-tuning and Transfer Learning: MedAlpaca is already fine-tuned on medical data. Users can fine-tune it further on their own domain-specific data or use it as a starting point for transfer learning.
Availability: MedAlpaca is accessible through the Hugging Face model hub, allowing developers and researchers to download and utilize the model in their applications via simple API calls or library integrations.
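
To make the model-size point concrete, the back-of-the-envelope sketch below estimates how much memory the weights alone would occupy at different precisions. The 13-billion figure is the nominal size of a 13B model, not an exact count; the helper name `weight_memory_gb` and the chosen bit widths are illustrative assumptions. Real model files carry extra overhead, so treat the results as rough lower bounds.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the raw weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Nominal parameter count of a "13B" model (an approximation, not an exact count).
NUM_PARAMS = 13e9

for label, bits in [("16-bit floats", 16), ("8-bit quantization", 8), ("4-bit quantization", 4)]:
    print(f"{label}: ~{weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
```

Under these assumptions, a roughly 4-bit quantized file needs about a quarter of the memory of 16-bit weights, which makes it far more practical to run on a typical workstation.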

In this project the two play complementary roles: LangChain supplies the application plumbing (prompts, callbacks, and the chain), and MedAlpaca supplies the language model.

Let’s Start Building

Prerequisites on Your System

Before starting, ensure you have the following prerequisites installed on your system:

Python 3.8 or higher (the script uses the := operator, which older versions do not support)
Streamlit
Hugging Face Transformers and Hub libraries
LangChain Library
llama-cpp-python (required by LangChain's LlamaCpp wrapper)

Installation

Clone the repository containing the source code or download the source files.

Install the required Python dependencies by running the following command in your terminal:

pip install streamlit transformers langchain huggingface_hub llama-cpp-python
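
After installing, a quick way to confirm that everything is importable is a small check like the one below. It is only a sketch: the helper name `find_missing` is hypothetical, and `llama_cpp` is included on the assumption that llama-cpp-python is installed, since LangChain's LlamaCpp wrapper depends on it.

```python
import importlib.util

# Import names for the packages this guide relies on. The import name can
# differ from the pip name: llama-cpp-python is imported as llama_cpp.
REQUIRED_MODULES = ["streamlit", "transformers", "langchain", "huggingface_hub", "llama_cpp"]


def find_missing(modules):
    """Return the modules from `modules` that cannot be found on this system."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Any name it reports as missing can be installed with pip before you continue.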

# Import necessary libraries
import streamlit as st
from langchain.llms import LlamaCpp
from langchain.prompts import PromptTemplate
# from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.base import BaseCallbackHandler
from huggingface_hub import hf_hub_download

Import Required Libraries

The full script begins with a UTF-8 encoding declaration and a docstring giving the creation date and author; both are omitted from the excerpt above.

Libraries such as Streamlit, LlamaCpp, PromptTemplate, BaseCallbackHandler, and hf_hub_download are imported. These libraries are essential for building the chatbot application.

# StreamHandler intercepts streaming output from the LLM.
# This makes it appear that the language model is "typing"
# in real time.
class StreamHandler(BaseCallbackHandler):
    def __init__(self, container, initial_text=""):
        self.container = container
        self.text = initial_text

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        self.text += token
        self.container.markdown(self.text)

Define a Custom Class

Defines a custom class StreamHandler which inherits from BaseCallbackHandler. This class is used to intercept streaming output from the language model (LLM) and display it in real-time on the chat interface.
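
To see what the handler does without loading a model, the sketch below feeds it a few tokens by hand. It is a plain-Python stand-in, not the app's code: `FakeContainer` is a hypothetical replacement for the Streamlit placeholder, and this `StreamHandler` drops the LangChain base class so the example runs with no dependencies.

```python
class FakeContainer:
    """Stand-in for a Streamlit placeholder; records every markdown() call."""

    def __init__(self):
        self.renders = []

    def markdown(self, text):
        self.renders.append(text)


class StreamHandler:
    """Same logic as the handler in this guide, minus the LangChain base class."""

    def __init__(self, container, initial_text=""):
        self.container = container
        self.text = initial_text

    def on_llm_new_token(self, token, **kwargs):
        # Append the new token and re-render the full text so far.
        self.text += token
        self.container.markdown(self.text)


container = FakeContainer()
handler = StreamHandler(container)
for token in ["How", " may", " I", " help?"]:
    handler.on_llm_new_token(token)

print(container.renders)
```

Each call re-renders the whole accumulated text rather than only the new token, which is what produces the "typing" effect when the container is a single placeholder on the page.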

@st.cache_resource
def create_chain(system_prompt):
    # A stream handler to direct streaming output to the chat screen.
    # This will need to be handled somewhat differently,
    # but it demonstrates the potential.
    stream_handler = StreamHandler(st.empty())

    # Download the quantized model file from the Hugging Face Hub
    (repo_id, model_file_name) = ("TheBloke/medalpaca-13B-GGUF", "medalpaca-13b.Q4_K_M.gguf")
    model_path = hf_hub_download(repo_id=repo_id, filename=model_file_name, repo_type="model")

    # Initialize the LlamaCpp LLM. Passing our custom stream handler as a
    # callback lets us intercept streaming output from the LLM, so that it
    # appears to type its responses in real time.
    llm = LlamaCpp(
        model_path=model_path,
        temperature=0,
        max_tokens=512,
        top_p=1,
        stop=["[INST]"],
        verbose=True,
        streaming=True,
        callbacks=[stream_handler],
    )

    # Template for structuring user input before converting into a prompt
    template = """
<s>[INST]{}[/INST]</s>
[INST]{}[/INST]
""".format(system_prompt, "{question}")

    # Create a prompt from the template
    prompt = PromptTemplate(template=template, input_variables=["question"])

    # Create an LLM chain from the prompt and the LLM
    llm_chain = prompt | llm

    return llm_chain

Initialize the LlamaCpp LLM model

Defines a function create_chain(system_prompt) which creates the language model (LLM) chain for the chatbot.
Downloads the pre-trained model file from the Hugging Face Hub using the hf_hub_download function.
Initializes the LlamaCpp model with specific parameters such as model path, temperature, and max tokens.
Defines a template for structuring user input before converting it into a prompt.
Creates a prompt from the template and creates an LLM chain by combining the prompt and LLM.
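
The template is filled in two stages, which is easy to miss on a first read. The sketch below reproduces both stages with plain `str.format`; in the app the second stage is performed by PromptTemplate, so the final call here is only a stand-in, and the sample question is made up for illustration.

```python
system_prompt = "You are a helpful AI assistant who answers questions in short sentences."

# Stage 1: fill in the system prompt now, and leave a literal "{question}"
# placeholder behind for later.
template = """
<s>[INST]{}[/INST]</s>
[INST]{}[/INST]
""".format(system_prompt, "{question}")

# Stage 2: substitute the user's question. PromptTemplate does this step in
# the app; str.format is used here only as a stand-in.
final_prompt = template.format(question="What is hypertension?")
print(final_prompt)
```

One consequence of this design: if the system prompt itself contains curly braces, the second stage of this sketch would treat them as placeholders, so it is safest to avoid them.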

# Set the webpage title
st.set_page_config(
    page_title="Your own aiChat!"
)

# Create a header element
st.header("Your own aiChat!")

# Set the system prompt for the chatbot
system_prompt = st.text_area(
    label="System Prompt",
    value="You are a helpful AI assistant who answers questions in short sentences.",
    key="system_prompt")

# Create LLM chain for the chatbot
llm_chain = create_chain(system_prompt)

Set Webpage Title

Sets the webpage title and creates a header element for the chat interface using Streamlit functions.
Creates a text area for users to input the system prompt, which defines the chatbot's personality and behavior.
Calls the create_chain function to create the LLM chain based on the specified system prompt.
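
Streamlit reruns the script from top to bottom on every interaction, so create_chain is called on every rerun; the st.cache_resource decorator is what keeps the model from being rebuilt each time, because the result is cached per argument value. The sketch below imitates that behavior with `functools.lru_cache` as an analogy. It is not how Streamlit implements caching, and `create_chain_stub` is a hypothetical placeholder for the real function.

```python
from functools import lru_cache

build_count = 0


@lru_cache(maxsize=None)
def create_chain_stub(system_prompt):
    """Pretend to build an expensive chain; count how often that happens."""
    global build_count
    build_count += 1
    return f"chain for: {system_prompt}"


create_chain_stub("Answer briefly.")
create_chain_stub("Answer briefly.")      # same argument: served from the cache
create_chain_stub("Answer in detail.")    # new argument: built again

print(build_count)  # 2
```

The practical takeaway is that reruns with an unchanged system prompt are cheap, while editing the system prompt triggers a fresh build.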

# Initialize session state variables
if "messages" not in st.session_state:
    st.session_state.messages = [
        {"role": "assistant", "content": "How may I help you today?"}
    ]

if "current_response" not in st.session_state:
    st.session_state.current_response = ""

# Loop through each message in the session state and render it
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

Initialize Session

Initializes session state variables to store chat messages.
Loops through each message in the session state and renders it using Streamlit's chat_message and markdown functions.
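
The "if ... not in" guard matters because the script is rerun on every interaction: without it, the history would be reset to the greeting each time. The sketch below simulates two runs with an ordinary dict as a hypothetical stand-in for st.session_state; `init_state` is an illustrative helper, not part of the app.

```python
def init_state(state):
    """Seed the chat history once; later reruns leave existing history alone."""
    if "messages" not in state:
        state["messages"] = [
            {"role": "assistant", "content": "How may I help you today?"}
        ]
    return state


session_state = {}            # hypothetical stand-in for st.session_state

init_state(session_state)     # first run: the greeting is added
session_state["messages"].append({"role": "user", "content": "Hello"})
init_state(session_state)     # simulated rerun: history is preserved

for message in session_state["messages"]:
    print(f'{message["role"]}: {message["content"]}')
```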

# Take user input from the chat interface
if user_prompt := st.chat_input("Your message here", key="user_input"):
    # Add user input to session state
    st.session_state.messages.append(
        {"role": "user", "content": user_prompt}
    )
    # Render the user's message right away; the history loop above
    # has already run for this pass of the script
    with st.chat_message("user"):
        st.markdown(user_prompt)

    # Pass user input to the LLM chain to generate a response
    response = llm_chain.invoke({"question": user_prompt})

    # Add LLM response to session state
    st.session_state.messages.append(
        {"role": "assistant", "content": response}
    )
    # Render LLM response in the chat interface
    with st.chat_message("assistant"):
        st.markdown(response)

Chat UI

Takes user input from the chat interface using Streamlit's chat_input function.
Adds the user input to the session state.
Passes the user input to the LLM chain to generate a response.
Adds the LLM response to the session state and renders it in the chat interface using chat_message and markdown functions.
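
The turn-taking logic can be exercised without Streamlit or a model. In the sketch below, `EchoChain` is a hypothetical stand-in that only mimics the chain's `.invoke()` method, and `handle_turn` is an illustrative helper that performs the same bookkeeping as the steps listed above.

```python
class EchoChain:
    """Hypothetical stand-in for the LLM chain; only mimics .invoke()."""

    def invoke(self, inputs):
        return f"You asked: {inputs['question']}"


def handle_turn(messages, chain, user_prompt):
    """Record the user message, query the chain, record and return the reply."""
    messages.append({"role": "user", "content": user_prompt})
    response = chain.invoke({"question": user_prompt})
    messages.append({"role": "assistant", "content": response})
    return response


history = [{"role": "assistant", "content": "How may I help you today?"}]
reply = handle_turn(history, EchoChain(), "What is a normal resting heart rate?")
print(reply)
```

Keeping this logic in a small function is one way to unit-test the conversation flow separately from the user interface.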

Usage

Open a terminal and run the Streamlit application with the following command (assuming the script is saved as app.py):

streamlit run app.py

Once the Streamlit server is running, a new browser window/tab will open automatically, displaying the chat interface of ‘Your own aiChat!’.

Features

System Prompt: You can specify the initial personality or system prompt for the chatbot. This prompt sets the tone and behavior of the chatbot.
Chat Interface: The chat interface allows you to interact with the chatbot by typing messages.
Real-time Responses: Responses from the chatbot are displayed in real time, giving the impression that the chatbot is typing.
Customizable: You can edit the system prompt at any time to change the chatbot's behavior. Because the chain is cached per system prompt, a new prompt rebuilds the chain, which can take a moment.

How to Chat

Start by entering a system prompt in the provided text area. This prompt defines the personality or behavior of the chatbot.
Type your message in the ‘Your message here’ input box and press Enter.
The chatbot will process your message and generate a response based on the defined system prompt.
The response will be displayed in the chat interface in real time.

Troubleshooting

Blank Responses: If the chatbot returns blank responses, try modifying the system prompt to provide more context or guidance.
Slow Response: If the chatbot responds slowly, consider lowering max_tokens, using a smaller or more heavily quantized model file, or offloading layers to a GPU with LlamaCpp's n_gpu_layers parameter.
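
As a starting point for tuning speed, the sketch below collects a few LlamaCpp parameters into a dictionary. The baseline mirrors the values used in create_chain; the override values are assumptions chosen for illustration and should be adjusted to your hardware. Offloading layers with n_gpu_layers only has an effect if llama-cpp-python was built with GPU support.

```python
# Baseline mirroring the values used in create_chain.
base_params = {
    "temperature": 0,
    "max_tokens": 512,
    "top_p": 1,
}

# Assumed tuning values; adjust to your hardware.
speed_overrides = {
    "max_tokens": 256,     # shorter answers finish sooner
    "n_ctx": 2048,         # context window size in tokens
    "n_batch": 512,        # tokens processed per batch during prompt evaluation
    "n_gpu_layers": 20,    # layers offloaded to the GPU (needs a GPU-enabled build)
}

tuned_params = {**base_params, **speed_overrides}
print(tuned_params)
# In the app these would be passed as LlamaCpp(model_path=model_path, **tuned_params)
```

Changing one value at a time and timing a fixed question makes it easier to see which setting actually helps.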

Conclusion

This tutorial showed how to build a simple platform for interacting with a language model in real time. Experiment with different system prompts to create unique chatbot personalities and engage in interesting conversations. Enjoy chatting with your own AI assistant!