Deploying Rasa with Efficiency: Leveraging Helm Charts

By Bharanidharan Boopalan Technical Blog October 17, 2024

Imagine owning a bustling restaurant with customers placing orders both online and in person. From time to time, customers will have questions or encounter issues, and handling these inquiries efficiently can enhance their overall experience. To streamline this, you might consider building a custom chatbot. But isn’t building a chatbot complicated? No, not at all—especially with Rasa’s help.

In this blog, we’ll guide you through the process of building and deploying a sample Rasa chatbot on an AWS EKS (Elastic Kubernetes Service) cluster using Helm. Rasa, a powerful open-source conversational AI platform, allows you to create highly customizable chatbots, while Helm simplifies the deployment process by managing Kubernetes applications with ease.

We will walk you through Helm charts, which streamline and automate the deployment process, and show you how to set up an AWS EKS cluster specifically for your Rasa chatbot. Additionally, we’ll configure AWS Load Balancers to connect your chatbot to customers, ensuring it’s publicly accessible and ready to handle inquiries efficiently.

This guide will equip you with the necessary tools, AWS configurations, and knowledge to bring your chatbot to life, regardless of your familiarity with Rasa. So grab your coding toolkit, your AWS credentials, and your curiosity, and let’s turn your chatbot vision into reality with Rasa and Helm!

Prerequisites:

Understand the fundamental principles of Rasa, including intents, entities, and slots.
Understand the Rasa action server‘s fundamentals.
Familiarity with Helm and AWS Elastic Kubernetes Service (EKS)

Target architecture

What is Rasa?

Rasa is an open-source conversational AI platform that allows developers to build chatbots and voice assistants. It provides a framework for natural language understanding (NLU) and dialogue management and integrates with a wide range of messaging channels and third-party systems. It also allows the user to train the model and add custom actions.

What is Helm Charts?

Helm Charts is an open-source package manager for Kubernetes that helps you manage the deployment, configuration, and lifecycle of your Kubernetes applications. It simplifies the installation, upgrade, and sharing of Kubernetes applications across various environments by packaging a Kubernetes application and all its dependencies into a single unit.

The application we are going to build is a simple Rasa application that uses Rasa open-source image. The Rasa action server, handling the custom actions for our application, will receive the request from the application (you can skip this step if you just want to deploy a Rasa server). We provide a concise overview of the deployment process below.

Overview of the deployment process

Install Required Packages: Make sure you have the necessary tools and dependencies for building and deploying the Rasa application.
Create an AWS EKS Cluster: Utilize the provided CloudFormation template (CFT) to establish an AWS EKS cluster, which will serve as the foundation for deploying your Rasa application.
Set Up a Rasa Deployment Environment: Set up the EKS cluster to establish a dedicated Rasa deployment environment. This environment will provide you with the necessary resources to run your Rasa application smoothly.
Deploy the Rasa Action Server (Optional): If your application requires custom actions, deploy the Rasa action server within the EKS cluster. The action server will be in charge of executing these custom actions.
Deploy the Rasa Application: Once the environment is ready, deploy the Rasa application onto the EKS cluster. This will make your Rasa application accessible and ready to process user interactions.

By following these steps, you can successfully deploy your Rasa application on AWS EKS, allowing you to leverage Rasa’s power to create engaging and effective conversational AI experiences.

Install required packages

To install Kubectl, helm, and eksctl in your local environment, use the following command in your terminal.

Download and Install kubectl

# Download the kubectl binary
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
 
# Apply execute permissions to the binary
chmod +x ./kubectl
 
# Add kubectl to the path
mkdir -p $HOME/bin && cp ./kubectl $HOME/bin/kubectl
 
# Add the export to bash profile
echo 'export PATH=$HOME/bin:$PATH' >> ~/.bashrc
 
# Execute the contents of bashrc which will add the kubectl to the path
source ~/.bashrc
 
# Verify installation
kubectl version --short --client

Download and Install eksctl

# Download the eksctl compressed file and extract it to /tmp
sudo curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
 
# Move the extracted eksctl binary to the /usr/local/bin directory
sudo mv /tmp/eksctl /usr/local/bin
 
# Verify installation
eksctl version

Download & Install helm

# Download the helm compressed file
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
 
# Change permission
chmod 700 ./get_helm.sh
 
# Install bash
DESIRED_VERSION=v3.8.2 bash get_helm.sh

Create an AWS EKS cluster using the Cloud Formation template

On your local machine, create a yaml file called cluster.yaml. Copy and paste the following content into it. Please change the configuration like region, vpc Id, subnet Id, instanceType, and addons as per your requirement.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: 
  region: us-east-1
  version: '1.28'
  annotations:
    kubernetes.io/ingress.class: alb
vpc:
  id: 
  subnets:
    private:
      us-east-1a:
        id: 
      us-east-1b:
        id: 
privateCluster:
  enabled: false
  skipEndpointCreation: true
nodeGroups:
  - name: 
    labels: { role: workers }
    instanceType: m5.2xlarge
    desiredCapacity: 2
    volumeSize: 80
    privateNetworking: false
 
addons:
  - name: aws-ebs-csi-driver
    version: latest
  - name: aws-vpc-cni
    version: latest
  - name: coredns
    version: latest
  - name: kube-proxy
    version: latest
cloudWatch:
  clusterLogging:
    enableTypes:
      - all
    logRetentionInDays: 7

Run the command eksctl create cluster -f cluster.yaml to create the cluster.

Create an environment for Rasa deployment

# Create a new namespace
kubectl create namespace 

# Install AWS LoadBalancer controller (Note: This is only mandatory for the provisioning of internal load balancer).
curl https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.4.7/docs/install/iam_policy.json -o aws-load-balancer-controller-policy.json

# Create an IAM Policy using the downloaded policy.json which will be used by the service account
aws iam create-policy \
    --policy-name RasaLoadBalancerControllerIAMPolicy \
    --policy-document file://aws-load-balancer-controller-policy.json

# Create a service account - It will give an identity to your Pods
eksctl create iamserviceaccount \
  --cluster= \
  --namespace= \
  --name=aws-load-balancer-controller \
  --role-name RasaEKSLoadBalancerControllerRole \
  --attach-policy-arn= \
  --approve
 
# Add EKS Helm chart repo to local Helm client.
helm repo add eks https://aws.github.io/eks-charts

# will fetch the chart index files
helm repo update eks

# Install the controller with the service account that is created in the above step
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  -n  \
  --set clusterName= \
  --set serviceAccount.create=false \
  --set serviceAccount.name=aws-load-balancer-controller
 
# verify the deployment - get all deployment details in the given namespace
kubectl get deployment -n

Deploy the Rasa action server in the EKS cluster (optional)

What is the Rasa action server?

A Rasa action server enables us to execute custom actions for a Rasa Open Source conversational assistant. Consider the Rasa action server to be the backend API for your Rasa application. When your Rasa assistant predicts a custom action (action_hello_world), the Rasa server sends a POST request to the action server with a JSON payload that includes the name of the predicted action, the conversation ID, the tracker contents, and the domain contents.

The Rasa server then returns the user’s responses and adds the events to the conversation tracker. Use the below-given Python code or the sample rasa action server code provided by the Rasa team.

class ActionHelloWorld(Action):

    def name(self) -> Text:
        return "action_hello_world"

    def run(self, dispatcher: CollectingDispatcher,
            tracker: Tracker,
            domain: Dict[Text, Any]) -> List[Dict[Text, Any]]:

        dispatcher.utter_message(text="Hello World!")

        return []

To deploy the basic Rasa action server, create a rasa-action-values.yaml file. Use Docker to deploy the Rasa action server separately, then install it in the EKS cluster using the provided rasa-action-values.yaml.

applicationSettings:
  port: 5055
  scheme: http

replicaCount: 1

image:
  name: 
  tag: "1.0"
  repository: ""
  pullPolicy: Always

service:
  type: NodePort
  port: 80

ingress:
  enabled: false

Save the file, and then use the helm command to deploy the Rasa action server.

# Add helm rasa repo
helm repo add rasa https://helm.rasa.com

# Install or Upgrade the rasa action server
helm upgrade --install --namespace  --values rasa-action-values.yaml  rasa/rasa-action-server

In the given namespace, use the following command to get the name of the deployed services for the Rasa action server.

 kubectl get svc -n <namespace_name>

To create an ingress resource for your action server, create a file named rasa-action-ingress.yaml.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: rasa
  name: 
  annotations:
    alb.ingress.kubernetes.io/scheme: internal
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/subnets:  
    kubernetes.io/ingress.class: alb
    #By default the ingress(ALB) will be open to all ports and IPs. To connect with only the rasa application give VPC CIDR ips in which the eks cluster resides.
    alb.ingress.kubernetes.io/inbound-cidrs: 
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: 
              port:
                number: 80

For the rasa action server, create an ingress resource using helm and the above-given yaml.

# Create an ingress (internal ALB) for rasa action
kubectl apply -f rasa-action-ingress.yaml

# get the internal action server ALB hostname (this URL is needed)
kubectl get -f rasa-action-ingress.yaml -o jsonpath="{.status.loadBalancer.ingress[0].hostname}"

Deploy the Rasa application in the EKS cluster

To deploy the basic Rasa action server, create a rasa-values.yaml file.

applicationSettings:
  debugMode: true
  trainInitialModel: true
  port: 5005
  scheme: http
  enableAPI: true
  credentials:
    enabled: true

replicaCount: 1
autoscaling:
  enabled: false

serviceAccount:
  create: false

ingress:
  enabled: true
  annotations:
    alb.ingress.kubernetes.io/subnets: 
  # remove the below-given section if not needed
  tls:
    - secretName: chart-example-tls
      hosts:
        - chart-example.local
  pathType: Prefix
  path: /
  hostname: "chart-example.local"

command: ["/bin/sh", "-c", "rasa run --cors * --enable-api -vv"]

registry: docker.io/rasa
image:
  name: rasa
  tag: "3.2.6"
  pullPolicy: Always

service:
  type: NodePort
  port: 443 # can change it to 80 for HTTP and remove the tls section from ingress if you don't want to deploy your app in HTTPS(443)

# Avoid the below-given code if you are not using the rasa-action server
rasa-action-server:
  install: false
  external:
    enabled: true
    url: "/webhook"
action:
  endpointURL: "/webhook"

# Install or Upgrade the rasa application (rasa/rasa helm chart) using rasa-values 
helm upgrade --install --namespace rasa --values rasa-values.yaml  rasa/rasa

Create an ingress resource for the Rasa application using the following yaml file.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: rasa
  name: 
  annotations:
    alb.ingress.kubernetes.io/scheme: external
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/subnets: 
    kubernetes.io/ingress.class: alb
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: 
              port:
                number: 80

Save the file and deploy the ingress resource.

# Create an ingress (internal ALB) for rasa
kubectl apply -f rasa-ingress.yaml

# get the rasa internal ALB hostname
kubectl get -f rasa-ingress.yaml -o jsonpath="{.status.loadBalancer.ingress[0].hostname}"


# some useful commands
kubectl get all -n  # List all resources in the given namespace
kubectl delete ingress  -n  # delete the ingress resource in the given namespace
kubectl describe ingress  -n  # delete the ingress resource in the given namespace
kubetcl get logs -f  -n  # Get the logs of the given pod
kubectl describe  -n  # Gives the details of the given pod
kubectl logs  -n  -c  # Get the logs of the container in the given pod

Now that we have our Rasa application, we can connect it to our Rasa action server. Use Postman to test the connection, or use the curl command in your terminal.

Conclusion

Voila! We have successfully built our first chatbot. In this blog, we walked through the process of creating a chatbot using Helm charts, deploying the application in an EKS (Elastic Kubernetes Service) cluster, and exposing it to the public through an AWS load balancer.

Additionally, you can easily integrate the chatbot into any application or use it as a standalone service. You can interact with it through its API, which opens up a wide range of possibilities for further development and customization.

This setup ensures high availability, flexibility, and the ability to scale as your chatbot grows. You now have the foundational architecture to support future improvements, whether your plans involve extending its functionality, integrating it with other services, or enhancing its AI capabilities.

With the deployment in place, you are ready to expand this chatbot’s use cases and continue enhancing its capabilities. Happy coding!

Reference

Bharanidharan Boopalan

+ posts

Cookie	Duration	Description
_GRECAPTCHA	5 months 27 days	This cookie is set by Google. In addition to certain standard Google cookies, reCAPTCHA sets a necessary cookie (_GRECAPTCHA) when executed for the purpose of providing its risk analysis.
cookielawinfo-checbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other".
cookielawinfo-checkbox-advertisement	1 year	Set by the GDPR Cookie Consent plugin, this cookie is used to record the user consent for the cookies in the "Advertisement" category.
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Cookie	Duration	Description
bcookie	2 years	LinkedIn sets this cookie from LinkedIn share buttons and ad tags to recognize browser ID.
lang	session	This cookie is used to store the language preferences of a user to serve up content in that stored language the next time user visit the website.
lidc	1 day	LinkedIn sets the lidc cookie to facilitate data center selection.
Zoominfo	session	Zoominfo uses technologies to collect and store information when you interact with services it offer to their partners, such as advertising services or analytics. All of those processes are meant to improve your user experience and the overall quality of our services.

Cookie	Duration	Description
_ga	2 years	The _ga cookie, installed by Google Analytics, calculates visitor, session and campaign data and also keeps track of site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognize unique visitors.
_gat_gtag_UA_111355416_1	1 minute	Set by Google to distinguish users.
_gid	1 day	Installed by Google Analytics, _gid cookie stores information on how visitors use a website, while also creating an analytics report of the website's performance. Some of the data that are collected include the number of visitors, their source, and the pages they visit anonymously.
_hjAbsoluteSessionInProgress	30 minutes	This cookie is used to detect the first pageview session of a user. This is a True/False flag set by the cookie.
_hjFirstSeen	30 minutes	This is set by Hotjar to identify a new user’s first session. It stores a true/false value, indicating whether this was the first time Hotjar saw this user. It is used by Recording filters to identify new user sessions.
_hjid	1 year	This is a Hotjar cookie that is set when the customer first lands on a page using the Hotjar script.
_hjIncludedInPageviewSample	2 minutes	This cookie is set to let Hotjar know whether the user is included in the data sampling defined by site's pageview limit.
_hjIncludedInSessionSample	2 minutes	This cookie is set to let Hotjar know whether the user is included in the data sampling defined by site's daily session limit.
_hjTLDTest	session	Hotjar test cookie to check the most generic cookie path it should use, instead of the page hostname. This is done so that cookies can be shared across subdomains (where applicable). To determine this, we store the _hjTLDTest cookie for different URL substring alternatives until it fails. After this check, the cookie is removed.
oktgid	1 year	This cookie is used for storing the visitor ID of the user who clicked on an okt.to link.
oktsid	session	This cookie is used for storing the session ID of the user who clicked on an okt.to link.

Cookie	Duration	Description
bscookie	2 years	This cookie is a browser ID cookie set by Linked share Buttons and ad tags.
personalization_id	2 years	Twitter sets this cookie to integrate and share features for social media and also store information about how the user uses the website, for tracking and targeting.
VISITOR_INFO1_LIVE	5 months 27 days	A cookie set by YouTube to measure bandwidth that determines whether the user gets the new or old player interface.
YSC	session	YSC cookie is set by YouTube and is used to track the views of embedded videos on YouTube pages.

Cookie	Duration	Description
__gwtCookieCheck	session	This cookie is used to check if the visitors' browser supports cookies.
AnalyticsSyncHistory	1 month	These cookies are used to deliver advertisements more relevant to you and your interests. They are also used to limit the number of times you see an advertisement as well as help measure the effectiveness of the advertising campaign. They remember that you have visited a website and this information is shared with other organizations such as advertisers.
li_gc	2 years	These cookies are used to deliver advertisements more relevant to you and your interests. They are also used to limit the number of times you see an advertisement as well as help measure the effectiveness of the advertising campaign. They remember that you have visited a website and this information is shared with other organizations such as advertisers.
UserMatchHistory	1 month	LinkedIn - Used to track visitors on multiple websites, in order to present relevant advertisement based on the visitor's preferences.