Chatbot based on OpenAI’s GPT-4

Author

Nenad Bozinovic

Published

2023-11-24

Using the gpt-4-1106-preview API, I created several role-playing chatbots. The frontend is Streamlit hosted on a custom EC2 instance; the backend is an AWS Lambda function that makes the OpenAI API calls. CI/CD is handled by GitHub Actions.

Repo

Demo

2x speed

Notable technical points

  1. The OpenAI interface is accessed through a client object and an API key (the pip package is openai). For example:
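
A minimal sketch of that setup, assuming the key is stored in the OPENAI_API_KEY environment variable (the same pattern used in the Appendix below):

import os
from openai import OpenAI

# Read the key from the environment and create the client used throughout this post
api_key = os.getenv("OPENAI_API_KEY")
client = OpenAI(api_key=api_key)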

  2. A chat is a list of messages; each message is a dictionary with a role (system, user, or assistant) and content (text). For example, this is how we assign a system role, i.e. the type of assistant we want to have:

chat = [{"role": "system", "content": 'You are a software engineer'}]
  3. To get a reply from the assistant, we use the client.chat.completions.create method with the gpt-4-1106-preview model:
def question(chat_history, some_question, client=client):
    """Append a question as the user, get a reply from the assistant, and append that reply too.

    Args:
        chat_history (list): A list of dictionaries, each containing a role and a content key
        some_question (string): The question to ask the assistant
    """
    chat_history.append({"role": "user", "content": some_question})
    reply = client.chat.completions.create(
            model="gpt-4-1106-preview",
            messages=chat_history
            )
    reply_message = reply.choices[0].message
    chat_history.append({'role': reply_message.role, 'content': reply_message.content})
    # display and Markdown come from IPython.display (imported in the Appendix)
    display(Markdown(reply_message.content))
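
For example, with the chat list defined above (the question text here is just an illustration):

question(chat, "What is the difference between a list and a tuple in Python?")
# The reply is rendered as Markdown and appended to chat, so a follow-up
# question keeps the full conversation context:
question(chat, "Can you show a short example of each?")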
  4. I am using AWS Lambda to make the API calls; for that I:
    • create a layer with the dependencies (I had to pin pydantic to version 1.10.12, since the newer version was having issues). Also note that Lambda has trouble with some natively compiled libraries, so make sure to install packages using this command:
    pip install --platform manylinux2014_x86_64 --implementation cp --python-version 3.x --only-binary=:all: --upgrade <package_name> -t ./theEnvFolder/python
    • create an IAM role
    • finally, create the Lambda function. I handle both invocation options, requests via a function URL and boto3, since their event payloads differ (see the invocation sketch after the handler code below):
import json
from openai import OpenAI
import os

def lambda_handler(event, context):
    """
    event is the same as a chat, i.e. a list of dictionaries with role and content keys.
    When the function is called via its URL, the chat arrives wrapped in an extra layer,
    so we extract the body first.
    """

    api_key = os.getenv("OPENAI_API_KEY")
    client = OpenAI(api_key=api_key)

    if 'body' in event:  # needed when calling Lambda via its URL, since URL calls and boto3 produce different events
        event = json.loads(json.loads(event.get('body')))  # in this setup the body arrives double-encoded, hence the two loads

    chat_history = event
        
    reply = client.chat.completions.create(
        model="gpt-4-1106-preview",
        messages=chat_history,
        max_tokens=500
        )
    
    reply_message = reply.choices[0].message
    chat_history.append({'role': reply_message.role, 'content':reply_message.content})

    return {
        'statusCode': 200,
        'body': event
    }
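
A minimal sketch of the two invocation paths mentioned above; the function name, region, and function URL are placeholders, and the example chat is hypothetical:

import json
import boto3
import requests

chat = [{"role": "system", "content": "You are a software engineer"},
        {"role": "user", "content": "What is a Python decorator?"}]

# Option 1: boto3 -- the event received by the handler is the chat list itself
lambda_client = boto3.client('lambda')
resp = lambda_client.invoke(FunctionName='my-chatbot-function',  # placeholder name
                            Payload=json.dumps(chat))
updated_chat = json.loads(resp['Payload'].read())['body']

# Option 2: function URL via requests -- the chat travels inside the request body
url = 'https://<function-url-id>.lambda-url.<region>.on.aws/'  # placeholder URL
resp = requests.post(url, json=json.dumps(chat))  # double-encoded on purpose, matching the two json.loads in the handler
print(resp.status_code, resp.text)  # the updated chat comes back in the response body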

Conclusion

We have shown both the requests and the boto3 approach for calling the OpenAI API through AWS Lambda. I have hosted the app both on the streamlit.app platform and on my own EC2 instance; however, it is not worth keeping those resources running just for a demo. To scale this properly, I would use AWS Elastic Beanstalk, which takes care of the load balancers and autoscaling of the web servers.

Appendix

I use OpenAI’s APIs for generating text and code (GPT-4) and images (DALL-E).

Import Modules and Packages

from openai import OpenAI
import pandas as pd
import requests
from datetime import datetime
from pprint import pprint
import tiktoken
from pypdf import PdfReader
from IPython.display import Markdown, display, Image
import os
from matplotlib import image as mpimg
from matplotlib import pyplot as plt

Set the API Key

api_key = os.getenv("OPENAI_API_KEY")
client = OpenAI(api_key=api_key)
pd.set_option('display.max_colwidth', None)

def pp(df):
    # Left-align and wrap the emails column so long text stays readable
    return display(df.style.set_properties(subset=['emails'], **{'text-align': 'left', 'white-space': 'pre-wrap', 'width': '900px'}))

Generate Emails for Reviews

columns = ['reviews', 'emails']
df = pd.DataFrame(columns=columns)
df['reviews'] = [
    "Nice socks, great colors, just enough support for wearing with a good pair of sneakers.",
    "Love Deborah Harness's Trilogy! Didn't want the story to end and hope they turn this trilogy into a movie. I would love it if she wrote more books to continue this story!!!",
    "SO much quieter than other compressors. VERY quick as well. You will not regret this purchase.",
    "Shirt a bit too long, with heavy hem, which inhibits turning over. I cut off the bottom two inches all around, and am now somewhat comfortable. Overall, material is a bit too heavy for my liking.",
    "The quality on these speakers is insanely good and doesn't sound muddy when adjusting bass. Very happy with these.",
    "Beautiful watch face. The band looks nice all around. The links do make that squeaky cheapo noise when you swing it back and forth on your wrist which can be embarrassing in front of watch enthusiasts. However, to the naked eye from afar, you can't tell the links are cheap or folded because it is well polished and brushed and the folds are pretty tight for the most part. love the new member of my collection and it looks great. I've had it for about a week and so far it has kept good time despite day 1 which is typical of a new mechanical watch."
]
df.head()
reviews emails
0 Nice socks, great colors, just enough support for wearing with a good pair of sneakers. NaN
1 Love Deborah Harness's Trilogy! Didn't want the story to end and hope they turn this trilogy into a movie. I would love it if she wrote more books to continue this story!!! NaN
2 SO much quieter than other compressors. VERY quick as well. You will not regret this purchase. NaN
3 Shirt a bit too long, with heavy hem, which inhibits turning over. I cut off the bottom two inches all around, and am now somewhat comfortable. Overall, material is a bit too heavy for my liking. NaN
4 The quality on these speakers is insanely good and doesn't sound muddy when adjusting bass. Very happy with these. NaN

Let’s take each review and make an email. This email is going to:
  • Address the concerns expressed in the review.
  • Thank the customer for their purchase.
  • Encourage them to continue shopping.

chat = [{"role": "system", "content": "You are a polite customer support representative."}]

postfix = "\n\nWrite an email to customers to address the issues put forward in the above review, thank them if they write good comments, and encourage them to make further purchases. Do not give promotion codes or discounts to the customers."

def make_email(review):

    chat_history = chat.copy()
    chat_history.append({"role":"user", "content": review + postfix})

    reply = client.chat.completions.create(
        model="gpt-4",
        messages=chat_history
        )

    return reply.choices[0].message.content
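
To fill in the emails column, a call along these lines would do it (this is my assumption of how the pp helper defined above is meant to be used; the notebook output is omitted here):

df['emails'] = df['reviews'].apply(make_email)
pp(df)  # left-aligned, wrapped display of the generated emails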

Generate Python Code

problems = [
    "largest merge of two strings",
    "sum of unique elements",
    "longest palindrome",
    "all possible permutations of a string",
]
chat = [{"role": "system", "content": "You are a software engineer for Python."}]

prefix = "\n\nWrite code that will solve the problem: "

def solve(problem):

    chat_history = chat.copy()
    chat_history.append({"role":"user", "content": prefix + problem})

    reply = client.chat.completions.create(
        model="gpt-4",
        messages=chat_history
        )

    return reply.choices[0].message.content
Markdown(solve(problems[1]))

Here is how you can write a Python function to calculate the sum of unique elements in a given list.

def sum_of_unique_elements(lst):
    return sum(set(lst))

# test the function
print(sum_of_unique_elements([1,2,3,3,4,4,5,6,7,8,8,9,10]))

In the function sum_of_unique_elements(lst), set(lst) is used to remove duplicates from the list because sets cannot have duplicate elements. Then, sum(set(lst)) returns the sum of unique elements.

For example, if you run the printed test function with a list [1,2,3,3,4,4,5,6,7,8,8,9,10], it will return 55 because the sum of the unique elements (1,2,3,4,5,6,7,8,9,10) is 55.

Extract Text From a PDF

GPT-4 supports a combined context and response of up to 8,192 tokens (tokens are sub-word pieces encoded as numbers):

def num_tokens_from_string(string, encoding_name):
    encoding = tiktoken.get_encoding(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens
# url = 'https://arxiv.org/pdf/2312.06272.pdf'
# a = requests.get(url)

# with open("segformer.pdf", 'wb') as f:
#     f.write(a.content)
reader = PdfReader("SpaceNet8_final_paper.pdf")
text = ""
for page in reader.pages:
    text += page.extract_text() + "\n"
num_tokens_from_string(text, 'cl100k_base')
7705

Let’s trim it down a little bit:

text2 = text[:int(0.8*len(text))]
num_tokens_from_string(text2, 'cl100k_base')
5860

That should be good enough.
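
As an aside, instead of trimming by a character fraction one could trim by token count directly; a small sketch using the tiktoken encoding already imported above (the 6,000-token budget is an arbitrary choice):

def trim_to_tokens(string, max_tokens, encoding_name='cl100k_base'):
    # Encode, cut the token list at the budget, and decode back to text
    encoding = tiktoken.get_encoding(encoding_name)
    tokens = encoding.encode(string)
    return encoding.decode(tokens[:max_tokens])

trimmed = trim_to_tokens(text, 6000)
num_tokens_from_string(trimmed, 'cl100k_base')  # roughly the 6,000-token budget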

Summarize the Text

chat = [{"role": "system", "content": "You are a machine learning researcher that writes blogs about other people research that simplifies machine learning concepts, but does not dumb it down totally."}]

prefix = "\n\nSummarize the following paper:"

def summarize(text):

    chat_history = chat.copy()
    chat_history.append({"role":"user", "content": prefix + text})

    reply = client.chat.completions.create(
        model="gpt-4",
        messages=chat_history
        )

    return reply.choices[0].message.content
option2 = summarize(text2)
Markdown(option2)

The research paper, “Comparing Transformers and CNNs on the SpaceNet Flood Detection Challenge,” is an exploration of different transformer and convolutional neural network (CNN) segmentation architectures in detecting floods caused by hurricanes and heavy rains. The research was done in the context of SpaceNet8 Challenge.

The study tested various models including Transformer and U-Net models. It found that large pre-trained Segformer models performed better than the Resnet and U-Net based models. The highest Intersection-over-Union (IoU) was 61% for Segformer, suggesting that attention mechanisms are better suited for detecting building footprints.

The research also found flood detection, especially flooded road detection, to be challenging, with the highest IoU of 40%. Further, it was inferred that the pre-training on ImageNet and Cityscapes datasets improved the model’s performance moderately compared to pre-training on the ADE20k dataset and significantly over model training from scratch.

The researchers leveraged SpaceNet 8 dataset which includes pre-event images and post-event images. The model designated as the Foundation Features network used pre-event images to segment buildings and roads, whereas the Flood network used both pre- and post-event images to predict flood status.

The paper also comments on the differences in memory consumption and epoch durations across different models, noting that Segformer models consumed more memory and had longer epochs despite having fewer parameters compared to Resnet34. This is attributed to the attention mechanisms having a quadratic complexity. The study also highlights the impact of data storage and access methods in computational efficiency.

In conclusion, the Segformer model, which leverages Transformer, exhibits better performance than CNN-based models (Resnet and U-Net) in the context of the SpaceNet Flood Detection Challenge. However, the paper suggests further improvements might be achieved through normalizing images, applying pre-processing techniques, and leveraging more diverse training data.

Generate Images

response = client.images.generate(
    model='dall-e-3',
    prompt="An scene of a majestic snow covered mountain with cliffs. In the not too far distance, \
        a cool male skier adorned in brightly colored winter gear is jumping of the cliff, \
        He has a ripping jet engine securely fastened to his back, its powerful gusts creating an impressive display.\
        He has skip poles in his hands, and is wearing a helmet and goggles. \
        He has skies with rocketed tips, \
        He is captured mid-jump, soaring over a steep, snow-covered cliff. \
        Camera angle should be from the side, about 30 degrees from the skier, and he should be in fact smaller, less then 20 percent of the image. \
        The pristine winter setting is visible beneath him, with majestic snow-capped peaks and a valley blanketed in white. ",
    size='1792x1024',
    quality='hd',
    n=1
)
display(Markdown(response.data[0].revised_prompt))
image_url = response.data[0].url
path='usercode/images'
os.makedirs(path, exist_ok=True) 

name = path+'/'+str(datetime.now())

img_data = requests.get(image_url).content
with open(name+'.jpg', 'wb') as handler:
    handler.write(img_data)
    
plt.figure(figsize=(11,9))
img = mpimg.imread(name+'.jpg')

imgplot = plt.imshow(img)
imgplot.axes.get_xaxis().set_visible(False)
imgplot.axes.get_yaxis().set_visible(False)
plt.show()
import time
import sys

def type_like_a_person(text, delay=0.005):
    for char in text:
        sys.stdout.write(char)
        sys.stdout.flush()
        time.sleep(delay)
    print()  # Move to the next line after the message is complete

# Replay the summary generated above (any API response string would work here)
response = option2

type_like_a_person(response)