
Building a biography chatbot using the Python OpenAI Assistants API

17 Nov 2023 • 2 min read • 3 mins coding time

Mike Gee







Today we're building a website chatbot for the Andrew Carnegie Biography using the OpenAI Assistants API and Web Transpose Crawl.


Getting the Biography Website Data

To get the website data we're going to use Web Transpose Crawl. It's free for the first 100 websites of data, so you can start with a free account. Once you're signed in, you can get your API key.

import os
import webtranspose as webt

os.environ['WEBTRANSPOSE_API_KEY'] = 'YOUR_API_KEY'

Now let's scrape the biography page using Web Transpose Crawl.

crawl = webt.Crawl(
    url='https://www.gutenberg.org/files/17976/17976-h/17976-h.htm',
    max_pages=1,
    verbose=True,
)
# Top-level await works in a notebook; in a script, call this from inside an async function.
await crawl.crawl()
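The `await crawl.crawl()` call runs as written in a notebook, where top-level `await` is allowed. In a plain Python script it has to run inside an event loop. Here is a minimal sketch of that pattern. `FakeCrawl` and `run_crawl` are hypothetical stand-ins (not part of webtranspose) so the example runs without an API key; in real code you would build `webt.Crawl(...)` in their place.

```python
import asyncio


class FakeCrawl:
    """Hypothetical stand-in for webt.Crawl so this sketch runs offline."""

    def __init__(self, url, max_pages=1):
        self.base_url = url
        self.max_pages = max_pages
        self.done = False

    async def crawl(self):
        # A real crawl would do network I/O here; this just yields control once.
        await asyncio.sleep(0)
        self.done = True


async def run_crawl(url):
    # Build the crawler and await it inside a coroutine.
    crawl = FakeCrawl(url=url, max_pages=1)
    await crawl.crawl()
    return crawl


def main():
    # asyncio.run creates an event loop, runs the coroutine, then closes the loop.
    return asyncio.run(run_crawl("https://example.com/book.htm"))


if __name__ == "__main__":
    finished = main()
    print(finished.done)
```

Keeping the awaited call inside one coroutine means the rest of the script can stay synchronous.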

Upload the data to OpenAI

Now, let's upload the data to OpenAI.

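import io
from openai import OpenAI

# The client must exist before the upload; OpenAI() reads the OPENAI_API_KEY environment variable.
client = OpenAI()

# Grab the crawled page and upload its text as an in-memory file.
page = crawl.get_page(crawl.base_url)
file = client.files.create(
    file=io.BytesIO(page['text'].encode('utf-8')),
    purpose="assistants",
)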

Create the OpenAI Assistant

Now, let's create the OpenAI assistant.

from openai import OpenAI

client = OpenAI()

instructions = """You are a helpful assistant that takes the user's query, searches through its uploaded files to get more context, and then answers using only information from that context.
"""

assistant = client.beta.assistants.create(
    name="Talk to Andrew Carnegie's Biography",
    instructions=instructions,
    model="gpt-4-1106-preview",
    # The retrieval tool lets the assistant search the uploaded file.
    tools=[
        {"type": "retrieval"},
    ],
    file_ids=[
        file.id,
    ],
)

Chat with your assistant

You first need to define some functions to submit messages and wait on the run to complete.

import time

def submit_message(assistant_id, thread, query):
    message = client.beta.threads.messages.create(
        thread_id=thread.id,
        role="user",
        content=query,
    )
    return client.beta.threads.runs.create(
        thread_id=thread.id,
        assistant_id=assistant_id,
    )

def wait_on_run(run, thread):
    while run.status == "queued" or run.status == "in_progress":
        run = client.beta.threads.runs.retrieve(
            thread_id=thread.id,
            run_id=run.id,
        )
        time.sleep(0.5)
    return run
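The wait loop keeps polling for as long as the run is queued or in progress, so a run that never leaves those states would block forever. Here is a minimal sketch of the same polling idea with a timeout added. The `wait_for_status` helper and its parameters are hypothetical names introduced for illustration, and it takes a plain callable instead of the OpenAI client so it can run offline.

```python
import time

# Statuses that mean the run is still working (the two checked in wait_on_run).
PENDING_STATUSES = {"queued", "in_progress"}


def wait_for_status(fetch_status, poll_interval=0.5, timeout=60.0, sleep=time.sleep):
    """Poll fetch_status() until it leaves PENDING_STATUSES or the timeout elapses.

    fetch_status is any zero-argument callable that returns the current status
    string, for example a lambda that retrieves the run and returns run.status.
    """
    deadline = time.monotonic() + timeout
    status = fetch_status()
    while status in PENDING_STATUSES:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {status!r} after {timeout} seconds")
        sleep(poll_interval)
        status = fetch_status()
    return status


# Offline demonstration with a scripted sequence of statuses.
scripted = iter(["queued", "in_progress", "in_progress", "completed"])
final_status = wait_for_status(lambda: next(scripted), poll_interval=0.0)
print(final_status)
```

Returning the final status rather than assuming success also leaves room for the caller to handle a run that ended in some state other than completed.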

Now, we can chat with our assistant.

query = "Tell me your thoughts on Pittsburgh"

thread = client.beta.threads.create()
run = submit_message(
    assistant.id,
    thread,
    query,
)
run = wait_on_run(run, thread)

View the Replied Message

Once the code above has finished running, you can retrieve the messages in the thread:

messages = client.beta.threads.messages.list(thread_id=thread.id)

Now, let's print these messages out:

for msg in messages.data[::-1]:
    print(msg.role)
    print(msg.content[0].text.value)
    print('----')
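The print loop reverses the list so the conversation reads oldest first, and it assumes the first content part of every message is text. Here is a small sketch of a slightly more defensive version of the same idea. `format_thread` and `fake_message` are hypothetical helpers, and the fake messages only mimic the `role` / `content[0].text.value` shape used above so the example runs without calling the API.

```python
from types import SimpleNamespace


def format_thread(messages_newest_first):
    """Return 'role: text' lines in chronological order.

    Assumes each message has .role and .content, where .content is a list of
    parts and text parts expose part.text.value. Parts without a .text
    attribute are skipped instead of raising an error.
    """
    lines = []
    for msg in reversed(messages_newest_first):
        texts = [
            part.text.value
            for part in msg.content
            if getattr(part, "text", None) is not None
        ]
        lines.append(f"{msg.role}: {' '.join(texts)}")
    return lines


def fake_message(role, text):
    # Hypothetical stand-in shaped like the objects in messages.data.
    part = SimpleNamespace(text=SimpleNamespace(value=text))
    return SimpleNamespace(role=role, content=[part])


# Newest first, mirroring the order that the [::-1] slice reverses.
data = [
    fake_message("assistant", "placeholder reply"),
    fake_message("user", "Tell me your thoughts on Pittsburgh"),
]
for line in format_thread(data):
    print(line)
```

With the real API you would pass `messages.data` in place of the fake list.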

All done!